Using man-in-the-middle proxy to intercept requests in Puppeteer
Sometimes you may need to intercept (or maybe block) requests in headless Chrome / Puppeteer, but page.setRequestInterception()  is not 100% reliable when the request is started in a new window.
One possible way to intercept these requests is to use a man-in-the-middle (MITM) proxy, i.e. a proxy server that can intercept and modify HTTP requests, even those over HTTPS. In this example, we're going to use https://github.com/joeferner/node-http-mitm-proxy, since it has all the tools that we need.
First we set up the MITM proxy:
const { promisify } = require('util');
const { exec } = require('child_process');
const Proxy = require('http-mitm-proxy');
const Promise = require('bluebird');
const execPromise = promisify(exec);
const wait = (timeout) => new Promise((resolve) => setTimeout(resolve, timeout));
const setupProxy = async (port) => {
    // Setup chromium certs directory
    // WARNING: this only works in debian docker images
    // modify it for any other use cases or local usage.
    await execPromise('mkdir -p $HOME/.pki/nssdb');
    await execPromise('certutil -d sql:$HOME/.pki/nssdb -N');
    const proxy = Proxy();
    proxy.use(Proxy.wildcard);
    proxy.use(Proxy.gunzip);
    return new Promise((resolve, reject) => {
        proxy.listen({ port, silent: true }, (err) => {
            if (err) return reject(err);
            // Add CA certificate to chromium and return initialize proxy object
            execPromise('certutil -d sql:$HOME/.pki/nssdb -A -t "C,," -n mitm-ca -i ./.http-mitm-proxy/certs/ca.pem')
                .then(() => resolve(proxy))
                .catch(reject);
        });
    });
};
Then we'll need a Docker image that has the certutil utility. Here is an example of a Dockerfile that can create such an image and is based on the apify/actor-node-chrome image that contains Puppeteer.
Now we need to specify how the proxy shall handle the intercepted requests:
// Setup blocking of requests in proxy
const proxyPort = 8000;
const proxy = setupProxy(proxyPort);
proxy.onRequest((context, callback) => {
    if (blockRequests) {
        const request = context.clientToProxyRequest;
        // Log out blocked requests
        console.log('Blocked request:', request.headers.host, request.url);
        // Close the connection with custom content
        context.proxyToClientResponse.end('Blocked');
        return;
    }
    return callback();
});
The final step is to let Puppeteer use the local proxy:
// Launch puppeteer with local proxy
const browser = await puppeteer.launch({
    args: ['--no-sandbox', `--proxy-server=localhost:${proxyPort}`],
});
And we're done! By adjusting the blockRequests variable, you can allow or block any request initiated through Puppeteer.
Here is a GitHub repository with a full example and all necessary files: https://github.com/apify/actor-example-proxy-intercept-request
If you have any questions, feel free to contact us in the chat.
Happy intercepting!