PuppeteerPoolOptions
Properties
maxOpenPagesPerInstance
Type: number
= 50
Maximum number of open pages (i.e. tabs) per browser. When this limit is reached, new pages are loaded in a new browser instance.
retireInstanceAfterRequestCount
Type: number
= 100
Maximum number of requests that can be processed by a single browser instance. After the limit is reached, the browser is retired and new requests are handled by a new browser instance.
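The interaction of maxOpenPagesPerInstance and retireInstanceAfterRequestCount can be sketched with a small model. This is not PuppeteerPool's actual implementation; the instance shape ({ activePages, totalPages, retired }) and the helpers pickInstance and openPage are hypothetical names used only for illustration.

```javascript
// Hypothetical model of choosing a browser for the next page.
// Only the two option names come from the documentation above.
const options = { maxOpenPagesPerInstance: 50, retireInstanceAfterRequestCount: 100 };

function pickInstance(instances, opts) {
    // Reuse the first non-retired browser that still has a free tab slot.
    const found = instances.find(
        (inst) => !inst.retired && inst.activePages < opts.maxOpenPagesPerInstance,
    );
    return found || null; // null means "launch a new browser instance"
}

function openPage(instance, opts) {
    instance.activePages += 1;
    instance.totalPages += 1;
    // Once the request budget is used up, stop sending new pages here.
    if (instance.totalPages >= opts.retireInstanceAfterRequestCount) {
        instance.retired = true;
    }
    return instance;
}
```

In this model, a browser with 50 open tabs is skipped until a tab closes, while a browser that has served 100 requests is never picked again.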
puppeteerOperationTimeoutSecs
Type: number
= 15
All browser management operations, such as launching a new browser, opening a new page, or closing a page, will time out after the set number of seconds, and the connected browser will be retired.
instanceKillerIntervalSecs
Type: number
= 60
Indicates how often, in seconds, open Puppeteer instances are checked to see whether they can be closed.
killInstanceAfterSecs
Type: number
= 300
When a Puppeteer instance reaches the retireInstanceAfterRequestCount limit, it is considered retired and no more tabs will be opened in it. After the last tab is closed, the whole browser is closed too. This parameter defines the time limit, measured from when the last tab was opened, after which the browser is closed even if there are still open tabs.
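The rule described for killInstanceAfterSecs can be expressed as a small predicate. This is a minimal sketch under stated assumptions, not the pool's real code: the helper shouldKillInstance and the instance fields (retired, activePages, lastPageOpenedAt in milliseconds) are hypothetical.

```javascript
// Hypothetical check that a periodic "instance killer" could run
// every instanceKillerIntervalSecs for each browser instance.
function shouldKillInstance(instance, nowMs, killInstanceAfterSecs = 300) {
    // Only retired browsers are candidates for closing.
    if (!instance.retired) return false;
    // A retired browser with no open tabs can be closed right away.
    if (instance.activePages === 0) return true;
    // Otherwise close it only when the last tab was opened too long ago.
    const sinceLastPageMs = nowMs - instance.lastPageOpenedAt;
    return sinceLastPageMs > killInstanceAfterSecs * 1000;
}
```

With the default of 300 seconds, a retired browser that still holds open tabs would be closed by the first check that runs more than five minutes after its last tab was opened.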
launchPuppeteerFunction
Type: LaunchPuppeteerFunction
Overrides the default function to launch a new Puppeteer instance. It must return a promise resolving to a Browser instance.
The function receives one parameter that includes launch options, such as proxyUrl and other options generated by the PuppeteerPool instance. You can extend or update this options object, but you must not ignore it.
Correct:
async function launchPuppeteerFunction(options) {
    // Keep everything the pool generated and add your own settings.
    const newOpts = {
        ...options,
        foo: 'bar',
    };
    // do some other things
    return Apify.launchPuppeteer(newOpts);
}
Incorrect:
async function launchPuppeteerFunction() {
    const opts = {
        foo: 'bar',
    };
    // Because we ignored the options, the correct parameters
    // will not make it to the browser. E.g., this prevents
    // proxyConfiguration from working correctly.
    return Apify.launchPuppeteer(opts);
}
launchPuppeteerOptions
Type: LaunchPuppeteerOptions
Options used by Apify.launchPuppeteer() to start new Puppeteer instances.
recycleDiskCache
Type: boolean
= false
Enables recycling of disk cache directories by Chrome instances. When a browser instance is closed, its disk cache directory is not deleted but is reused by a newly opened browser instance. This reduces the amount of data that needs to be downloaded, which speeds up crawling and reduces proxy usage. Note that the new browser starts with empty cookies, local storage, etc., so this setting doesn't affect the anonymity of your crawler.
Beware that the disk cache directories can consume a lot of disk space. To limit the space consumed, you can pass the --disk-cache-size=X argument to launchPuppeteer args, where X is the approximate maximum number of bytes for disk cache.
Do not use the recycleDiskCache setting together with the --disk-cache-dir argument in launchPuppeteer args; the behavior is undefined.
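As a rough illustration of the two notes above, the following sketch builds pool options with a capped disk cache and refuses to combine recycling with --disk-cache-dir. The helper withDiskCacheLimit and the 100 MiB limit are assumptions made for this example; only recycleDiskCache, launchPuppeteerOptions, and the two Chrome flags come from the text.

```javascript
// Hypothetical helper: appends --disk-cache-size to the launch args
// and rejects --disk-cache-dir, which must not be used with recycleDiskCache.
function withDiskCacheLimit(launchPuppeteerOptions = {}, maxBytes) {
    const args = launchPuppeteerOptions.args || [];
    if (args.some((arg) => arg.startsWith('--disk-cache-dir'))) {
        throw new Error('--disk-cache-dir cannot be combined with recycleDiskCache');
    }
    return {
        ...launchPuppeteerOptions,
        args: [...args, `--disk-cache-size=${maxBytes}`],
    };
}

// Example pool options: recycle cache directories, capped at about 100 MiB.
const poolOptions = {
    recycleDiskCache: true,
    launchPuppeteerOptions: withDiskCacheLimit({ headless: true }, 100 * 1024 * 1024),
};
```

The resulting poolOptions object would then be passed wherever PuppeteerPoolOptions are accepted.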
useIncognitoPages
Type: boolean
With this option selected, all pages will be opened in a new incognito browser context, which means that they will not share cookies or cache and their resources will not be throttled by one another.
sessionPool
Type: SessionPool
A pool of Session instances.
proxyConfiguration
Type: ProxyConfiguration
If set, PuppeteerPool will route all connections through Apify Proxy or through your own proxy URLs, rotated according to the configuration. For more information, see the documentation.