external RequestQueue
Hierarchy
- RequestProvider
- RequestQueue
Constructors
external constructor
Parameters
external options: RequestProviderOptions
external optional config: Configuration
Returns RequestQueue
Properties
external inherited assumedHandledCount
external inherited assumedTotalCount
external inherited client
external inherited clientKey
external readonly inherited config
external inherited id
external inherited internalTimeoutMillis
external inherited log
external optional inherited name
external inherited requestLockSecs
external inherited timeoutSecs
Methods
external addRequest
Parameters
external requestLike: Source
external optional options: RequestQueueOperationOptions
Returns Promise<RequestQueueOperationInfo>
external addRequests
Parameters
external requestsLike: Source[]
external optional options: RequestQueueOperationOptions
Returns Promise<BatchAddRequestsResult>
external inherited addRequestsBatched
Adds requests to the queue in batches. By default, it will resolve after the initial batch is added, and continue adding the rest in the background. You can configure the batch size via the batchSize option and the sleep time between the batches via waitBetweenBatchesMillis. If you want to wait for all batches to be added to the queue, you can use the waitForAllRequestsToBeAdded promise you get in the response object.
Parameters
external requests: (string | Source)[]
The requests to add
external optional options: AddRequestsBatchedOptions
Options for the request queue
Returns Promise<AddRequestsBatchedResult>
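The "resolve after the first batch, keep adding in the background" behavior described above can be illustrated with a small in-memory sketch. This is not the library's implementation: `addInBatches`, `sleepMillis`, the `firstBatch` field, the plain `string[]` store, and the default values are all hypothetical. Only the names `batchSize`, `waitBetweenBatchesMillis`, and `waitForAllRequestsToBeAdded` are taken from the description.

```typescript
// Hypothetical model of batched adding; NOT the library's implementation.
interface BatchSketchOptions {
    batchSize?: number;                 // how many items go into each batch
    waitBetweenBatchesMillis?: number;  // pause before each follow-up batch
}

interface BatchSketchResult {
    firstBatch: string[];                       // items added before the call resolved
    waitForAllRequestsToBeAdded: Promise<void>; // settles once every batch is stored
}

const sleepMillis = (ms: number): Promise<void> =>
    new Promise<void>((resolve) => setTimeout(resolve, ms));

async function addInBatches(
    store: string[],
    urls: string[],
    options: BatchSketchOptions = {},
): Promise<BatchSketchResult> {
    // Illustrative defaults only; guard against a non-positive batch size.
    const batchSize = Math.max(1, options.batchSize ?? 2);
    const waitMillis = options.waitBetweenBatchesMillis ?? 10;

    // The initial batch is stored before the returned promise resolves.
    const firstBatch = urls.slice(0, batchSize);
    store.push(...firstBatch);

    // The remaining batches are stored in the background, one pause at a time.
    const waitForAllRequestsToBeAdded = (async (): Promise<void> => {
        for (let start = batchSize; start < urls.length; start += batchSize) {
            await sleepMillis(waitMillis);
            store.push(...urls.slice(start, start + batchSize));
        }
    })();

    return { firstBatch, waitForAllRequestsToBeAdded };
}

// Usage: observe the store right after the call and again after all batches.
async function demoBatchedAdd(): Promise<{ afterFirstBatch: number; afterAllBatches: number }> {
    const store: string[] = [];
    const urls = [1, 2, 3, 4, 5].map((n) => `https://example.com/${n}`);
    const result = await addInBatches(store, urls, { batchSize: 2, waitBetweenBatchesMillis: 5 });
    const afterFirstBatch = store.length;
    await result.waitForAllRequestsToBeAdded;
    return { afterFirstBatch, afterAllBatches: store.length };
}
```

In this sketch, a caller that needs every request stored before proceeding awaits `waitForAllRequestsToBeAdded`; a caller that only needs some work to start can continue as soon as the call itself resolves.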
external inherited drop
Removes the queue either from the Apify Cloud storage or from the local database, depending on the mode of operation.
Returns Promise<void>
external fetchNextRequest
Returns Promise<null | Request<T>>
external inherited getInfo
Returns an object containing general information about the request queue.
The function returns the same object as the Apify API Client's getQueue function, which in turn calls the Get request queue API endpoint.
Example:
{
    id: "WkzbQMuFYuamGv3YF",
    name: "my-queue",
    userId: "wRsJZtadYvn4mBZmm",
    createdAt: new Date("2015-12-12T07:34:14.202Z"),
    modifiedAt: new Date("2015-12-13T08:36:13.202Z"),
    accessedAt: new Date("2015-12-14T08:36:13.202Z"),
    totalRequestCount: 25,
    handledRequestCount: 5,
    pendingRequestCount: 20,
}
Returns Promise<undefined | RequestQueueInfo>
external inherited getRequest
Gets the request from the queue specified by ID.
Parameters
external id: string
ID of the request.
Returns Promise<null | Request<T>>
Returns the request object, or null if it was not found.
external inherited getTotalCount
Returns an offline approximation of the total number of requests in the queue (i.e. pending + handled).
Survives restarts and actor migrations.
Returns number
external inherited handledCount
Returns the number of handled requests.
This function is just a convenient shortcut for:
const { handledRequestCount } = await queue.getInfo();
Returns Promise<number>
external inherited isEmpty
Resolves to true if the next call to RequestQueue.fetchNextRequest would return null, otherwise it resolves to false. Note that even if the queue is empty, there might be some pending requests currently being processed. If you need to ensure that there is no activity in the queue, use RequestQueue.isFinished.
Returns Promise<boolean>
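The difference between "empty" and "finished" described above can be sketched with a toy in-memory queue. This is a simplified, synchronous model under stated assumptions, not the library's implementation: the class `ActivityTrackingQueue` and its methods `add`, `fetchNext`, and `markHandled` are hypothetical, whereas the real methods return promises.

```typescript
// Hypothetical model: "empty" means nothing left to fetch,
// "finished" additionally means nothing is still being processed.
class ActivityTrackingQueue {
    private readonly pending: string[] = [];
    private readonly inProgress = new Set<string>();

    add(url: string): void {
        this.pending.push(url);
    }

    // Hands out the next URL and remembers that it is being processed.
    fetchNext(): string | null {
        const next = this.pending.shift() ?? null;
        if (next !== null) this.inProgress.add(next);
        return next;
    }

    // Called once processing of a fetched URL is done.
    markHandled(url: string): void {
        this.inProgress.delete(url);
    }

    isEmpty(): boolean {
        return this.pending.length === 0;
    }

    isFinished(): boolean {
        return this.pending.length === 0 && this.inProgress.size === 0;
    }
}

const activityQueue = new ActivityTrackingQueue();
activityQueue.add('https://example.com/');

const inFlight = activityQueue.fetchNext();
const emptyWhileProcessing = activityQueue.isEmpty();       // nothing left to fetch
const finishedWhileProcessing = activityQueue.isFinished(); // one request still in flight

if (inFlight !== null) activityQueue.markHandled(inFlight);
const finishedAfterHandling = activityQueue.isFinished();
```

In this model a worker loop that stops as soon as `isEmpty()` is true could exit while another worker is still processing a request that may yet enqueue new URLs, which is why the stricter check exists.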
external isFinished
Returns Promise<boolean>
external markRequestHandled
Parameters
external request: Request<Dictionary>
Returns Promise<null | RequestQueueOperationInfo>
external reclaimRequest
Parameters
external rest ...args: [request: Request<Dictionary>, options?: RequestQueueOperationOptions]
Returns Promise<null | RequestQueueOperationInfo>
static external open
Parameters
external rest ...args: [queueIdOrName?: null | string, options?: StorageManagerOptions]
Returns Promise<RequestQueue>
Represents a queue of URLs to crawl, which is used for deep crawling of websites where you start with several URLs and then recursively follow links to other pages. The data structure supports both breadth-first and depth-first crawling orders.
Each URL is represented using an instance of the Request class. The queue can only contain unique URLs. More precisely, it can only contain Request instances with distinct uniqueKey properties. By default, uniqueKey is generated from the URL, but it can also be overridden. To add a single URL multiple times to the queue, the corresponding Request objects will need to have different uniqueKey properties.
Do not instantiate this class directly; use the RequestQueue.open function instead.
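The uniqueKey rule can be illustrated with a minimal in-memory sketch. It is a hypothetical model, not the library's implementation: `UniqueKeyQueue`, `RequestSketch`, and `add` are illustrative names, and the sketch uses the URL verbatim as the default key, which is a simplification of "generated from the URL".

```typescript
// Hypothetical model of uniqueKey-based deduplication.
interface RequestSketch {
    url: string;
    uniqueKey?: string; // assumption: falls back to the raw URL when omitted
}

class UniqueKeyQueue {
    private readonly byKey = new Map<string, RequestSketch>();

    // Returns true if the request was newly added, false if its key already exists.
    add(request: RequestSketch): boolean {
        const key = request.uniqueKey ?? request.url;
        if (this.byKey.has(key)) return false;
        this.byKey.set(key, request);
        return true;
    }

    get size(): number {
        return this.byKey.size;
    }
}

const dedupQueue = new UniqueKeyQueue();

// The same URL twice: the second add is treated as a duplicate.
const firstAdd = dedupQueue.add({ url: 'https://example.com/page' });
const duplicateAdd = dedupQueue.add({ url: 'https://example.com/page' });

// The same URL with an overridden key is accepted as a separate entry.
const overriddenAdd = dedupQueue.add({
    url: 'https://example.com/page',
    uniqueKey: 'page-second-visit',
});
```

Overriding the key this way is how, in this model, a single URL ends up in the queue more than once.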
RequestQueue is used by BasicCrawler, CheerioCrawler, PuppeteerCrawler and PlaywrightCrawler as a source of URLs to crawl. Unlike RequestList, RequestQueue supports dynamic adding and removing of requests. On the other hand, the queue is not optimized for operations that add or remove a large number of URLs in a batch.
Example usage: