Request queue
Queue URLs for an actor to visit in its run. Learn how to share your queues between actor runs. Access and manage request queues from Apify Console or via API.
Request queues enable you to enqueue and retrieve requests such as URLs with an HTTP method and other parameters. They are useful not only in web crawling, but anywhere you need to process a large number of URLs and enqueue new links.
Request queue storage supports both breadth-first and depth-first crawling orders, as well as custom data attributes. It allows you to query whether specific URLs were already found, push new URLs to the queue and fetch the next URLs to process.
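To make the two crawling orders concrete, here is a minimal in-memory sketch in Python. It is not how Apify implements request queues; the class `ToyRequestQueue` and its method names are hypothetical and exist only for illustration. Appending new URLs to the end of the queue gives breadth-first order, putting them at the front gives depth-first order, and a set of already-seen URLs answers whether a URL was found before.

```python
from collections import deque


class ToyRequestQueue:
    """Hypothetical in-memory model of a request queue (illustration only)."""

    def __init__(self):
        self._pending = deque()  # URLs waiting to be processed
        self._seen = set()       # every URL ever added, used for deduplication

    def add(self, url, forefront=False):
        """Add a URL unless it was already found; return True if it was added."""
        if url in self._seen:
            return False
        self._seen.add(url)
        if forefront:
            self._pending.appendleft(url)  # depth-first: processed next
        else:
            self._pending.append(url)      # breadth-first: processed last
        return True

    def was_found(self, url):
        """Tell whether the URL has ever been added to the queue."""
        return url in self._seen

    def fetch_next(self):
        """Return the next URL to process, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None


queue = ToyRequestQueue()
queue.add("http://example.com/a")
queue.add("http://example.com/b")
queue.add("http://example.com/a/child", forefront=True)
queue.add("http://example.com/a")  # duplicate, ignored

# Drain the queue and record the processing order
order = []
while (url := queue.fetch_next()) is not None:
    order.append(url)
```

In this sketch the URL added to the front is fetched first, followed by the two URLs in the order they were enqueued.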
Named request queues are retained indefinitely.
Unnamed request queues expire after 7 days unless otherwise specified.
Learn about named and unnamed queues.
Basic usage
There are five ways to access your request queues:
- Apify Console - provides an easy-to-understand interface [details].
- Apify SDK - when building your own Apify actor [details].
- JavaScript API client - to access your request queues from any Node.js application [details].
- Python API client - to access your request queues from any Python application [details].
- Apify API - for accessing your request queues programmatically [details].
Apify Console
In Apify Console, you can view your request queues in the Storage section under the Request queues tab.
Only named request queues are displayed by default. Select the Include unnamed request queues checkbox to display all of your queues.
To view a request queue, click on its Queue ID.
Under the Settings tab, you can update the queue's name (and, in turn, its retention period) and access rights.
Click on the API button to view and test a queue's API endpoints.
Apify SDK
If you are building an Apify actor, you will be using the Apify SDK.
In the Apify SDK, the request queue is represented by the RequestQueue class.
You can use the RequestQueue class to specify whether your data is stored locally or in the Apify cloud and to enqueue new URLs.
Each actor run is associated with a default request queue, which is created for the run when the first request is added to it. Typically, it is used to store the URLs to crawl in that specific run; however, its usage is optional.
You can also create named queues which can be shared between actors or between actor runs.
If you are storing your data locally, you can find your request queue at the following location.
{APIFY_LOCAL_STORAGE_DIR}/request_queues/{QUEUE_ID}/{ID}.json
The default request queue's ID is default. Each request in the queue is stored as a separate JSON file, where {ID} is a request ID.
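As an illustration of this layout, the following Python sketch reads every request file of a locally stored queue. It is a rough sketch under assumptions: the helper names `request_queue_dir` and `load_local_requests` are hypothetical, `storage_dir` stands for the value of {APIFY_LOCAL_STORAGE_DIR}, and the fields written to the demo file are placeholders rather than the exact schema the SDK writes.

```python
import json
import tempfile
from pathlib import Path


def request_queue_dir(storage_dir, queue_id="default"):
    """Return the directory that holds one JSON file per request."""
    return Path(storage_dir) / "request_queues" / queue_id


def load_local_requests(storage_dir, queue_id="default"):
    """Return {request_id: request_dict} for every JSON file in the queue directory."""
    queue_dir = request_queue_dir(storage_dir, queue_id)
    if not queue_dir.is_dir():
        return {}
    requests = {}
    for path in sorted(queue_dir.glob("*.json")):
        # The file name without ".json" is the request ID
        requests[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return requests


# Demo against a throwaway directory so the sketch runs anywhere
with tempfile.TemporaryDirectory() as storage_dir:
    queue_dir = request_queue_dir(storage_dir)
    queue_dir.mkdir(parents=True)
    (queue_dir / "shi6Nh3bfs3.json").write_text(
        json.dumps({"url": "http://example.com", "method": "GET"}),
        encoding="utf-8",
    )
    loaded = load_local_requests(storage_dir)
```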
To open a request queue, use the Actor.openRequestQueue() method.
// Import the Apify SDK into your project
import { Actor } from 'apify';
await Actor.init();
// ...
// Open the default request queue associated with
// the actor run
const queue = await Actor.openRequestQueue();
// Open the 'my-queue' request queue
const queueWithName = await Actor.openRequestQueue('my-queue');
// ...
await Actor.exit();
Once a queue is open, you can manage it using the following methods. See the RequestQueue class's API reference for the full list.
// Import the Apify SDK into your project
import { Actor } from 'apify';
await Actor.init();
// ...
const queue = await Actor.openRequestQueue();
// Enqueue requests
await queue.addRequests([{ url: 'http://example.com/aaa' }]);
await queue.addRequests([
    'http://example.com/foo',
    'http://example.com/bar',
], { forefront: true });
// Get the next request from queue
const request1 = await queue.fetchNextRequest();
const request2 = await queue.fetchNextRequest();
// Get a specific request
const specificRequest = await queue.getRequest('shi6Nh3bfs3');
// Reclaim a failed request back to the queue
// and crawl it again
await queue.reclaimRequest(request2);
// Remove a queue
await queue.drop();
// ...
await Actor.exit();
See the SDK documentation and the RequestQueue class's API reference for details on managing your request queues with the Apify SDK.
JavaScript API client
Apify's JavaScript API client (apify-client) allows you to access your request queues from any Node.js application, whether it is running on the Apify platform or elsewhere.
After importing and initiating the client, you can save each request queue to a variable for easier access.
const myQueueClient = apifyClient.requestQueue('jane-doe/my-request-queue');
You can then use that variable to access the request queue's items and manage it.
See the JavaScript API client documentation for help with setup and more details.
Python API client
Apify's Python API client (apify-client) allows you to access your request queues from any Python application, whether it is running on the Apify platform or elsewhere.
After importing and initiating the client, you can save each request queue to a variable for easier access.
my_queue_client = apify_client.request_queue('jane-doe/my-request-queue')
You can then use that variable to access the request queue's items and manage it.
See the Python API client documentation for help with setup and more details.
Apify API
The Apify API allows you to access your request queues programmatically using HTTP requests.
If you are accessing your request queues using the username~store-name store ID format, you will need to use your secret API token. You can find the token (and your user ID) on the Integrations page of your Apify account.
When providing your API authentication token, we recommend using the request's Authorization header, rather than the URL. (More info).
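As a sketch of that recommendation, the Python snippet below builds (but does not send) a request that carries the token in the Authorization header, so the token never appears in the URL. The helper name `build_request` and the token value are hypothetical, and the `Bearer` scheme is an assumption to confirm against the API documentation.

```python
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_request(path, token, method="GET"):
    """Build an API request with the token in the Authorization header."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        method=method,
        # Assumption: the token is sent using the Bearer scheme
        headers={"Authorization": f"Bearer {token}"},
    )


# Placeholder token; the request is only constructed here, not sent
req = build_request("/request-queues", token="MY_SECRET_TOKEN")
```

Passing `req` to `urllib.request.urlopen()` would send the request.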
To get a list of your request queues, send a GET request to the Get list of request queues endpoint.
https://api.apify.com/v2/request-queues
To get information about a request queue such as its creation time and item count, send a GET request to the Get request queue endpoint.
https://api.apify.com/v2/request-queues/{QUEUE_ID}
To get a request from a queue, send a GET request to the Get request endpoint.
https://api.apify.com/v2/request-queues/{QUEUE_ID}/requests/{REQUEST_ID}
To add a request to a queue, send a POST request with the request to be added as a JSON object in the request's payload to the Add request endpoint.
https://api.apify.com/v2/request-queues/{QUEUE_ID}/requests
Example payload:
{
    "uniqueKey": "http://example.com",
    "url": "http://example.com",
    "method": "GET"
}
To update a request in a queue, send a PUT request with the request to update as a JSON object in the request's payload to the Update request endpoint. In the payload, specify the request's ID and add the information you want to update.
https://api.apify.com/v2/request-queues/{QUEUE_ID}/requests/{REQUEST_ID}
Example payload:
{
    "id": "dnjkDMKLmdlkmlkmld",
    "uniqueKey": "http://example.com",
    "url": "http://example.com",
    "method": "GET"
}
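The add and update calls described above differ only in the HTTP method and in whether the request ID is part of the URL. The following Python sketch builds either call from a payload like the examples shown; it only constructs the request and does not send it. The helper name `build_write_request`, the placeholder token, and the `Bearer` scheme are assumptions made for illustration.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_write_request(queue_id, token, request):
    """Build a POST (add) call, or a PUT (update) call when `request` has an "id"."""
    url = f"{API_BASE}/request-queues/{queue_id}/requests"
    method = "POST"
    if "id" in request:
        # Updating an existing request: address it by its ID
        url = f"{url}/{request['id']}"
        method = "PUT"
    return urllib.request.Request(
        url,
        data=json.dumps(request).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {token}",  # assumption: Bearer scheme
            "Content-Type": "application/json",
        },
    )


new_request = {
    "uniqueKey": "http://example.com",
    "url": "http://example.com",
    "method": "GET",
}
add_call = build_write_request("QUEUE_ID", "MY_SECRET_TOKEN", new_request)

existing_request = {"id": "dnjkDMKLmdlkmlkmld", **new_request}
update_call = build_write_request("QUEUE_ID", "MY_SECRET_TOKEN", existing_request)
```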
When adding or updating requests, you can optionally provide a clientKey parameter to your request. It must be a string between 1 and 32 characters in length. This identifier is used to determine whether the queue was accessed by multiple clients. If clientKey is not provided, the system considers this API call to come from a new client. See the hadMultipleClients field returned by the Get head operation for details.
Example: client-abc
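The length rule for clientKey can be checked on the client before a call is made. The Python sketch below validates the key and appends it to an endpoint URL. It assumes that clientKey is passed as a URL query parameter, which should be confirmed in the API documentation; the helper name `with_client_key` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_client_key(url, client_key):
    """Return `url` with a validated clientKey appended to its query string."""
    if not isinstance(client_key, str) or not 1 <= len(client_key) <= 32:
        raise ValueError("clientKey must be a string of 1 to 32 characters")
    parts = urlsplit(url)
    # Keep any query parameters that are already present
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("clientKey", client_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


keyed_url = with_client_key(
    "https://api.apify.com/v2/request-queues/QUEUE_ID/requests",
    "client-abc",
)
```

Reusing the same key for every call made by one client lets the platform tell that client apart from others accessing the same queue.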
See the API documentation for a detailed breakdown of each API endpoint.
Sharing
You can invite other Apify users to view or modify your request queues with the access rights system. See the full list of permissions.
Sharing request queues between runs
You can access a request queue from any actor or task run as long as you know its name or ID.
To access a request queue from another run using the Apify SDK, open it using the Actor.openRequestQueue(queueIdOrName) method like you would do with any other queue.
const otherQueue = await Actor.openRequestQueue('old-queue');
In the JavaScript API client, you can access a request queue using its client. Once you've opened the request queue, you can use it in your crawler or add new requests like you would do with a queue from your current run.
const otherQueueClient = apifyClient.requestQueue('jane-doe/old-queue');
Likewise, in the Python API client, you can access a request queue using its client.
other_queue_client = apify_client.request_queue('jane-doe/old-queue')
The same applies to the Apify API: you can use the same endpoints as you normally would.
See the Storage overview for details on sharing storages between runs.
Limits
While multiple actor or task runs can add new requests to a queue concurrently, only one run can process a queue at any one time.
Request queue names can be up to 63 characters long.
Rate limiting
When managing request queues via API, CRUD (add, get, update, delete) operation requests are limited to 200 per second per request queue. This helps protect Apify servers from being overloaded.
All other request queue API endpoints are limited to 30 requests per second per request queue.
See the API documentation for details and to learn what to do if you exceed the rate limit.
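One way to stay under these limits is to throttle calls on the client side. The Python sketch below is a simple sliding-window limiter, offered as an illustration rather than as the approach recommended by Apify; the class `SlidingWindowLimiter` is a hypothetical helper, and the two instances use the per-queue limits quoted above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls inside the current window

    def wait(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window
            self._sleep(self.period - (now - self._calls[0]))


# One limiter per request queue and per endpoint group
crud_limiter = SlidingWindowLimiter(max_calls=200)  # add, get, update, delete
other_limiter = SlidingWindowLimiter(max_calls=30)  # all other endpoints

crud_limiter.wait()  # call this before each CRUD request to the queue
```

The clock and sleep functions are injectable so the limiter can be exercised in tests without real waiting.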