
Request queue

Queue URLs for an Actor to visit in its run. Learn how to share your queues between Actor runs. Access and manage request queues from Apify Console or via API.


Request queues enable you to enqueue and retrieve requests such as URLs with an HTTP method and other parameters. They prove essential not only in web crawling scenarios but also in any situation requiring the management of a large number of URLs and the addition of new links.

The storage system for request queues accommodates both breadth-first and depth-first crawling strategies, along with the inclusion of custom data attributes. This system enables you to check if certain URLs have already been encountered, add new URLs to the queue, and retrieve the next set of URLs for processing.
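The effect of insertion order on crawl order can be illustrated with a small in-memory model. This is only a sketch and not how the Apify storage is implemented: the ToyQueue class, its methods, and the sample link graph are all hypothetical. Adding requests at the end of the queue gives breadth-first order, adding them at the front gives depth-first order, and a set of already seen URLs prevents duplicates.

```python
from collections import deque


class ToyQueue:
    """Hypothetical in-memory stand-in for a request queue (illustration only)."""

    def __init__(self):
        self._pending = deque()  # requests waiting to be processed
        self._seen = set()       # every URL that was ever added

    def add(self, url, forefront=False):
        """Enqueue a URL unless it was already seen. Returns True if added."""
        if url in self._seen:
            return False
        self._seen.add(url)
        if forefront:
            self._pending.appendleft(url)  # processed next: depth-first
        else:
            self._pending.append(url)      # processed last: breadth-first
        return True

    def fetch_next(self):
        """Return the next URL to process, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None


def crawl_order(links, start, forefront):
    """Return the order in which pages of the `links` graph are visited."""
    queue = ToyQueue()
    queue.add(start)
    visited = []
    while (url := queue.fetch_next()) is not None:
        visited.append(url)
        for child in links.get(url, []):
            queue.add(child, forefront=forefront)
    return visited


# A tiny made-up site: 'a' links to 'b' and 'c', and 'c' links to 'd'.
links = {'a': ['b', 'c'], 'c': ['d']}
print(crawl_order(links, 'a', forefront=False))  # ['a', 'b', 'c', 'd']
print(crawl_order(links, 'a', forefront=True))   # ['a', 'c', 'd', 'b']
```

In the depth-first run, the deeper page 'd' is visited before the shallower page 'b' because each newly found link jumps to the front of the queue.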

Named request queues are retained indefinitely.
Unnamed request queues expire after 7 days unless otherwise specified.
Learn more

Basic usage

You can access your request queues in several ways:

Apify Console

In the Apify Console, you can view your request queues in the Storage section under the Request queues tab.


To view a request queue, click on its Queue ID. Under the Actions menu, you can rename your queue (and, in turn, change its retention period) and manage its access rights using the Share button. Click on the API button to view and test a queue's API endpoints.


JavaScript SDK

If you are building a JavaScript Actor, you will be using the JavaScript SDK. The request queue is represented by the RequestQueue class. You can use the class to specify whether your data is stored locally or in the Apify cloud and to enqueue new URLs.

Every Actor run is automatically linked with a default request queue, initiated upon adding the first request. This queue is primarily utilized for storing URLs to be crawled during the particular Actor run, though its use is not mandatory. For enhanced flexibility, you can establish named queues. These named queues offer the advantage of being shareable across different Actors or various Actor runs, facilitating a more interconnected and efficient process.

If you are storing your data locally, you can find your request queue at the following location.

{APIFY_LOCAL_STORAGE_DIR}/request_queues/{QUEUE_ID}/{ID}.json

The default request queue's ID is default. Each request in the queue is stored as a separate JSON file, where {ID} is a request ID.

To open a request queue, use the Actor.openRequestQueue() method.

// Import the JavaScript SDK into your project
import { Actor } from 'apify';

await Actor.init();
// ...

// Open the default request queue associated with
// the Actor run
const queue = await Actor.openRequestQueue();

// Open the 'my-queue' request queue
const queueWithName = await Actor.openRequestQueue('my-queue');

// ...
await Actor.exit();

Once a queue is open, you can manage it using the following methods. Check out the RequestQueue class's API reference for the full list.

// Import the JavaScript SDK into your project
import { Actor } from 'apify';

await Actor.init();
// ...

const queue = await Actor.openRequestQueue();

// Enqueue requests
await queue.addRequests([{ url: 'http://example.com/aaa' }]);
await queue.addRequests([
    'http://example.com/foo',
    'http://example.com/bar',
], { forefront: true });

// Get the next request from queue
const request1 = await queue.fetchNextRequest();
const request2 = await queue.fetchNextRequest();

// Get a specific request
const specificRequest = await queue.getRequest('shi6Nh3bfs3');

// Reclaim a failed request back to the queue
// and crawl it again
await queue.reclaimRequest(request2);

// Remove a queue
await queue.drop();

// ...
await Actor.exit();

Check out the JavaScript SDK documentation and the RequestQueue class's API reference for details on managing your request queues with the JavaScript SDK.

Python SDK

For Python Actor development, the Python SDK is essential. The request queue is represented by the RequestQueue class. You can use the class to specify whether your data is stored locally or in the Apify cloud and to enqueue new URLs.

Every Actor run is automatically connected to a default request queue, which is created for that run when the first request is added. The queue is typically used to store URLs to crawl during the respective Actor run, though its use is not mandatory. You can also create named queues, which can be shared among different Actors or across multiple Actor runs.

If you are storing your data locally, you can find your request queue at the following location.

{APIFY_LOCAL_STORAGE_DIR}/request_queues/{QUEUE_ID}/{ID}.json

The default request queue's ID is default. Each request in the queue is stored as a separate JSON file, where {ID} is a request ID.
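The path template above can be resolved with a few lines of standard-library Python. This is a minimal sketch: the request_file_path helper is hypothetical, and the './storage' fallback used when APIFY_LOCAL_STORAGE_DIR is not set is an assumption, not a documented default.

```python
import os
from pathlib import Path


def request_file_path(request_id, queue_id='default', storage_dir=None):
    """Build the local JSON file path for one request in a queue.

    Hypothetical helper. When `storage_dir` is not given, the
    APIFY_LOCAL_STORAGE_DIR environment variable is used; the './storage'
    fallback is an assumption made for this example.
    """
    base = storage_dir or os.environ.get('APIFY_LOCAL_STORAGE_DIR', './storage')
    return Path(base) / 'request_queues' / queue_id / f'{request_id}.json'


# Request 'shi6Nh3bfs3' in the default queue of a local storage directory
print(request_file_path('shi6Nh3bfs3', storage_dir='/tmp/apify_storage'))
```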

To open a request queue, use the Actor.open_request_queue() method.

from apify import Actor

async def main():
    async with Actor:
        # Open the default request queue associated with the Actor run
        queue = await Actor.open_request_queue()

        # Open the 'my-queue' request queue
        queue_with_name = await Actor.open_request_queue(name='my-queue')

        # ...

Once a queue is open, you can manage it using the following methods. See the RequestQueue class's API reference for the full list.

from apify import Actor
from apify.storages import RequestQueue

async def main():
    async with Actor:
        queue: RequestQueue = await Actor.open_request_queue()

        # Enqueue requests
        await queue.add_request(request={'url': 'http://example.com/aaa'})
        await queue.add_request(request={'url': 'http://example.com/foo'})
        await queue.add_request(request={'url': 'http://example.com/bar'}, forefront=True)

        # Get the next requests from queue
        request1 = await queue.fetch_next_request()
        request2 = await queue.fetch_next_request()

        # Get a specific request
        specific_request = await queue.get_request('shi6Nh3bfs3')

        # Reclaim a failed request back to the queue and crawl it again
        await queue.reclaim_request(request2)

        # Remove a queue
        await queue.drop()

Check out the Python SDK documentation and the RequestQueue class's API reference for details on managing your request queues with the Python SDK.

JavaScript API client

The Apify JavaScript API client (apify-client) enables you to access your request queues from any Node.js application, whether it is running on the Apify platform or externally.

After importing and initiating the client, you can save each request queue to a variable for easier access.

const myQueueClient = apifyClient.requestQueue('jane-doe/my-request-queue');

You can then use that variable to access the request queue's items and manage it.

Check out the JavaScript API client documentation for help with setup and more details.

Python API client

The Apify Python API client (apify-client) allows you to access your request queues from any Python application, whether it is running on the Apify platform or externally.

After importing and initiating the client, you can save each request queue to a variable for easier access.

my_queue_client = apify_client.request_queue('jane-doe/my-request-queue')

You can then use that variable to access the request queue's items and manage it.

Check out the Python API client documentation for help with setup and more details.

Apify API

The Apify API allows you programmatic access to your request queues using HTTP requests.

If you are accessing your request queues using the username~queue-name store ID format, you will need to use your secret API token. You can find the token (and your user ID) on the Integrations page of your Apify account.

When providing your API authentication token, we recommend using the request's Authorization header, rather than the URL. (More info).

To get a list of your request queues, send a GET request to the Get list of request queues endpoint.

https://api.apify.com/v2/request-queues
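The recommendation to send the token in the Authorization header can be sketched for this endpoint using only the Python standard library. The build_get_request helper and the MY_APIFY_TOKEN placeholder are hypothetical, and the example assumes the token is sent with the Bearer scheme. The request is built but not sent.

```python
import urllib.request

API_BASE = 'https://api.apify.com/v2'


def build_get_request(path, token):
    """Build (but do not send) a GET request with the token in a header."""
    return urllib.request.Request(
        f'{API_BASE}/{path}',
        # Assumption: the token is passed using the Bearer scheme
        headers={'Authorization': f'Bearer {token}'},
        method='GET',
    )


# 'MY_APIFY_TOKEN' is a placeholder, not a real token
list_queues = build_get_request('request-queues', 'MY_APIFY_TOKEN')
print(list_queues.get_method(), list_queues.full_url)

# To actually send the request:
# with urllib.request.urlopen(list_queues) as response:
#     print(response.read().decode('utf-8'))
```

Keeping the token out of the URL means it does not end up in places where URLs are commonly recorded, such as logs or browser history.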

To get information about a request queue such as its creation time and item count, send a GET request to the Get request queue endpoint.

https://api.apify.com/v2/request-queues/{QUEUE_ID}

To get a request from a queue, send a GET request to the Get request endpoint.

https://api.apify.com/v2/request-queues/{QUEUE_ID}/requests/{REQUEST_ID}

To add a request to a queue, send a POST request with the request to be added as a JSON object in the request's payload to the Add request endpoint.

https://api.apify.com/v2/request-queues/{QUEUE_ID}/requests

Example payload:

{
"uniqueKey": "http://example.com",
"url": "http://example.com",
"method": "GET"
}
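As a sketch, the same payload can be assembled and attached to a POST request with the Python standard library. The build_add_request helper is hypothetical; it assumes the Bearer scheme for the token and, mirroring the example payload, defaults uniqueKey to the URL. The request is built but not sent.

```python
import json
import urllib.request


def build_add_request(queue_id, url, token, method='GET', unique_key=None):
    """Build (but do not send) a POST request that adds `url` to a queue."""
    payload = {
        # Assumption: fall back to the URL itself, as in the example payload
        'uniqueKey': unique_key or url,
        'url': url,
        'method': method,
    }
    return urllib.request.Request(
        f'https://api.apify.com/v2/request-queues/{queue_id}/requests',
        data=json.dumps(payload).encode('utf-8'),
        headers={
            'Authorization': f'Bearer {token}',  # assumed Bearer scheme
            'Content-Type': 'application/json',
        },
        method='POST',
    )


# 'MY_QUEUE_ID' and 'MY_APIFY_TOKEN' are placeholders
add_request = build_add_request('MY_QUEUE_ID', 'http://example.com', 'MY_APIFY_TOKEN')
print(add_request.get_method(), add_request.full_url)
print(add_request.data.decode('utf-8'))
```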

To update a request in a queue, send a PUT request with the request to update as a JSON object in the request's payload to the Update request endpoint. In the payload, specify the request's ID and add the information you want to update.

https://api.apify.com/v2/request-queues/{QUEUE_ID}/requests/{REQUEST_ID}

Example payload:

{
"id": "dnjkDMKLmdlkmlkmld",
"uniqueKey": "http://example.com",
"url": "http://example.com",
"method": "GET"
}

When adding or updating requests, you can optionally provide a clientKey parameter to your request. It must be a string between 1 and 32 characters in length. This identifier is used to determine whether the queue was accessed by multiple clients. If clientKey is not provided, the system considers this API call to come from a new client. See the hadMultipleClients field returned by the Get head operation for details.

Example: client-abc
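The length rule for clientKey can be checked on the client before a call is made. The sketch below uses hypothetical helper names and assumes that clientKey is passed as a URL query parameter; consult the API documentation for the authoritative parameter location.

```python
from urllib.parse import urlencode


def is_valid_client_key(client_key):
    """Return True if `client_key` is a string of 1 to 32 characters."""
    return isinstance(client_key, str) and 1 <= len(client_key) <= 32


def with_client_key(endpoint_url, client_key):
    """Append clientKey to an endpoint URL (assumed to be a query parameter)."""
    if not is_valid_client_key(client_key):
        raise ValueError('clientKey must be a string of 1 to 32 characters')
    separator = '&' if '?' in endpoint_url else '?'
    query = urlencode({'clientKey': client_key})
    return f'{endpoint_url}{separator}{query}'


# 'MY_QUEUE_ID' is a placeholder
url = with_client_key(
    'https://api.apify.com/v2/request-queues/MY_QUEUE_ID/requests',
    'client-abc',
)
print(url)
```

Reusing one clientKey value for all calls made by the same process lets the platform treat them as coming from a single client.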

For further details and a breakdown of each storage API endpoint, refer to the API documentation.

Sharing

You can grant access rights to your request queue through the Share button under the Actions menu. For more details, check the full list of permissions.

Sharing request queues between runs

You can access a request queue from any Actor or task run as long as you know its name or ID.

To access a request queue from another run using the JavaScript SDK or the Python SDK, open it the same way you would open any other request queue.

import { Actor } from 'apify';

await Actor.init();

const otherQueue = await Actor.openRequestQueue('old-queue');
// ...

await Actor.exit();

In the JavaScript API client, as well as in the Python API client, you can access a request queue using its respective client. Once you've opened the request queue, you can use it in your crawler or add new requests as you would with a queue from your current run.

const otherQueueClient = apifyClient.requestQueue('jane-doe/old-queue');

The same applies to the Apify API: you can use the same endpoints as you normally would.

Check out the Storage overview for details on sharing storages between runs.

Limits

  • While multiple Actor or task runs can add new requests to a queue concurrently, only one run can process a queue at any one time.

  • The maximum length for request queue names is 63 characters.
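The name length limit can be checked before a named queue is opened. This is a minimal sketch with a hypothetical check_queue_name helper; it only enforces the 63-character limit stated above and does not cover any other naming rules the platform may apply.

```python
MAX_QUEUE_NAME_LENGTH = 63  # limit stated in this section


def check_queue_name(name):
    """Return `name` unchanged, or raise ValueError if it is empty or too long."""
    if not name:
        raise ValueError('Queue name must not be empty')
    if len(name) > MAX_QUEUE_NAME_LENGTH:
        raise ValueError(
            f'Queue name has {len(name)} characters, '
            f'the maximum is {MAX_QUEUE_NAME_LENGTH}'
        )
    return name


print(check_queue_name('my-queue'))
```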

Rate limiting

When managing request queues via API, CRUD (add, get, update, delete) operation requests are limited to 200 requests per second per request queue. This helps protect Apify servers from being overloaded.

All other request queue API endpoints are limited to 30 requests per second per request queue.

Check out the API documentation for more information and guidance on actions to take if you exceed these rate limits.
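One way to stay under these limits is to space out calls on the client side. The sketch below is a simple single-threaded throttle, not something provided by Apify: the Throttle class is hypothetical, and only the two limit values come from this section. The clock and sleep functions are injectable so the class can be tried out without real waiting.

```python
import time

CRUD_LIMIT_PER_SECOND = 200   # add, get, update, delete operations
OTHER_LIMIT_PER_SECOND = 30   # all other request queue endpoints


class Throttle:
    """Hypothetical helper that spaces calls at least 1/rate seconds apart."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed. Returns the delay in seconds."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # The call starts at now + delay; the following one must wait an interval
        self._next_allowed = now + delay + self._interval
        return delay


# Usage: call wait() before each API request to the same queue.
# crud_throttle = Throttle(CRUD_LIMIT_PER_SECOND)
# crud_throttle.wait()
# ... send an add, get, update, or delete request here ...
```

Because the limits apply per request queue, a separate Throttle instance per queue (and per limit class) matches the rules more closely than one shared instance.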