Apify API

UPDATE 2025-01-14: We have rolled out this new Apify API Documentation. In case of any issues, please report here. The old API Documentation is still available here.

The Apify API (version 2) provides programmatic access to the Apify platform. The API is organized around RESTful HTTP endpoints.

You can download the complete OpenAPI schema of Apify API in the YAML or JSON formats. The source code is also available on GitHub.

All requests and responses (including errors) are encoded in JSON format with UTF-8 encoding, with a few exceptions that are explicitly described in the reference.

To access the API using Node.js, we recommend the apify-client NPM package.

To access the API using Python, we recommend the apify-client PyPI package. The clients' functions correspond to the API endpoints and have the same parameters. This simplifies development of apps that depend on the Apify platform.

Note: All requests with JSON payloads must specify the Content-Type: application/json HTTP header.

All API endpoints support the method query parameter, which overrides the HTTP method. For example, to call a POST endpoint using a GET request, add the query parameter method=POST to the URL and send the GET request. This is especially useful for calling Apify API endpoints from services that can only send GET requests.
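As a rough illustration of the two points above, the following Python sketch builds a POST request with a JSON body and the Content-Type header, and separately appends method=POST to a URL so that it can be sent as a GET request. The helper names build_json_request and with_method_override are hypothetical and not part of any Apify client; only the standard library is used.

```python
import json
import urllib.parse
import urllib.request


def build_json_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a POST request whose JSON body is declared via Content-Type."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def with_method_override(url: str, method: str) -> str:
    """Append the method query parameter, keeping any existing parameters."""
    parts = urllib.parse.urlsplit(url)
    query = urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
    query.append(("method", method))
    new_query = urllib.parse.urlencode(query)
    return urllib.parse.urlunsplit(parts._replace(query=new_query))
```

The URL returned by with_method_override can then be requested with a plain GET from a service that cannot send other HTTP methods.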

Authentication

You can find your API token on the Integrations page in the Apify Console.

To use your token in a request, either:

  • Add the token to your request's Authorization header as Bearer <token>, e.g. Authorization: Bearer xxxxxxx (recommended).
  • Add it as the token parameter to your request URL (less secure).

Using your token in the request header is more secure than using it as a URL parameter because URLs are often stored in browser history and server logs. This creates a chance for someone unauthorized to access your API token.
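A minimal sketch of the recommended option, assuming the token is passed in by the caller: the token goes into the Authorization header and never into the URL. The function name authorized_request is made up for this example.

```python
import urllib.request


def authorized_request(url: str, token: str) -> urllib.request.Request:
    """Attach the API token as a Bearer token instead of a URL parameter."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
    )
```

Because the URL stays free of the token, it can be logged or shared without exposing the credential.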

Do not share your API token or password with untrusted parties.

For more information, see our integrations documentation.

Basic usage

To run an Actor, send a POST request to the Run Actor endpoint using either the Actor ID code (e.g. vKg4IjxZbEYTYeW8T) or its name (e.g. janedoe~my-actor):

https://api.apify.com/v2/acts/[actor_id]/runs

If the Actor is not runnable anonymously, you will receive a 401 or 403 response code. This means you need to add your secret API token to the request's Authorization header (recommended) or as a URL query parameter ?token=[your_token] (less secure).

Optionally, you can include the query parameters described in the Run Actor section to customize your run.
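The request described above might look like the following Python sketch. It assumes that the JSON body of the POST request is used as the Actor's input and that the response follows the usual data wrapper; both should be checked against the Run Actor reference. The start_run function and its send parameter are hypothetical; send exists only so the HTTP call can be replaced in tests.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Optional


def start_run(
    actor: str,
    token: Optional[str] = None,
    run_input: Optional[dict] = None,
    send: Callable = urllib.request.urlopen,
) -> dict:
    """Start an Actor run by ID or by name (e.g. janedoe~my-actor)."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(
        f"https://api.apify.com/v2/acts/{actor}/runs",
        data=json.dumps(run_input or {}).encode("utf-8"),  # assumed: body is the input
        method="POST",
        headers=headers,
    )
    try:
        with send(request) as response:
            return json.load(response)["data"]
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            # The Actor cannot be run anonymously or the token lacks access.
            raise PermissionError("A valid API token is required for this Actor") from err
        raise
```

Calling start_run("janedoe~my-actor", token="xxxxxxx") would then return the run object from the response.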

If you're using Node.js, the best way to run an Actor is using the Apify.call() method from the Apify SDK. It runs the Actor using the account you are currently logged into (determined by the secret API token). The result is an Actor run object and its output (if any).

A typical workflow is as follows:

  1. Run an Actor or task using the Run Actor or Run task API endpoints.
  2. Monitor the Actor run by periodically polling its progress using the Get run API endpoint.
  3. Fetch the results from the Get items API endpoint using the defaultDatasetId, which you receive in the Run request response. Additional data may be stored in a key-value store. You can fetch them from the Get record API endpoint using the defaultKeyValueStoreId and the store's key.
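Steps 2 and 3 of the workflow above can be sketched as follows. The endpoint paths (/actor-runs/{runId} and /datasets/{datasetId}/items), the status property and the set of terminal status values are assumptions for this example and should be verified against the Get run and Get items reference. get_json is a hypothetical caller-supplied function that performs an authenticated GET and returns the parsed JSON.

```python
import time
from typing import Any, Callable

API_BASE = "https://api.apify.com/v2"

# Assumed names of the statuses in which a run is finished.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(
    run_id: str,
    get_json: Callable[[str], Any],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the run until it reaches a terminal status, then return it."""
    while True:
        run = get_json(f"{API_BASE}/actor-runs/{run_id}")["data"]
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(poll_seconds)


def fetch_items(run: dict, get_json: Callable[[str], Any]) -> Any:
    """Fetch the items of the run's default dataset."""
    dataset_id = run["defaultDatasetId"]
    # Get items is one of the endpoints that does not wrap its output in data.
    return get_json(f"{API_BASE}/datasets/{dataset_id}/items")
```

Passing the HTTP function in as a parameter keeps the polling logic independent of how authentication and retries are handled.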

Note: Instead of periodic polling, you can also run your Actor or task synchronously. The request then waits up to 300 seconds (5 minutes) for the run to finish and returns its output. If the run takes longer, the request times out and returns an error.

Response structure

Most API endpoints return a JSON object with the data property:

{
  "data": {
    ...
  }
}

However, there are a few explicitly described exceptions, such as Dataset Get items or Key-value store Get record API endpoints, which return data in other formats.

In case of an error, the response has the HTTP status code in the range of 4xx or 5xx and the data property is replaced with error. For example:

{
  "error": {
    "type": "record-not-found",
    "message": "Store was not found."
  }
}

See Errors for more details.
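One way a client could handle this structure is sketched below: return the data property on success and raise an exception built from the error property otherwise. ApiError and unwrap are hypothetical names, and the sketch applies only to endpoints that use the standard JSON wrapper.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status plus the type and message from the error property."""

    def __init__(self, status: int, error_type: str, message: str):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.type = error_type
        self.message = message


def unwrap(status: int, body: str):
    """Return the data property, or raise ApiError for 4xx and 5xx responses."""
    payload = json.loads(body)
    if 400 <= status < 600:
        error = payload.get("error", {})
        raise ApiError(status, error.get("type", "unknown"), error.get("message", ""))
    return payload["data"]
```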

Pagination

All API endpoints that return a list of records (e.g. Get list of Actors) enforce pagination in order to limit the size of their responses.

Most of these API endpoints are paginated using the offset and limit query parameters. The only exception is Get list of keys, which is paginated using the exclusiveStartKey query parameter.

IMPORTANT: Each API endpoint that supports pagination enforces a certain maximum value for the limit parameter, in order to reduce the load on Apify servers. The maximum limit may change in the future, so never rely on a specific value; instead, check the limit returned in the responses of these API endpoints.

Using offset

Most API endpoints that return a list of records enable pagination using the following query parameters:

Parameter | Description
limit | Limits the response to contain a specific maximum number of items, e.g. limit=20.
offset | Skips a number of items from the beginning of the list, e.g. offset=100.
desc | By default, items are sorted in the order in which they were created or added to the list. This feature is useful when fetching all the items, because it ensures that items created after the client started the pagination will not be skipped. If you specify the desc=1 parameter, the items will be returned in the reverse order, i.e. from the newest to the oldest items.

The response of these API endpoints is always a JSON object with the following structure:

{
  "data": {
    "total": 2560,
    "offset": 250,
    "limit": 1000,
    "count": 1000,
    "desc": false,
    "items": [
      { 1st object },
      { 2nd object },
      ...
      { 1000th object }
    ]
  }
}

The following table describes the meaning of the response properties:

Property | Description
total | The total number of items available in the list.
offset | The number of items that were skipped at the start. This is equal to the offset query parameter if it was provided, otherwise it is 0.
limit | The maximum number of items that can be returned in the HTTP response. It equals the limit query parameter if it was provided or the maximum limit enforced for the particular API endpoint, whichever is smaller.
count | The actual number of items returned in the HTTP response.
desc | true if data were requested in descending order and false otherwise.
items | An array of requested items.
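Fetching a complete list with offset and limit might look like the following sketch. It advances the offset by the count reported in each response rather than by the requested limit, since the server may enforce a smaller maximum. fetch_page is a hypothetical caller-supplied function that requests one page and returns the parsed JSON response.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[..., dict], limit: int = 1000) -> Iterator:
    """Yield every item of a paginated list, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, limit=limit)["data"]
        yield from page["items"]
        # Use the values the server reports, not the ones that were requested.
        offset = page["offset"] + page["count"]
        if page["count"] == 0 or offset >= page["total"]:
            break
```

With the default ascending order, items added while the loop is running appear at the end of the list, so they are picked up by later pages instead of being skipped.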

Using key

The records in the key-value store are not ordered based on numerical indexes, but rather by their keys in the UTF-8 binary order. Therefore the Get list of keys API endpoint only supports pagination using the following query parameters:

Parameter | Description
limit | Limits the response to contain a specific maximum number of items, e.g. limit=20.
exclusiveStartKey | Skips all records with keys up to the given key including the given key, in the UTF-8 binary order.

The response of the API endpoint is always a JSON object with the following structure:

{
  "data": {
    "limit": 1000,
    "isTruncated": true,
    "exclusiveStartKey": "my-key",
    "nextExclusiveStartKey": "some-other-key",
    "items": [
      { 1st object },
      { 2nd object },
      ...
      { 1000th object }
    ]
  }
}

The following table describes the meaning of the response properties:

Property | Description
limit | The maximum number of items that can be returned in the HTTP response. It equals the limit query parameter if it was provided or the maximum limit enforced for the particular endpoint, whichever is smaller.
isTruncated | true if there are more items left to be queried. Otherwise false.
exclusiveStartKey | The last key that was skipped at the start. Is null for the first page.
nextExclusiveStartKey | The value for the exclusiveStartKey parameter to query the next page of items.
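Key-based pagination can be sketched in the same spirit: keep requesting pages while isTruncated is true, passing nextExclusiveStartKey from the previous response as the next exclusiveStartKey. As before, fetch_page is a hypothetical function supplied by the caller; its keyword argument names are chosen for this example only.

```python
from typing import Callable, Iterator, Optional


def iter_keys(fetch_page: Callable[..., dict], limit: int = 1000) -> Iterator:
    """Yield every item returned by Get list of keys, page by page."""
    start_key: Optional[str] = None  # None requests the first page
    while True:
        page = fetch_page(exclusive_start_key=start_key, limit=limit)["data"]
        yield from page["items"]
        if not page["isTruncated"]:
            break
        start_key = page["nextExclusiveStartKey"]
```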

Errors

The Apify API uses common HTTP status codes: 2xx range for success, 4xx range for errors caused by the caller (invalid requests) and 5xx range for server errors (these are rare). Each error response contains a JSON object defining the error property, which is an object with the type and message properties that contain the error code and a human-readable error description, respectively.

For example:

{
  "error": {
    "type": "record-not-found",
    "message": "Store was not found."
  }
}

Here is the table of the most common errors that can occur for many API endpoints:

status | type | message
400 | invalid-request | POST data must be a JSON object
400 | invalid-value | Invalid value provided: Comments required
400 | invalid-record-key | Record key contains invalid character
401 | token-not-provided | Authentication token was not provided
404 | record-not-found | Store was not found
429 | rate-limit-exceeded | You have exceeded the rate limit of 30 requests per second
405 | method-not-allowed | This API endpoint can only be accessed using the following HTTP methods: OPTIONS, POST

Rate limiting

All API endpoints limit the rate of requests in order to prevent overloading of Apify servers by misbehaving clients.

There are two kinds of rate limits - a global rate limit and a per-resource rate limit.

Global rate limit

The global rate limit is set to 250 000 requests per minute. For authenticated requests, it is counted per user, and for unauthenticated requests, it is counted per IP address.

Per-resource rate limit

The default per-resource rate limit is 30 requests per second per resource, where a resource means a single Actor, a single Actor run, a single dataset, a single key-value store, and so on. The default rate limit is applied to every API endpoint except a few select ones, which have higher rate limits. Each API endpoint returns its rate limit in the X-RateLimit-Limit header.

These endpoints have a rate limit of 100 requests per second per resource:

  • CRUD (get, put, delete) operations on key-value store records

These endpoints have a rate limit of 200 requests per second per resource:

Rate limit exceeded errors

If the client is sending too many requests, the API endpoints respond with the HTTP status code 429 Too Many Requests and the following body:

{
  "error": {
    "type": "rate-limit-exceeded",
    "message": "You have exceeded the rate limit of ... requests per second"
  }
}

Retrying rate-limited requests with exponential backoff

If the client receives the rate limit error, it should wait a certain period of time and then retry the request. If the error happens again, the client should double the wait period and retry the request, and so on. This algorithm is known as exponential backoff and it can be described using the following pseudo-code:

  1. Define a variable DELAY=500
  2. Send the HTTP request to the API endpoint
  3. If the response has status code not equal to 429 then you are done. Otherwise:
    • Wait for a period of time chosen randomly from the interval DELAY to 2*DELAY milliseconds
    • Double the future wait period by setting DELAY = 2*DELAY
    • Continue with step 2

If all requests sent by the client implement the above steps, the client will automatically use the maximum available bandwidth for its requests.
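The pseudo-code above could be translated into Python roughly as follows. The send argument is a hypothetical caller-supplied function that performs one HTTP request and returns a (status, body) pair. The max_attempts cap is an addition that is not part of the pseudo-code; it is included here only so that the loop cannot run forever.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    initial_delay_ms: int = 500,
    max_attempts: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry a request while it returns 429, doubling the delay each time."""
    delay = initial_delay_ms  # step 1: DELAY=500
    for attempt in range(1, max_attempts + 1):
        status, body = send()  # step 2: send the request
        if status != 429:  # step 3: any other status ends the loop
            return status, body
        if attempt == max_attempts:
            break
        # Wait a random time between DELAY and 2*DELAY milliseconds.
        sleep(random.uniform(delay, 2 * delay) / 1000)
        delay *= 2  # double the future wait period
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")
```

The random component spreads out retries from clients that were rate limited at the same moment, so they do not all retry in lockstep.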

Note that the Apify API clients for JavaScript and for Python use the exponential backoff algorithm transparently, so that you do not need to worry about it.

Referring to resources

There are three main ways to refer to a resource you're accessing via the API:

  • the resource ID (e.g. iKkPcIgVvwmztduf8)
  • username~resourcename - when using this access method, you will need to use your API token, and access will only work if you have the correct permissions.
  • ~resourcename - for this, you need to use an API token, and the resourcename refers to a resource in the API token owner's account.
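The three forms listed above can be produced by a small helper such as the one below. resource_ref is a hypothetical function written for this example; it only formats the identifier and does not check permissions or whether the resource exists.

```python
from typing import Optional


def resource_ref(
    *,
    resource_id: Optional[str] = None,
    name: Optional[str] = None,
    username: Optional[str] = None,
) -> str:
    """Format a resource reference for use in an API URL."""
    if resource_id is not None:
        return resource_id  # plain resource ID
    if name is None:
        raise ValueError("Provide either resource_id or name")
    # With no username, ~name refers to the API token owner's account.
    return f"{username or ''}~{name}"
```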

Authentication

API authentication token.

Security Scheme Type: http

HTTP Authorization Scheme: bearer