Actor input schema specification
Learn how to define and validate a schema for your Actor's input with code examples. Provide an autogenerated input UI for your Actor's users.
The Actor input schema serves three main purposes:
- It ensures the input data supplied to the Actor adheres to specified requirements and validation rules.
- It is used by the Apify platform to generate a user-friendly interface for configuring and running your Actor.
- It simplifies invoking your Actors from external systems by generating calling code and connectors for integrations.
To define an input schema for an Actor, set the input field in the .actor/actor.json file to an input schema object (described below), or to a path to a JSON file containing the input schema object.
For backwards compatibility, if the input field is omitted, the system looks for an INPUT_SCHEMA.json file either in the .actor directory or the Actor's top-level directory. Note that this functionality is deprecated and might be removed in the future. The maximum allowed size for the input schema file is 500 kB.
When you provide an input schema, the Apify platform validates the input data passed to the Actor on start (via the API or Apify Console) to ensure compliance before starting the Actor. If the input object doesn't conform to the schema, the caller receives an error and the Actor is not started.
You can use our visual input schema editor to guide you through the creation of the INPUT_SCHEMA.json file.
To ensure the input schema is valid, here's a corresponding JSON schema file.
You can also use the apify validate-schema command in the Apify CLI.
Example
Imagine a simple web crawler that accepts an array of start URLs and a JavaScript function to execute on each visited page. The input schema for such a crawler could be defined as follows:
{
"title": "Cheerio Crawler input",
"description": "To update crawler to another site, you need to change startUrls and pageFunction options!",
"type": "object",
"schemaVersion": 1,
"properties": {
"startUrls": {
"title": "Start URLs",
"type": "array",
"description": "URLs to start with",
"prefill": [
{ "url": "http://example.com" },
{ "url": "http://example.com/some-path" }
],
"editor": "requestListSources"
},
"pageFunction": {
"title": "Page function",
"type": "string",
"description": "Function executed for each request",
"prefill": "async () => { return $('title').text(); }",
"editor": "javascript"
}
},
"required": ["startUrls", "pageFunction"]
}
The generated input UI will be:
If you switch the input to the JSON display using the toggle, you will see the entered input stringified to JSON, as it will be passed to the Actor:
{
"startUrls": [
{
"url": "http://example.com"
},
{
"url": "http://example.com/some-path"
}
],
"pageFunction": "async () => { return $('title').text(); }"
}
Structure
{
"title": "Cheerio Crawler input",
"type": "object",
"schemaVersion": 1,
"properties": { /* define input fields here */ },
"required": []
}
Property | Type | Required | Description |
---|---|---|---|
title | String | Yes | Any text describing your input schema. |
description | String | No | Help text for the input that will be displayed above the UI fields. |
type | String | Yes | This is fixed and must be set to the string object. |
schemaVersion | Integer | Yes | The version of the input schema specification against which your schema is written. Currently, only version 1 is out. |
properties | Object | Yes | This is an object mapping each field key to its specification. |
required | [String] | No | An array of field keys that are required. |
Even though the structure of the Actor input schema is similar to JSON schema, there are some differences. We cannot guarantee that JSON schema tooling will work on input schema documents. For a more precise technical understanding of the matter, feel free to browse the code of the @apify/input_schema package.
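The top-level rules from the table above can be sketched as a small check. This is only an illustration in Python, not the platform's validator and not the @apify/input_schema package; the function name check_top_level is hypothetical, and it covers only the rules listed in the table.

```python
def check_top_level(schema: dict) -> list[str]:
    """Return a list of problems found in the top-level schema object."""
    problems = []

    # title, type, schemaVersion and properties are marked as required.
    for key in ("title", "type", "schemaVersion", "properties"):
        if key not in schema:
            problems.append(f"missing required property: {key}")

    # type is fixed to the string "object".
    if "type" in schema and schema["type"] != "object":
        problems.append('type must be the string "object"')

    # Only version 1 of the specification exists.
    if "schemaVersion" in schema and schema["schemaVersion"] != 1:
        problems.append("schemaVersion must be 1")

    # properties maps each field key to its specification.
    if "properties" in schema and not isinstance(schema["properties"], dict):
        problems.append("properties must be an object")

    # required, when present, is an array of field keys.
    required = schema.get("required", [])
    if not isinstance(required, list) or not all(isinstance(k, str) for k in required):
        problems.append("required must be an array of strings")

    return problems


skeleton = {
    "title": "Cheerio Crawler input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {},
    "required": [],
}
print(check_top_level(skeleton))
```

An empty list means the skeleton satisfies the rules checked here; it says nothing about the individual field definitions.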
Fields
Each field of your input is described under its key in the inputSchema.properties object. The field might have integer, string, array, object, or boolean type, and its specification contains the following properties:
Property | Value | Required | Description |
---|---|---|---|
type | One of string, array, object, boolean, integer | Yes | Allowed type for the input value. Cannot be mixed. |
title | String | Yes | Title of the field in UI. |
description | String | Yes | Description of the field that will be displayed as help text in Actor input UI. |
default | Must match type property. | No | Default value that will be used when no value is provided. |
prefill | Must match type property. | No | Value that will be prefilled in the Actor input interface. Only the boolean type doesn't support prefill property. |
example | Must match type property. | No | Sample value of this field for the Actor to be displayed when Actor is published in Apify Store. |
sectionCaption | String | No | If this property is set, then all fields following this field (this field included) will be separated into a collapsible section with the value set as its caption. The section ends at the last field or the next field which has the sectionCaption property set. |
sectionDescription | String | No | If the sectionCaption property is set, then you can use this property to provide additional description to the section. The description will be visible right under the caption when the section is open. |
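
The sectionCaption rule can be read as a simple grouping pass over the fields. The sketch below is a hypothetical Python helper (group_into_sections is not part of any Apify library) and assumes that the field order is the order of keys in the properties object. Fields that come before the first caption are returned with a caption of None.

```python
from typing import Optional


def group_into_sections(properties: dict) -> list[tuple[Optional[str], list[str]]]:
    """Group field keys into (caption, keys) sections, preserving order."""
    sections: list[tuple[Optional[str], list[str]]] = []
    for key, spec in properties.items():
        caption = spec.get("sectionCaption")
        if caption is not None or not sections:
            # A field with sectionCaption opens a new section (and belongs to it).
            # Fields before the first caption form a leading group without caption.
            sections.append((caption, [key]))
        else:
            # Otherwise the field continues the section opened most recently.
            sections[-1][1].append(key)
    return sections


example_properties = {
    "startUrls": {"title": "Start URLs"},
    "maxPages": {"title": "Max pages", "sectionCaption": "Advanced"},
    "debug": {"title": "Debug"},
    "proxy": {"title": "Proxy", "sectionCaption": "Proxy settings"},
}
print(group_into_sections(example_properties))
```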
Prefill vs. default vs. required
Here is a rule of thumb for whether an input field should have a prefill, default, or be required:
- Prefill - Use for fields that don't have a reasonable default. The provided value is prefilled for the user to show them an example of using the field and to make it easy to test the Actor (e.g., search keyword, start URLs). In other words, this field is only used in the user interface but does not affect the Actor functionality and API.
- Required - Use for fields that don't have a reasonable default and MUST be entered by the user (e.g., API token, password).
- Default - Use for fields that MUST be set for the Actor run to some value, but where you don't need the user to change the default behavior (e.g., max pages to crawl, proxy settings). If the user omits the value when starting the Actor via any means (API, CLI, scheduler, or user interface), the platform automatically passes the Actor this default value.
- No particular setting - Use for purely optional fields where it makes no sense to prefill any value (e.g., flags like debug mode or download files).
In summary, you can use each option independently or use a combination of Prefill + Required, but the combinations Prefill + Default or Default + Required don't make sense to use.
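To make the difference concrete, here is a minimal Python sketch of the behavior described above. It is an illustration only, not the platform's implementation; the helper names apply_defaults and missing_required and the sample field names are hypothetical.

```python
def apply_defaults(schema: dict, user_input: dict) -> dict:
    """Return a copy of user_input with defaults filled in for omitted fields."""
    result = dict(user_input)
    for key, spec in schema.get("properties", {}).items():
        # Only "default" affects the input the Actor receives.
        # "prefill" is a UI hint and is deliberately ignored here.
        if key not in result and "default" in spec:
            result[key] = spec["default"]
    return result


def missing_required(schema: dict, user_input: dict) -> list[str]:
    """Return the required field keys that the caller did not provide."""
    return [key for key in schema.get("required", []) if key not in user_input]


sample_schema = {
    "properties": {
        "keyword": {"type": "string", "prefill": "apify"},   # prefill only
        "maxPages": {"type": "integer", "default": 10},      # default
        "apiToken": {"type": "string"},                      # required
    },
    "required": ["apiToken"],
}

print(apply_defaults(sample_schema, {"apiToken": "secret"}))
print(missing_required(sample_schema, {"keyword": "crawler"}))
```

In this sketch a call that omits maxPages still ends up with the value 10, while the prefilled keyword never appears unless the caller sends it.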
Additional properties
Most types also support additional properties defining, for example, the UI input editor.
String
Example of a code input:
{
"title": "Page function",
"type": "string",
"description": "Function executed for each request",
"editor": "javascript",
"prefill": "async () => { return $('title').text(); }"
}
Rendered input:
Example of country selection using a select input:
{
"title": "Country",
"type": "string",
"description": "Select your country",
"editor": "select",
"default": "us",
"enum": ["us", "de", "fr"],
"enumTitles": ["USA", "Germany", "France"]
}
Rendered input:
Example of date selection using absolute and relative datepicker editor:
{
"absoluteDate": {
"title": "Date",
"type": "string",
"description": "Select absolute date in format YYYY-MM-DD",
"editor": "datepicker",
"pattern": "^(\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$"
},
"relativeDate": {
"title": "Relative date",
"type": "string",
"description": "Select relative date in format: {number} {unit}",
"editor": "datepicker",
"dateType": "relative",
"pattern": "^(\\d+)\\s*(day|week|month|year)s?$"
},
"anyDate": {
"title": "Any date",
"type": "string",
"description": "Select date in format YYYY-MM-DD or {number} {unit}",
"editor": "datepicker",
"dateType": "absoluteOrRelative",
"pattern": "^(\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$|^(\\d+)\\s*(day|week|month|year)s?$"
}
}
The absoluteDate property renders a date picker that allows selection of a specific date and returns a string value in YYYY-MM-DD format. Validation is ensured thanks to the pattern field. In this case the dateType property is omitted, as it defaults to "absolute".
The relativeDate property renders an input field that enables the user to choose the relative date and returns the value in {number} {unit} format, for example "2 days". The dateType parameter is set to "relative" to restrict input to relative dates only.
The anyDate property renders a date picker that accepts both absolute and relative dates. The Actor author is responsible for parsing and interpreting the selected date format.
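As a rough idea of what that parsing could look like on the Actor side, here is a sketch in Python that reuses the two patterns from the example above. The function name parse_date_input is hypothetical. How a relative value should be resolved (for instance relative to which moment, and in which direction) is not specified here, so the sketch only returns the amount and unit.

```python
import re
from datetime import date
from typing import Union

# Same patterns as in the absoluteDate and relativeDate fields above.
ABSOLUTE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")
RELATIVE = re.compile(r"^(\d+)\s*(day|week|month|year)s?$")


def parse_date_input(value: str) -> Union[date, tuple[int, str]]:
    """Return a date for absolute input, or (amount, unit) for relative input."""
    match = ABSOLUTE.match(value)
    if match:
        year, month, day = (int(part) for part in match.groups())
        # date() raises ValueError for impossible dates such as 2024-02-31,
        # which the pattern alone does not reject.
        return date(year, month, day)

    match = RELATIVE.match(value)
    if match:
        return int(match.group(1)), match.group(2)

    raise ValueError(f"unsupported date value: {value!r}")


print(parse_date_input("2024-05-01"))
print(parse_date_input("2 days"))
```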
Properties:
Property | Value | Required | Description |
---|---|---|---|
editor | One of the supported string editors, for example textfield, textarea, javascript, select, or datepicker | Yes | Visual editor used for the input field. |
pattern | String | No | Regular expression that will be used to validate the input. If validation fails, the Actor will not run. |
minLength | Integer | No | Minimum length of the string. |
maxLength | Integer | No | Maximum length of the string. |
enum | [String] | Required if editor is select | Using this field, you can limit values to the given array of strings. Input will be displayed as select box. |
enumTitles | [String] | No | Titles for the enum keys described. |
nullable | Boolean | No | Specifies whether null is an allowed value. |
isSecret | Boolean | No | Specifies whether the input field will be stored encrypted. Only available with textfield and textarea editors. |
dateType | One of absolute, relative, absoluteOrRelative | No | Only available with the datepicker editor. Specifies which date format the visual editor accepts (the JSON editor accepts any string without validation). Defaults to absolute. |
When using escape characters \ for the regular expression in the pattern field, be sure to escape them to avoid invalid JSON issues. For example, the regular expression https:\/\/(www\.)?apify\.com\/.+ would become https:\\/\\/(www\\.)?apify\\.com\\/.+.
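One way to avoid escaping by hand is to let a JSON serializer produce the string literal. The short Python sketch below uses only the standard json and re modules and shows that the doubled backslashes are purely a JSON encoding detail: after parsing, the regular expression is unchanged.

```python
import json
import re

# The regular expression as you would write it outside of JSON.
pattern = r"https:\/\/(www\.)?apify\.com\/.+"

# json.dumps() produces the JSON string literal, doubling each backslash.
json_literal = json.dumps(pattern)
print(json_literal)  # prints "https:\\/\\/(www\\.)?apify\\.com\\/.+"

# Parsing the JSON literal gives back the original regular expression.
restored = json.loads(json_literal)
print(restored == pattern)
print(bool(re.match(restored, "https://www.apify.com/store")))
```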
Advanced date and time handling
While the datepicker editor doesn't support setting time values visually, you can allow users to handle more complex datetime formats and pass them via JSON. The following regex allows users to optionally extend the date with full ISO datetime format or pass hours and minutes as a relative date:
"pattern": "^(\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])(T[0-2]\\d:[0-5]\\d(:[0-5]\\d)?(\\.\\d+)?Z?)?$|^(\\d+)\\s*(minute|hour|day|week|month|year)s?$"
When implementing time-based fields, make sure to explain to your users through the description that the time values should be provided in UTC. This helps prevent timezone-related issues.
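Before publishing a pattern like this, it can help to try it against a few sample values. The Python sketch below loads the pattern exactly as written above (JSON-escaped) and tests it with the standard re module. It assumes the pattern behaves the same in Python's regex engine as on the platform, which should hold for the basic constructs used here; the sample values are made up for illustration.

```python
import json
import re

# The "pattern" property as it appears in the schema, including JSON escaping.
field = json.loads(r'''
{
  "pattern": "^(\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])(T[0-2]\\d:[0-5]\\d(:[0-5]\\d)?(\\.\\d+)?Z?)?$|^(\\d+)\\s*(minute|hour|day|week|month|year)s?$"
}
''')
pattern = re.compile(field["pattern"])

samples = [
    "2024-05-01",                # plain date
    "2024-05-01T12:30",          # date with hours and minutes
    "2024-05-01T12:30:15.250Z",  # full ISO datetime in UTC
    "3 hours",                   # relative value with a time unit
    "2024-13-01",                # invalid month
    "12:30",                     # time without a date
]

# Map each sample to True (accepted) or False (rejected).
results = {value: bool(pattern.match(value)) for value in samples}
for value, accepted in results.items():
    print(f"{value!r}: {'accepted' if accepted else 'rejected'}")
```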
Boolean
Example options with group caption:
{
"verboseLog": {
"title": "Verbose log",
"type": "boolean",
"description": "Debug messages will be included in the log.",
"default": true,
"groupCaption": "Options",
"groupDescription": "Various options for this Actor"
},
"lightspeed": {
"title": "Lightspeed",
"type": "boolean",
"description": "If checked, the Actor runs at the speed of light.",
"prefill": true
}
}
Rendered input:
Properties:
Property | Value | Required | Description |
---|---|---|---|
editor | One of the supported boolean editors | No | Visual editor used for the input field. |
groupCaption | String | No | If you want to group multiple checkboxes together, add this option to the first of the group. |
groupDescription | String | No | Description of the group, displayed as help text with the group title. |
nullable | Boolean | No | Specifies whether null is an allowed value. |
Integer
Example:
{
"title": "Memory",
"type": "integer",
"description": "Select memory in megabytes",
"default": 64,
"maximum": 1024,
"unit": "MB"
}
Rendered input:
Properties:
Property | Value | Required | Description |
---|---|---|---|
editor | One of the supported integer editors | No | Visual editor used for the input field. |
maximum | Integer | No | Maximum allowed value. |
minimum | Integer | No | Minimum allowed value. |
unit | String | No | Unit displayed next to the field in UI, for example second, MB, etc. |
nullable | Boolean | No | Specifies whether null is an allowed value. |
Object
Example of proxy configuration:
{
"title": "Proxy configuration",
"type": "object",
"description": "Select proxies to be used by your crawler.",
"prefill": { "useApifyProxy": true },
"editor": "proxy"
}
Rendered input:
The object where the proxy configuration is stored has the following structure:
{
// Indicates whether Apify Proxy was selected.
"useApifyProxy": Boolean,
// Array of Apify Proxy groups. Is missing or null if
// Apify Proxy's automatic mode was selected
// or if proxies are not used.
"apifyProxyGroups": String[],
// Array of custom proxy URLs.
// Is missing or null if custom proxies were not used.
"proxyUrls": String[],
}
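An Actor receiving this object has to decide which proxy mode was chosen. The following Python sketch follows the comments in the structure above; the function name proxy_mode, the returned labels, and the sample group name and URL are hypothetical, and treating an empty apifyProxyGroups array like a missing one is an assumption.

```python
from typing import Optional


def proxy_mode(config: Optional[dict]) -> str:
    """Classify a proxy configuration object into a short label."""
    if not config:
        return "none"

    if config.get("useApifyProxy"):
        # Groups are missing or null when automatic mode was selected.
        # Assumption: an empty array is treated the same way.
        groups = config.get("apifyProxyGroups")
        return "apify-groups" if groups else "apify-auto"

    # proxyUrls is missing or null when custom proxies were not used.
    if config.get("proxyUrls"):
        return "custom"

    return "none"


print(proxy_mode({"useApifyProxy": True}))
print(proxy_mode({"useApifyProxy": True, "apifyProxyGroups": ["GROUP_A"]}))
print(proxy_mode({"useApifyProxy": False, "proxyUrls": ["http://proxy.example.com:8000"]}))
```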
Example of a black box object:
{
"title": "User object",
"type": "object",
"description": "Enter object representing user",
"prefill": {
"name": "John Doe",
"email": "janedoe@gmail.com"
},
"editor": "json"
}
Rendered input:
Properties:
Property | Value | Required | Description |
---|---|---|---|
editor | One of the supported object editors, for example json or proxy | Yes | UI editor used for input. |
patternKey | String | No | Regular expression that will be used to validate the keys of the object. |
patternValue | String | No | Regular expression that will be used to validate the values of object. |
maxProperties | Integer | No | Maximum number of properties the object can have. |
minProperties | Integer | No | Minimum number of properties the object can have. |
nullable | Boolean | No | Specifies whether null is an allowed value. |
Array
Example of request list sources configuration:
{
"title": "Start URLs",
"type": "array",
"description": "URLs to start with",
"prefill": [{ "url": "https://apify.com" }],
"editor": "requestListSources"
}
Rendered input:
Example of an array:
{
"title": "Colors",
"type": "array",
"description": "Enter colors you know",
"prefill": ["Red", "White"],
"editor": "json"
}
Rendered input:
Properties:
Property | Value | Required | Description |
---|---|---|---|
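editor | One of the supported array editors, for example json, requestListSources, pseudoUrls, globs, keyValue, stringList, or select | Yes | UI editor used for input. |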
placeholderKey | String | No | Placeholder displayed for key field when no value is specified. Works only with keyValue editor. |
placeholderValue | String | No | Placeholder displayed in value field when no value is provided. Works only with keyValue and stringList editors. |
patternKey | String | No | Regular expression that will be used to validate the keys of items in the array. Works only with keyValue editor. |
patternValue | String | No | Regular expression that will be used to validate the values of items in the array. Works only with keyValue and stringList editors. |
maxItems | Integer | No | Maximum number of items the array can contain. |
minItems | Integer | No | Minimum number of items the array can contain. |
uniqueItems | Boolean | No | Specifies whether the array should contain only unique values. |
nullable | Boolean | No | Specifies whether null is an allowed value. |
items | object | No | Specifies format of the items of the array, useful mainly for multiselect (see below) |
Usage of this field is based on the selected editor:
- requestListSources - value from this field can be used as input for the RequestList class from Crawlee.
- pseudoUrls - is intended to be used with a combination of the PseudoUrl class and the enqueueLinks() function from Crawlee.
Editor type requestListSources supports input in formats defined by the sources property of RequestListOptions.
Editor type globs maps to Crawlee's GlobInput used by the UrlPatternObject.
Editor type select allows the user to pick items from a select, providing multiple choices. Please check this example of how to define the multiselect field:
{
"title": "Multiselect field",
"description": "My multiselect field",
"type": "array",
"editor": "select",
"items": {
"type": "string",
"enum": ["value1", "value2", "value3"],
"enumTitles": ["Label of value1", "Label of value2", "Label of value3"]
}
}
To correctly define options for multiselect, you need to define the items property and then provide values and (optionally) labels in enum and enumTitles properties.
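The relationship between enum and enumTitles is positional: the first title labels the first value, and so on. The Python sketch below illustrates that pairing with a hypothetical helper, selected_labels, which maps the values an Actor receives back to their labels and rejects values outside enum. It is an illustration only, not how the platform renders or validates the field.

```python
def selected_labels(field: dict, selected: list[str]) -> list[str]:
    """Map selected enum values to their labels, rejecting unknown values."""
    items = field["items"]
    values = items["enum"]
    # enumTitles is optional; fall back to the raw values as labels.
    titles = items.get("enumTitles", values)
    labels = dict(zip(values, titles))

    unknown = [value for value in selected if value not in labels]
    if unknown:
        raise ValueError(f"values not listed in enum: {unknown}")
    return [labels[value] for value in selected]


multiselect_field = {
    "title": "Multiselect field",
    "description": "My multiselect field",
    "type": "array",
    "editor": "select",
    "items": {
        "type": "string",
        "enum": ["value1", "value2", "value3"],
        "enumTitles": ["Label of value1", "Label of value2", "Label of value3"],
    },
}
print(selected_labels(multiselect_field, ["value3", "value1"]))
```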
Resource type
Resource type identifies what kind of Apify Platform object is referred to in the input field. For example, the Key-value store resource type can be referred to using a string ID. Currently, it supports storage resources only, allowing the reference of a Dataset, Key-Value Store or Request Queue.
For Actor developers, the resource input value is a string representing the storage ID.
The type of the property is either string or array. In case of array (for multiple resources) the return value is an array of IDs.
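Because the value is either a single ID or an array of IDs, Actor code that supports both shapes may want to normalize it first. A minimal Python sketch follows; the helper name resource_ids and the sample IDs are hypothetical.

```python
from typing import Optional, Union


def resource_ids(value: Optional[Union[str, list[str]]]) -> list[str]:
    """Normalize a resource input value to a list of storage IDs."""
    if value is None:
        # The field was left empty.
        return []
    if isinstance(value, str):
        # type: string yields a single storage ID.
        return [value]
    # type: array yields an array of storage IDs.
    return list(value)


print(resource_ids("dataset-id-1"))
print(resource_ids(["dataset-id-1", "dataset-id-2"]))
```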
In the user interface, a picker is provided for easy selection, where users can search and choose from their own storages or those they have access to.
Example of a Dataset input:
{
"title": "Dataset",
"type": "string",
"description": "Select a dataset",
"resourceType": "dataset"
}
Rendered input:
The returned value is a resource reference. In this example, it's the dataset ID, as can be seen in the JSON tab:
Example of multiple datasets input:
{
"title": "Datasets",
"type": "array",
"description": "Select multiple datasets",
"resourceType": "dataset"
}
Rendered input:
Properties:
Property | Value | Required | Description |
---|---|---|---|
type | One of string, array | Yes | Specifies the type of input: string for a single value or array for multiple values. |
editor | One of the supported editors | No | Visual editor used for the input field. Defaults to resourcePicker. |
resourceType | One of the supported resource types, for example dataset | Yes | Type of Apify Platform resource. |
minItems | Integer | No | Minimum number of items the array can contain. Only for type: array |
maxItems | Integer | No | Maximum number of items the array can contain. Only for type: array |