DatasetClient

Sub-client for manipulating a single dataset.

Hierarchy

ResourceClient
- DatasetClient

Methods

__init__

__init__(*, base_url, root_client, http_client, resource_id, resource_path, params): None

Overrides ActorJobBaseClient.__init__
Initialize a new instance.
Parameters
- base_url: str (keyword-only)
  Base URL of the API server.
- root_client: ApifyClient (keyword-only)
  The ApifyClient instance under which this resource client exists.
- http_client: HTTPClient (keyword-only)
  The HTTPClient instance to be used in this client.
- resource_id: str | None = None (optional, keyword-only)
  ID of the manipulated resource, in case of a single-resource client.
- resource_path: str (keyword-only)
  Path to the resource's endpoint on the API server.
- params: dict | None = None (optional, keyword-only)
  Parameters to include in all requests from this client.
Returns None

create_items_public_url

create_items_public_url(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, flatten, view, expires_in_secs): str

Generate a URL that can be used to access dataset items.

If the client has permission to access the dataset's URL signing key, the URL will include a signature to verify its authenticity.

You can optionally control how long the signed URL should be valid using the expires_in_secs option. This value sets the expiration duration in seconds from the time the URL is generated. If not provided, the URL will not expire.

Any other options (like limit or offset) will be included as query parameters in the URL.
Parameters
- offset: int | None = None (optional, keyword-only)
- limit: int | None = None (optional, keyword-only)
- clean: bool | None = None (optional, keyword-only)
- desc: bool | None = None (optional, keyword-only)
- fields: list[str] | None = None (optional, keyword-only)
- omit: list[str] | None = None (optional, keyword-only)
- unwind: list[str] | None = None (optional, keyword-only)
- skip_empty: bool | None = None (optional, keyword-only)
- skip_hidden: bool | None = None (optional, keyword-only)
- flatten: list[str] | None = None (optional, keyword-only)
- view: str | None = None (optional, keyword-only)
- expires_in_secs: int | None = None (optional, keyword-only)
Returns str
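
As a minimal sketch of the options described above, the helper below requests a signed URL that is limited to a number of clean items and expires after a given number of hours. The function name build_share_url and its parameters are hypothetical; dataset_client stands for any object exposing create_items_public_url with the documented keyword arguments.

```python
from typing import Any


def build_share_url(dataset_client: Any, *, max_items: int, valid_hours: int) -> str:
    """Return a public URL for at most `max_items` clean items (hypothetical helper)."""
    return dataset_client.create_items_public_url(
        limit=max_items,
        clean=True,
        # The expiration is given in seconds from the moment the URL is generated.
        expires_in_secs=valid_hours * 3600,
    )
```

Omitting expires_in_secs would, per the description above, produce a URL that does not expire.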

delete

delete(): None

Delete the dataset.

https://docs.apify.com/api/v2#/reference/datasets/dataset/delete-dataset
Returns None

download_items

download_items(*, item_format, offset, limit, desc, clean, bom, delimiter, fields, omit, unwind, skip_empty, skip_header_row, skip_hidden, xml_root, xml_row, flatten): bytes

Get the items in the dataset as raw bytes.

Deprecated: this method is an alias of get_items_as_bytes and will be removed in a future version. Use get_items_as_bytes instead.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items
Parameters
- item_format: str = 'json' (optional, keyword-only)
  Format of the results. Possible values are json, jsonl, csv, html, xlsx, xml, and rss. The default value is json.
- offset: int | None = None (optional, keyword-only)
  Number of items to skip at the start. The default value is 0.
- limit: int | None = None (optional, keyword-only)
  Maximum number of items to return. By default, there is no limit.
- desc: bool | None = None (optional, keyword-only)
  By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean: bool | None = None (optional, keyword-only)
  If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Because some items might be skipped, the result might contain fewer items than the limit value.
- bom: bool | None = None (optional, keyword-only)
  All text responses are encoded in UTF-8. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html, and rss files are not. To override this default, pass bom=True to include the BOM or bom=False to skip it.
- delimiter: str | None = None (optional, keyword-only)
  A delimiter character for CSV files. The default delimiter is a comma (,).
- fields: list[str] | None = None (optional, keyword-only)
  A list of fields to pick from the items; only these fields remain in the resulting records. The fields in the output items are ordered as specified in the fields parameter, so you can use this feature to fix the output format.
- omit: list[str] | None = None (optional, keyword-only)
  A list of fields to omit from the items.
- unwind: list[str] | None = None (optional, keyword-only)
  A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as is. Note that unwound items ignore the desc parameter.
- skip_empty: bool | None = None (optional, keyword-only)
  If True, empty items are skipped from the output. Note that the results might then contain fewer items than the limit value.
- skip_header_row: bool | None = None (optional, keyword-only)
  If True, the header row in the csv format is skipped.
- skip_hidden: bool | None = None (optional, keyword-only)
  If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.
- xml_root: str | None = None (optional, keyword-only)
  Overrides the default root element name of the xml output. By default, the root element is items.
- xml_row: str | None = None (optional, keyword-only)
  Overrides the default element name that wraps each page or page function result object in the xml output. By default, the element name is item.
- flatten: list[str] | None = None (optional, keyword-only)
  A list of fields that should be flattened.
Returns bytes

get

get(): dict | None

Retrieve the dataset.

https://docs.apify.com/api/v2#/reference/datasets/dataset/get-dataset
Returns dict | None
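
Because get can return None instead of a dict, callers usually need a guard before reading fields. The sketch below shows one way to do that; the helper name get_dataset_field is hypothetical, and the reading that None means "no dataset was returned" is an assumption, since the reference only states the return type.

```python
from typing import Any


def get_dataset_field(dataset_client: Any, key: str, default: Any = None) -> Any:
    """Read one field of the dataset object, tolerating a None result (hypothetical helper)."""
    info = dataset_client.get()
    if info is None:
        # Assumption: None means no dataset object was returned for this client.
        return default
    return info.get(key, default)
```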

get_items_as_bytes

get_items_as_bytes(*, item_format, offset, limit, desc, clean, bom, delimiter, fields, omit, unwind, skip_empty, skip_header_row, skip_hidden, xml_root, xml_row, flatten): bytes

Get the items in the dataset as raw bytes.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items
Parameters
- item_format: str = 'json' (optional, keyword-only)
  Format of the results. Possible values are json, jsonl, csv, html, xlsx, xml, and rss. The default value is json.
- offset: int | None = None (optional, keyword-only)
  Number of items to skip at the start. The default value is 0.
- limit: int | None = None (optional, keyword-only)
  Maximum number of items to return. By default, there is no limit.
- desc: bool | None = None (optional, keyword-only)
  By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean: bool | None = None (optional, keyword-only)
  If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Because some items might be skipped, the result might contain fewer items than the limit value.
- bom: bool | None = None (optional, keyword-only)
  All text responses are encoded in UTF-8. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html, and rss files are not. To override this default, pass bom=True to include the BOM or bom=False to skip it.
- delimiter: str | None = None (optional, keyword-only)
  A delimiter character for CSV files. The default delimiter is a comma (,).
- fields: list[str] | None = None (optional, keyword-only)
  A list of fields to pick from the items; only these fields remain in the resulting records. The fields in the output items are ordered as specified in the fields parameter, so you can use this feature to fix the output format.
- omit: list[str] | None = None (optional, keyword-only)
  A list of fields to omit from the items.
- unwind: list[str] | None = None (optional, keyword-only)
  A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as is. Note that unwound items ignore the desc parameter.
- skip_empty: bool | None = None (optional, keyword-only)
  If True, empty items are skipped from the output. Note that the results might then contain fewer items than the limit value.
- skip_header_row: bool | None = None (optional, keyword-only)
  If True, the header row in the csv format is skipped.
- skip_hidden: bool | None = None (optional, keyword-only)
  If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.
- xml_root: str | None = None (optional, keyword-only)
  Overrides the default root element name of the xml output. By default, the root element is items.
- xml_row: str | None = None (optional, keyword-only)
  Overrides the default element name that wraps each page or page function result object in the xml output. By default, the element name is item.
- flatten: list[str] | None = None (optional, keyword-only)
  A list of fields that should be flattened.
Returns bytes
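
To illustrate how the format options combine, here is a small sketch that exports selected columns as a semicolon-delimited CSV without a BOM and writes the raw bytes to disk. The helper name export_csv and its arguments are hypothetical; dataset_client is any object exposing get_items_as_bytes as documented above.

```python
from pathlib import Path
from typing import Any


def export_csv(dataset_client: Any, target: Path, columns: list[str]) -> int:
    """Write the dataset as CSV to `target` and return the number of bytes written."""
    data = dataset_client.get_items_as_bytes(
        item_format="csv",
        fields=columns,   # also fixes the column order of the output
        delimiter=";",    # the default would be a comma
        bom=False,        # csv output carries a BOM unless told otherwise
    )
    target.write_bytes(data)
    return len(data)
```

The method returns the whole payload in memory, so for very large datasets the streaming variant may be the better fit.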

get_statistics

get_statistics(): dict | None

Get the dataset statistics.

https://docs.apify.com/api/v2#tag/DatasetsStatistics/operation/dataset_statistics_get
Returns dict | None

iterate_items

iterate_items(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden): Iterator[dict]

Iterate over the items in the dataset.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items
Parameters
- offset: int = 0 (optional, keyword-only)
  Number of items to skip at the start. The default value is 0.
- limit: int | None = None (optional, keyword-only)
  Maximum number of items to return. By default, there is no limit.
- clean: bool | None = None (optional, keyword-only)
  If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Because some items might be skipped, the result might contain fewer items than the limit value.
- desc: bool | None = None (optional, keyword-only)
  By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- fields: list[str] | None = None (optional, keyword-only)
  A list of fields to pick from the items; only these fields remain in the resulting records. The fields in the output items are ordered as specified in the fields parameter, so you can use this feature to fix the output format.
- omit: list[str] | None = None (optional, keyword-only)
  A list of fields to omit from the items.
- unwind: list[str] | None = None (optional, keyword-only)
  A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as is. Note that unwound items ignore the desc parameter.
- skip_empty: bool | None = None (optional, keyword-only)
  If True, empty items are skipped from the output. Note that the results might then contain fewer items than the limit value.
- skip_hidden: bool | None = None (optional, keyword-only)
  If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.
Returns Iterator[dict]
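
Since iterate_items yields one dict per item, it pairs naturally with a plain for loop. The following sketch tallies how often each value of a single field occurs; the helper name count_by_field is hypothetical, and dataset_client is any object exposing iterate_items with the keyword arguments listed above.

```python
from collections import Counter
from typing import Any


def count_by_field(dataset_client: Any, field: str) -> Counter:
    """Count occurrences of each value of `field` across the dataset items."""
    counts: Counter = Counter()
    # Requesting only the needed field keeps each yielded record small.
    for item in dataset_client.iterate_items(fields=[field], skip_empty=True):
        counts[item.get(field)] += 1
    return counts
```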

list_items

list_items(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, flatten, view): ListPage

List the items of the dataset.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items
Parameters
- offset: int | None = None (optional, keyword-only)
  Number of items to skip at the start. The default value is 0.
- limit: int | None = None (optional, keyword-only)
  Maximum number of items to return. By default, there is no limit.
- clean: bool | None = None (optional, keyword-only)
  If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Because some items might be skipped, the result might contain fewer items than the limit value.
- desc: bool | None = None (optional, keyword-only)
  By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- fields: list[str] | None = None (optional, keyword-only)
  A list of fields to pick from the items; only these fields remain in the resulting records. The fields in the output items are ordered as specified in the fields parameter, so you can use this feature to fix the output format.
- omit: list[str] | None = None (optional, keyword-only)
  A list of fields to omit from the items.
- unwind: list[str] | None = None (optional, keyword-only)
  A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as is. Note that unwound items ignore the desc parameter.
- skip_empty: bool | None = None (optional, keyword-only)
  If True, empty items are skipped from the output. Note that the results might then contain fewer items than the limit value.
- skip_hidden: bool | None = None (optional, keyword-only)
  If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.
- flatten: list[str] | None = None (optional, keyword-only)
  A list of fields that should be flattened.
- view: str | None = None (optional, keyword-only)
  Name of the dataset view to be used.
Returns ListPage
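
The offset and limit parameters can be combined to page through a dataset manually. The sketch below is one possible loop; the helper name iter_pages is hypothetical, and it assumes that the returned ListPage exposes the page's records as an items list, which this reference does not spell out.

```python
from typing import Any, Iterator


def iter_pages(dataset_client: Any, page_size: int = 1000) -> Iterator[list]:
    """Yield successive lists of items until an empty page is returned."""
    offset = 0
    while True:
        page = dataset_client.list_items(offset=offset, limit=page_size)
        # Assumption: ListPage has an `items` attribute holding the records.
        if not page.items:
            break
        yield page.items
        offset += len(page.items)
```

The loop advances by the number of items actually received. With clean or skip_empty enabled a page may hold fewer items than limit, so this simple offset arithmetic is only a sketch for unfiltered listings.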

push_items

push_items(items): None

Push items to the dataset.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/put-items
Parameters
- items: JSONSerializable
  The items to push to the dataset. Either a stringified JSON, a dictionary, or a list of strings or dictionaries.
Returns None
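
When many records need to be stored, it can be convenient to send them in several smaller calls. The sketch below splits a sequence of dictionaries into batches and pushes each one; the helper name push_in_batches and the default batch size are hypothetical choices, not limits stated by this reference.

```python
from typing import Any, Sequence


def push_in_batches(dataset_client: Any, items: Sequence[dict], batch_size: int = 500) -> int:
    """Push `items` in chunks of `batch_size` and return the number of calls made."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    calls = 0
    for start in range(0, len(items), batch_size):
        # A list of dictionaries is one of the accepted shapes for `items`.
        dataset_client.push_items(list(items[start:start + batch_size]))
        calls += 1
    return calls
```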

stream_items

stream_items(*, item_format, offset, limit, desc, clean, bom, delimiter, fields, omit, unwind, skip_empty, skip_header_row, skip_hidden, xml_root, xml_row): Iterator[impit.Response]

Retrieve the items in the dataset as a stream.

https://docs.apify.com/api/v2#/reference/datasets/item-collection/get-items
Parameters
- item_format: str = 'json' (optional, keyword-only)
  Format of the results. Possible values are json, jsonl, csv, html, xlsx, xml, and rss. The default value is json.
- offset: int | None = None (optional, keyword-only)
  Number of items to skip at the start. The default value is 0.
- limit: int | None = None (optional, keyword-only)
  Maximum number of items to return. By default, there is no limit.
- desc: bool | None = None (optional, keyword-only)
  By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.
- clean: bool | None = None (optional, keyword-only)
  If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Because some items might be skipped, the result might contain fewer items than the limit value.
- bom: bool | None = None (optional, keyword-only)
  All text responses are encoded in UTF-8. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html, and rss files are not. To override this default, pass bom=True to include the BOM or bom=False to skip it.
- delimiter: str | None = None (optional, keyword-only)
  A delimiter character for CSV files. The default delimiter is a comma (,).
- fields: list[str] | None = None (optional, keyword-only)
  A list of fields to pick from the items; only these fields remain in the resulting records. The fields in the output items are ordered as specified in the fields parameter, so you can use this feature to fix the output format.
- omit: list[str] | None = None (optional, keyword-only)
  A list of fields to omit from the items.
- unwind: list[str] | None = None (optional, keyword-only)
  A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as is. Note that unwound items ignore the desc parameter.
- skip_empty: bool | None = None (optional, keyword-only)
  If True, empty items are skipped from the output. Note that the results might then contain fewer items than the limit value.
- skip_header_row: bool | None = None (optional, keyword-only)
  If True, the header row in the csv format is skipped.
- skip_hidden: bool | None = None (optional, keyword-only)
  If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.
- xml_root: str | None = None (optional, keyword-only)
  Overrides the default root element name of the xml output. By default, the root element is items.
- xml_row: str | None = None (optional, keyword-only)
  Overrides the default element name that wraps each page or page function result object in the xml output. By default, the element name is item.
Returns Iterator[impit.Response]
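
A rough sketch of consuming the stream follows. It rests on two assumptions that this reference does not confirm: that stream_items is used as a context manager (in a with statement), and that the yielded response offers an httpx-style iter_bytes() method for reading chunks. The helper name stream_to_file is hypothetical.

```python
from typing import Any, BinaryIO


def stream_to_file(dataset_client: Any, sink: BinaryIO) -> int:
    """Copy the dataset as JSON Lines into `sink` chunk by chunk; return bytes copied."""
    written = 0
    # Assumption: stream_items works as a context manager that closes the response.
    with dataset_client.stream_items(item_format="jsonl") as response:
        # Assumption: the response exposes iter_bytes() yielding bytes chunks.
        for chunk in response.iter_bytes():
            sink.write(chunk)
            written += len(chunk)
    return written
```

Compared with get_items_as_bytes, this pattern avoids holding the full payload in memory at once.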

update

update(*, name, general_access): dict

Update the dataset with specified fields.

https://docs.apify.com/api/v2#/reference/datasets/dataset/update-dataset
Parameters
- name: str | None = None (optional, keyword-only)
  The new name for the dataset.
- general_access: StorageGeneralAccess | None = None (optional, keyword-only)
  Determines how others can access the dataset.
Returns dict

Properties

http_client

http_client: HTTPClient | HTTPClientAsync

params

params: dict

resource_id

resource_id: str | None

root_client

root_client: ApifyClient | ApifyClientAsync

url

url: str
