DatasetClientAsync

Sub-client for managing a specific dataset.

Provides methods to manage a specific dataset, e.g. get it, update it, or download its items. Obtain an instance via an appropriate method on the ApifyClientAsync class.
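
As a minimal sketch of obtaining this sub-client: the snippet below assumes the `apify_client` package is installed and that `ApifyClientAsync` exposes a `dataset(dataset_id)` accessor. The token and dataset ID are placeholders, and the function name `main` is only illustrative.

```python
import asyncio


async def main(token: str, dataset_id: str) -> None:
    # Imported lazily so this module still loads where the package is absent.
    from apify_client import ApifyClientAsync

    client = ApifyClientAsync(token=token)
    # Assumption: dataset() returns a DatasetClientAsync bound to this ID.
    dataset_client = client.dataset(dataset_id)

    # get() fetches the dataset's metadata; print it to inspect the result.
    print(await dataset_client.get())


# Run with real credentials, for example:
# asyncio.run(main("MY-APIFY-TOKEN", "MY-DATASET-ID"))
```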

Methods

__init__

  • __init__(*, base_url, public_base_url, http_client, resource_path, client_registry, resource_id, params): None
  • Initialize the resource client.


    Parameters

    • keyword-only base_url: str

      API base URL.

    • keyword-only public_base_url: str

      Public CDN base URL.

    • keyword-only http_client: HttpClientAsync

      HTTP client for making requests.

    • keyword-only resource_path: str

      Resource endpoint path (e.g., 'actors', 'datasets').

    • keyword-only client_registry: ClientRegistryAsync

      Bundle of client classes for dependency injection.

    • optional keyword-only resource_id: str | None = None

      Optional resource ID for single-resource clients.

    • optional keyword-only params: dict | None = None

      Optional default parameters for all requests.

    Returns None

create_items_public_url

  • async create_items_public_url(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, flatten, view, expires_in): str
  • Generate a URL that can be used to access dataset items.

    If the client has permission to access the dataset's URL signing key, the URL will include a signature to verify its authenticity.

    You can optionally control how long the signed URL should be valid using the expires_in option. This value sets the expiration duration from the time the URL is generated. If not provided, the URL will not expire.

    Any other options (like limit or offset) will be included as query parameters in the URL.


    Parameters

    • optional keyword-only offset: int | None = None
    • optional keyword-only limit: int | None = None
    • optional keyword-only clean: bool | None = None
    • optional keyword-only desc: bool | None = None
    • optional keyword-only fields: list[str] | None = None
    • optional keyword-only omit: list[str] | None = None
    • optional keyword-only unwind: list[str] | None = None
    • optional keyword-only skip_empty: bool | None = None
    • optional keyword-only skip_hidden: bool | None = None
    • optional keyword-only flatten: list[str] | None = None
    • optional keyword-only view: str | None = None
    • optional keyword-only expires_in: timedelta | None = None

    Returns str

    The public dataset items URL.
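
A short usage sketch for `create_items_public_url`, assuming `dataset_client` is an already obtained `DatasetClientAsync`. The helper name `share_preview_url` and the field names `title` and `url` are hypothetical; only the keyword arguments come from the signature above.

```python
from datetime import timedelta


async def share_preview_url(dataset_client) -> str:
    """Return a public URL for the first 10 cleaned items, valid for one hour."""
    return await dataset_client.create_items_public_url(
        limit=10,                       # passed on as a query parameter
        clean=True,                     # passed on as a query parameter
        fields=["title", "url"],        # hypothetical field names
        expires_in=timedelta(hours=1),  # expiration counted from generation time
    )
```

Leaving out `expires_in` would, per the description above, produce a URL that does not expire.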

delete

  • async delete(): None

get

get_items_as_bytes

  • async get_items_as_bytes(*, item_format, offset, limit, desc, clean, bom, delimiter, fields, omit, unwind, skip_empty, skip_header_row, skip_hidden, xml_root, xml_row, flatten, signature): bytes

  • Parameters

    • optional keyword-only item_format: str = 'json'

      Format of the results. Possible values are json, jsonl, csv, html, xlsx, xml and rss. The default value is json.

    • optional keyword-only offset: int | None = None

      Number of items to skip at the start. The default value is 0.

    • optional keyword-only limit: int | None = None

      Maximum number of items to return. By default there is no limit.

    • optional keyword-only desc: bool | None = None

      By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.

    • optional keyword-only clean: bool | None = None

      If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Note that because some items might be skipped from the output, the result might contain fewer items than the limit value.

    • optional keyword-only bom: bool | None = None

      All text responses are encoded in UTF-8. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html and rss files are not. To override this default behavior, pass bom=True to include the BOM or bom=False to skip it.

    • optional keyword-only delimiter: str | None = None

      A delimiter character for CSV files. The default delimiter is a comma (,).

    • optional keyword-only fields: list[str] | None = None

      A list of fields to pick from the items; only these fields will remain in the resulting record objects. Note that the fields in the output items are sorted in the same order as they are specified in the fields parameter. You can use this feature to effectively fix the output format.

    • optional keyword-only omit: list[str] | None = None

      A list of fields to omit from the items.

    • optional keyword-only unwind: list[str] | None = None

      A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as it is. Note that the unwound items ignore the desc parameter.

    • optional keyword-only skip_empty: bool | None = None

      If True, empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.

    • optional keyword-only skip_header_row: bool | None = None

      If True, the header row in the csv format is skipped.

    • optional keyword-only skip_hidden: bool | None = None

      If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.

    • optional keyword-only xml_root: str | None = None

      Overrides the default root element name of the xml output. By default, the root element is items.

    • optional keyword-only xml_row: str | None = None

      Overrides the default element name that wraps each page or page function result object in the xml output. By default, the element name is item.

    • optional keyword-only flatten: list[str] | None = None

      A list of fields that should be flattened.

    • optional keyword-only signature: str | None = None

      Signature used to access the items.

    Returns bytes

    The dataset items as raw bytes.
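
To illustrate `get_items_as_bytes` with the csv format, here is a minimal sketch that downloads the raw bytes and parses them with the standard library. It assumes `dataset_client` is an already obtained `DatasetClientAsync`; the helper name `download_csv_rows` is hypothetical. Because csv output is prefixed with a BOM by default, the bytes are decoded with `utf-8-sig`, which drops the BOM when it is present and is harmless when it is not.

```python
import csv
import io


async def download_csv_rows(dataset_client, *, limit=None):
    """Download dataset items as CSV bytes and parse them into dicts."""
    raw = await dataset_client.get_items_as_bytes(item_format="csv", limit=limit)
    # "utf-8-sig" strips a leading UTF-8 BOM if the response contains one.
    text = raw.decode("utf-8-sig")
    return list(csv.DictReader(io.StringIO(text)))
```

Every value comes back as a string, since CSV carries no type information; request `item_format="json"` instead when types matter.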

get_statistics

iterate_items

  • async iterate_items(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, signature): AsyncIterator[dict]

  • Parameters

    • optional keyword-only offset: int = 0

      Number of items to skip at the start. The default value is 0.

    • optional keyword-only limit: int | None = None

      Maximum number of items to return. By default there is no limit.

    • optional keyword-only clean: bool | None = None

      If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Note that because some items might be skipped from the output, the result might contain fewer items than the limit value.

    • optional keyword-only desc: bool | None = None

      By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.

    • optional keyword-only fields: list[str] | None = None

      A list of fields to pick from the items; only these fields will remain in the resulting record objects. Note that the fields in the output items are sorted in the same order as they are specified in the fields parameter. You can use this feature to effectively fix the output format.

    • optional keyword-only omit: list[str] | None = None

      A list of fields to omit from the items.

    • optional keyword-only unwind: list[str] | None = None

      A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as it is. Note that the unwound items ignore the desc parameter.

    • optional keyword-only skip_empty: bool | None = None

      If True, empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.

    • optional keyword-only skip_hidden: bool | None = None

      If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.

    • optional keyword-only signature: str | None = None

      Signature used to access the items.

    Returns AsyncIterator[dict]
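
A minimal usage sketch for `iterate_items`, assuming `dataset_client` is an already obtained `DatasetClientAsync`. The helper name `collect_titles` and the `title` field are hypothetical. Since the method returns an async iterator, it is consumed with `async for` rather than awaited.

```python
async def collect_titles(dataset_client, max_items: int = 100) -> list[str]:
    """Collect the (hypothetical) "title" field from up to max_items items."""
    titles = []
    async for item in dataset_client.iterate_items(
        limit=max_items,
        clean=True,        # skip empty items and hidden (#-prefixed) fields
        fields=["title"],  # keep only this field in each record
    ):
        title = item.get("title")
        if title is not None:
            titles.append(title)
    return titles
```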

list_items

  • async list_items(*, offset, limit, clean, desc, fields, omit, unwind, skip_empty, skip_hidden, flatten, view, signature): DatasetItemsPage

  • Parameters

    • optional keyword-only offset: int | None = None

      Number of items to skip at the start. The default value is 0.

    • optional keyword-only limit: int | None = None

      Maximum number of items to return. By default there is no limit.

    • optional keyword-only clean: bool | None = None

      If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Note that because some items might be skipped from the output, the result might contain fewer items than the limit value.

    • optional keyword-only desc: bool | None = None

      By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.

    • optional keyword-only fields: list[str] | None = None

      A list of fields to pick from the items; only these fields will remain in the resulting record objects. Note that the fields in the output items are sorted in the same order as they are specified in the fields parameter. You can use this feature to effectively fix the output format.

    • optional keyword-only omit: list[str] | None = None

      A list of fields to omit from the items.

    • optional keyword-only unwind: list[str] | None = None

      A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as it is. Note that the unwound items ignore the desc parameter.

    • optional keyword-only skip_empty: bool | None = None

      If True, empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.

    • optional keyword-only skip_hidden: bool | None = None

      If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.

    • optional keyword-only flatten: list[str] | None = None

      A list of fields that should be flattened.

    • optional keyword-only view: str | None = None

      Name of the dataset view to be used.

    • optional keyword-only signature: str | None = None

      Signature used to access the items.

    Returns DatasetItemsPage

    A page of the list of dataset items according to the specified filters.
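
The `offset` and `limit` parameters lend themselves to page-by-page retrieval. The sketch below assumes `dataset_client` is an already obtained `DatasetClientAsync` and that the returned `DatasetItemsPage` exposes its records as an `items` attribute, which this page does not spell out. The helper name `fetch_all_items` is hypothetical.

```python
async def fetch_all_items(dataset_client, page_size: int = 1000) -> list[dict]:
    """Fetch every item by requesting consecutive pages of page_size items."""
    items: list[dict] = []
    offset = 0
    while True:
        page = await dataset_client.list_items(offset=offset, limit=page_size)
        # Assumption: the page object exposes its records as `page.items`.
        if not page.items:
            break
        items.extend(page.items)
        offset += len(page.items)
    return items
```

The sketch deliberately leaves out `clean` and `skip_empty`: with those set, a page might contain fewer items than `limit`, so advancing the offset by the number of returned items would no longer be reliable.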

push_items

  • async push_items(items): None

stream_items

  • async stream_items(*, item_format, offset, limit, desc, clean, bom, delimiter, fields, omit, unwind, skip_empty, skip_header_row, skip_hidden, xml_root, xml_row, signature): AsyncIterator[impit.Response]

  • Parameters

    • optional keyword-only item_format: str = 'json'

      Format of the results. Possible values are json, jsonl, csv, html, xlsx, xml and rss. The default value is json.

    • optional keyword-only offset: int | None = None

      Number of items to skip at the start. The default value is 0.

    • optional keyword-only limit: int | None = None

      Maximum number of items to return. By default there is no limit.

    • optional keyword-only desc: bool | None = None

      By default, results are returned in the same order as they were stored. To reverse the order, set this parameter to True.

    • optional keyword-only clean: bool | None = None

      If True, returns only non-empty items and skips hidden fields (i.e. fields starting with the # character). The clean parameter is a shortcut for skip_hidden=True and skip_empty=True. Note that because some items might be skipped from the output, the result might contain fewer items than the limit value.

    • optional keyword-only bom: bool | None = None

      All text responses are encoded in UTF-8. By default, csv files are prefixed with the UTF-8 Byte Order Mark (BOM), while json, jsonl, xml, html and rss files are not. To override this default behavior, pass bom=True to include the BOM or bom=False to skip it.

    • optional keyword-only delimiter: str | None = None

      A delimiter character for CSV files. The default delimiter is a comma (,).

    • optional keyword-only fields: list[str] | None = None

      A list of fields to pick from the items; only these fields will remain in the resulting record objects. Note that the fields in the output items are sorted in the same order as they are specified in the fields parameter. You can use this feature to effectively fix the output format.

    • optional keyword-only omit: list[str] | None = None

      A list of fields to omit from the items.

    • optional keyword-only unwind: list[str] | None = None

      A list of fields to unwind, in the order in which they should be processed. Each field should be either an array or an object. If the field is an array, every element of the array becomes a separate record and is merged with the parent object. If the unwound field is an object, it is merged with the parent object. If the unwound field is missing, or its value is neither an array nor an object and therefore cannot be merged with a parent object, the item is preserved as it is. Note that the unwound items ignore the desc parameter.

    • optional keyword-only skip_empty: bool | None = None

      If True, empty items are skipped from the output. Note that if used, the results might contain fewer items than the limit value.

    • optional keyword-only skip_header_row: bool | None = None

      If True, the header row in the csv format is skipped.

    • optional keyword-only skip_hidden: bool | None = None

      If True, hidden fields (i.e. fields starting with the # character) are skipped from the output.

    • optional keyword-only xml_root: str | None = None

      Overrides the default root element name of the xml output. By default, the root element is items.

    • optional keyword-only xml_row: str | None = None

      Overrides the default element name that wraps each page or page function result object in the xml output. By default, the element name is item.

    • optional keyword-only signature: str | None = None

      Signature used to access the items.

    Returns AsyncIterator[impit.Response]

    The dataset items as a context-managed streaming Response.
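
Because the response is context-managed, `stream_items` is entered with `async with`. The following sketch writes a CSV export to disk chunk by chunk. It assumes `dataset_client` is an already obtained `DatasetClientAsync` and, importantly, that the streamed response offers an httpx-style `aiter_bytes()` method; this page does not document the response's methods, so treat that call as an assumption. The helper name `save_dataset_as_csv` is hypothetical.

```python
async def save_dataset_as_csv(dataset_client, path: str) -> int:
    """Stream the dataset as CSV into a file and return the bytes written."""
    written = 0
    async with dataset_client.stream_items(item_format="csv") as response:
        with open(path, "wb") as out:
            # Assumption: the response exposes an async byte-chunk iterator.
            async for chunk in response.aiter_bytes():
                out.write(chunk)
                written += len(chunk)
    return written
```

Streaming keeps memory use bounded for large datasets, whereas `get_items_as_bytes` returns the whole payload at once. The file write here is a plain blocking call, which is acceptable for a sketch but may warrant an async file library in a busy event loop.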

update

  • async update(*, name, general_access): Dataset

Properties

resource_id

resource_id: str | None

Get the resource ID.