Dataset
Index
Methods
drop
Remove the dataset either from the Apify cloud storage or from the local directory.
Returns None
export_to
Save the entirety of the dataset's contents into one file within a key-value store.
Parameters
key: str
keyword-only to_key_value_store_id: str | None = None
keyword-only to_key_value_store_name: str | None = None
keyword-only content_type: str | None = None
Returns None
export_to_csv
Save the entirety of the dataset's contents into one CSV file within a key-value store.
Parameters
key: str
keyword-only from_dataset_id: str | None = None
keyword-only from_dataset_name: str | None = None
keyword-only to_key_value_store_id: str | None = None
keyword-only to_key_value_store_name: str | None = None
Returns None
export_to_json
Save the entirety of the dataset's contents into one JSON file within a key-value store.
Parameters
key: str
keyword-only from_dataset_id: str | None = None
keyword-only from_dataset_name: str | None = None
keyword-only to_key_value_store_id: str | None = None
keyword-only to_key_value_store_name: str | None = None
Returns None
get_data
Get items from the dataset.
Parameters
keyword-only offset: int | None = None
keyword-only limit: int | None = None
keyword-only clean: bool | None = None
keyword-only desc: bool | None = None
keyword-only fields: list[str] | None = None
keyword-only omit: list[str] | None = None
keyword-only unwind: str | None = None
keyword-only skip_empty: bool | None = None
keyword-only skip_hidden: bool | None = None
keyword-only flatten: list[str] | None = None
keyword-only view: str | None = None
Returns ListPage
get_info
Get an object containing general information about the dataset.
Returns dict | None
iterate_items
Iterate over the items in the dataset.
Parameters
keyword-only offset: int = 0
keyword-only limit: int | None = None
keyword-only clean: bool | None = None
keyword-only desc: bool | None = None
keyword-only fields: list[str] | None = None
keyword-only omit: list[str] | None = None
keyword-only unwind: str | None = None
keyword-only skip_empty: bool | None = None
keyword-only skip_hidden: bool | None = None
Returns AsyncIterator[dict]
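Because iterate_items returns an AsyncIterator[dict], it is consumed with an `async for` loop. The sketch below shows that pattern. The `collect_items` helper and the in-memory `FakeDataset` class are hypothetical and exist only so the snippet runs without the Apify SDK. The stand-in's handling of offset and limit (skip the first `offset` items, then yield at most `limit`) is an assumption, since the reference above does not describe those parameters.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List, Optional


class FakeDataset:
    """Hypothetical in-memory stand-in; NOT the real Dataset class."""

    def __init__(self, items: List[Dict]) -> None:
        self._items = items

    async def iterate_items(
        self, *, offset: int = 0, limit: Optional[int] = None
    ) -> AsyncIterator[Dict]:
        # Assumed semantics: skip `offset` items, then yield at most `limit`.
        end = None if limit is None else offset + limit
        for item in self._items[offset:end]:
            yield item


async def collect_items(
    dataset: Any, *, offset: int = 0, limit: Optional[int] = None
) -> List[Dict]:
    """Drain the async iterator into a list (hypothetical helper)."""
    return [item async for item in dataset.iterate_items(offset=offset, limit=limit)]


def demo() -> List[Dict]:
    """Run the helper against the stand-in with five small items."""
    dataset = FakeDataset([{"n": i} for i in range(5)])
    return asyncio.run(collect_items(dataset, offset=1, limit=2))
```

With a real dataset, the same loop would presumably be applied to the object returned by Actor.open_dataset(), inside an already running event loop rather than through `asyncio.run`.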
open
Open a dataset.

Datasets are used to store structured data where each object stored has the same attributes, such as online store products or real estate offers. The actual data is stored either on the local filesystem or in the Apify cloud.
Parameters
keyword-only id: str | None = None
keyword-only name: str | None = None
keyword-only force_cloud: bool = False
keyword-only config: Configuration | None = None
Returns Dataset
push_data
Store an object or an array of objects to the dataset.

The size of the data is limited by the receiving API, and therefore push_data() will only allow objects whose JSON representation is smaller than 9MB. When an array is passed, none of the included objects may be larger than 9MB, but the array itself may be of any size.
Parameters
data: JSONSerializable
Returns None
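The per-object size rule for push_data() can be checked before pushing. The following is a minimal sketch, assuming that "9MB" means 9 * 1024 * 1024 bytes of UTF-8 encoded JSON. The exact byte limit and the helper names `json_size_bytes` and `split_by_size` are assumptions for illustration and are not part of the Dataset API.

```python
import json
from typing import Any, List, Tuple

# Assumption: "9MB" is read as 9 * 1024 * 1024 bytes; the reference gives no exact value.
ASSUMED_MAX_ITEM_BYTES = 9 * 1024 * 1024


def json_size_bytes(item: Any) -> int:
    """Size of the item's JSON representation when encoded as UTF-8."""
    return len(json.dumps(item, ensure_ascii=False).encode("utf-8"))


def split_by_size(
    items: List[Any], max_bytes: int = ASSUMED_MAX_ITEM_BYTES
) -> Tuple[List[Any], List[Any]]:
    """Separate items that fit under the per-item limit from those that do not."""
    accepted: List[Any] = []
    rejected: List[Any] = []
    for item in items:
        # Each object is checked on its own; the total size of the list is not limited.
        if json_size_bytes(item) < max_bytes:
            accepted.append(item)
        else:
            rejected.append(item)
    return accepted, rejected
```

Only the accepted list would then be passed to push_data(); the rejected items need to be trimmed or split by the caller first.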
The Dataset class represents a store for structured data where each object stored has the same attributes.

You can imagine it as a table, where each object is a row and its attributes are columns. Dataset is an append-only storage: you can only add new records to it, but you cannot modify or remove existing records. Typically it is used to store crawling results.

Do not instantiate this class directly; use the Actor.open_dataset() function instead.

Dataset stores its data either on local disk or in the Apify cloud, depending on whether the APIFY_LOCAL_STORAGE_DIR or APIFY_TOKEN environment variables are set.

If the APIFY_LOCAL_STORAGE_DIR environment variable is set, the data is stored in the local directory in the following files:

    {APIFY_LOCAL_STORAGE_DIR}/datasets/{DATASET_ID}/{INDEX}.json

Note that {DATASET_ID} is the name or ID of the dataset. The default dataset has the ID default, unless you override it by setting the APIFY_DEFAULT_DATASET_ID environment variable. Each dataset item is stored as a separate JSON file, where {INDEX} is a zero-based index of the item in the dataset.

If the APIFY_TOKEN environment variable is set but APIFY_LOCAL_STORAGE_DIR is not, the data is stored in the Apify Dataset cloud storage.
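The storage rules described for the Dataset class can be sketched in plain Python. This is an illustration under stated assumptions, not the SDK's implementation: the function names `resolve_backend` and `local_item_path` are hypothetical, the environment is passed in as a mapping to keep the example self-contained, and {INDEX} is written as a plain integer because the text does not specify any padding.

```python
from pathlib import PurePosixPath
from typing import Mapping, Optional


def resolve_backend(env: Mapping[str, str]) -> Optional[str]:
    """Return "local" or "cloud" following the environment-variable rule."""
    if env.get("APIFY_LOCAL_STORAGE_DIR"):
        return "local"  # the local directory takes precedence when it is set
    if env.get("APIFY_TOKEN"):
        return "cloud"  # token set, local directory not set
    return None  # neither variable set; this case is not covered by the text


def local_item_path(
    env: Mapping[str, str], index: int, dataset_id: Optional[str] = None
) -> PurePosixPath:
    """Build {APIFY_LOCAL_STORAGE_DIR}/datasets/{DATASET_ID}/{INDEX}.json."""
    # The default dataset ID is "default" unless APIFY_DEFAULT_DATASET_ID overrides it.
    dataset_id = dataset_id or env.get("APIFY_DEFAULT_DATASET_ID", "default")
    # Assumption: {INDEX} is a plain zero-based integer with no padding.
    root = PurePosixPath(env["APIFY_LOCAL_STORAGE_DIR"])
    return root / "datasets" / dataset_id / f"{index}.json"
```

For example, with APIFY_LOCAL_STORAGE_DIR set to a hypothetical /tmp/storage, the first item of the default dataset would map to /tmp/storage/datasets/default/0.json under these assumptions.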