Apify SDK
The Apify SDK is a toolkit for building Actors: serverless microservices that run on the Apify platform and can also run elsewhere. Apify provides first-class support for JavaScript/TypeScript and Python, but you can run any containerized code on the platform.
SDK for JavaScript
The Apify SDK for JavaScript is a toolkit for building Actors in JavaScript and TypeScript, on the Apify platform or elsewhere.
npx apify-cli create my-crawler
// The Apify SDK makes it easy to initialize the actor on the platform with the Actor.init() method,
// and to save the scraped data from your Actors to a dataset by simply using the Actor.pushData() method.

import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';

await Actor.init();
const crawler = new PlaywrightCrawler({
    async requestHandler({ request, page, enqueueLinks }) {
        const title = await page.title();
        console.log(`Title of ${request.loadedUrl} is '${title}'`);
        await Actor.pushData({ title, url: request.loadedUrl });
        await enqueueLinks();
    },
});
await crawler.run(['https://crawlee.dev']);
await Actor.exit();
SDK for Python
The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides Actor lifecycle management, local storage emulation, and Actor event handling.
apify create my-python-actor
# The Apify SDK makes it easy to read the Actor input with the Actor.get_input() method,
# and to save the scraped data from your Actors to a dataset by simply using the Actor.push_data() method.

import asyncio

from apify import Actor
from bs4 import BeautifulSoup
import requests

async def main():
    # Entering the context initializes the Actor; leaving it exits the Actor.
    async with Actor:
        actor_input = await Actor.get_input()
        # requests is a blocking client, which is acceptable for a single fetch.
        response = requests.get(actor_input['url'])
        soup = BeautifulSoup(response.content, 'html.parser')
        await Actor.push_data({'url': actor_input['url'], 'title': soup.title.string})

if __name__ == '__main__':
    asyncio.run(main())
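The Python example above reads actor_input['url'] directly, which fails if the Actor is started without input or without a 'url' field. A minimal sketch of a validation step, assuming the input is a JSON object with a single 'url' key as in the example. The helper name get_start_url is hypothetical and not part of the Apify SDK, and it uses only the standard library:

```python
from urllib.parse import urlparse


def get_start_url(actor_input):
    """Return a validated 'url' from the Actor input, or raise ValueError.

    Hypothetical helper for illustration; not part of the Apify SDK.
    """
    # The input may be missing entirely, so check its type first.
    if not isinstance(actor_input, dict):
        raise ValueError("Actor input is missing or is not a JSON object")

    url = actor_input.get("url")
    if not isinstance(url, str) or not url.strip():
        raise ValueError("Actor input must contain a non-empty 'url' string")

    # Accept only absolute http(s) URLs, since the example fetches over HTTP.
    url = url.strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Unsupported or malformed URL: {url!r}")
    return url
```

Inside the async with Actor: block, this could be used as url = get_start_url(await Actor.get_input()), so that a bad input produces a clear error message before any request is made.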