Version: 3.0

Crawl multiple URLs

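This example crawls a fixed list of URLs, passed as an array directly to crawler.run(). The same task is shown with CheerioCrawler, PuppeteerCrawler, and PlaywrightCrawler.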

Using CheerioCrawler:

import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';

await Actor.init();

const crawler = new CheerioCrawler({
    // Called once for each URL; `$` is the Cheerio handle for the parsed HTML
    async requestHandler({ request, $ }) {
        const title = $('title').text();
        console.log(`URL: ${request.url}\nTITLE: ${title}`);
    },
});

// Run the crawler
await crawler.run([
    'http://www.example.com/page-1',
    'http://www.example.com/page-2',
    'http://www.example.com/page-3',
]);

await Actor.exit();

Using PuppeteerCrawler:

Tip: To run this example on the Apify Platform, select the apify/actor-node-puppeteer-chrome image for your Dockerfile.

import { Actor } from 'apify';
import { PuppeteerCrawler } from 'crawlee';

await Actor.init();

const crawler = new PuppeteerCrawler({
    // Called once for each URL; `page` is the Puppeteer page opened for it
    async requestHandler({ request, page }) {
        const title = await page.title();
        console.log(`URL: ${request.url}\nTITLE: ${title}`);
    },
});

// Run the crawler
await crawler.run([
    'http://www.example.com/page-1',
    'http://www.example.com/page-2',
    'http://www.example.com/page-3',
]);

await Actor.exit();

Using PlaywrightCrawler:

Tip: To run this example on the Apify Platform, select the apify/actor-node-playwright-chrome image for your Dockerfile.

import { Actor } from 'apify';
import { PlaywrightCrawler } from 'crawlee';

await Actor.init();

const crawler = new PlaywrightCrawler({
    // Called once for each URL; `page` is the Playwright page opened for it
    async requestHandler({ request, page }) {
        const title = await page.title();
        console.log(`URL: ${request.url}\nTITLE: ${title}`);
    },
});

// Run the crawler
await crawler.run([
    'http://www.example.com/page-1',
    'http://www.example.com/page-2',
    'http://www.example.com/page-3',
]);

await Actor.exit();
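
Preparing the URL list:

In all three examples the URL list is hard-coded. When the list comes from user input or a file, it may help to clean it up before handing it to the crawler. The following is a minimal sketch, not part of Crawlee: buildUrlList is a hypothetical helper that trims entries, drops anything that is not an absolute http(s) URL, and removes duplicates, using only the standard URL class.

```javascript
// Hypothetical helper (not a Crawlee API): validates and deduplicates
// start URLs before they are passed to crawler.run().
function buildUrlList(rawUrls) {
    const seen = new Set();
    const urls = [];
    for (const raw of rawUrls) {
        const trimmed = String(raw).trim();
        if (trimmed === '') continue; // skip blank entries

        let parsed;
        try {
            parsed = new URL(trimmed);
        } catch {
            continue; // skip strings that are not absolute URLs
        }

        // Keep only web URLs
        if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') continue;

        // Skip URLs that were already added
        if (seen.has(parsed.href)) continue;
        seen.add(parsed.href);
        urls.push(parsed.href);
    }
    return urls;
}

const startUrls = buildUrlList([
    'http://www.example.com/page-1',
    ' http://www.example.com/page-2 ',
    'http://www.example.com/page-1', // duplicate, dropped
    'not a url',                     // invalid, dropped
]);
console.log(startUrls);
```

The resulting array can be passed in place of the literal list, for example await crawler.run(startUrls), in any of the examples above.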