Waiting for elements and events
Learn the importance of waiting for content and events before running interaction or extraction code, as well as the best practices for doing so.
In a perfect world, every piece of content served on a website would load instantaneously. We don't live in a perfect world, though, and it often takes anywhere from a tenth of a second to a few seconds for content to load onto a page. Certain elements are also generated dynamically, which means they are not present in the initial HTML. Instead, they are created later by scripts, often from data returned by API calls.
Puppeteer and Playwright don't sit around waiting for a page (or specific elements) to load, though. If we tell them to do something with an element that hasn't been rendered yet, they may try to do it right away, which results in nasty errors. We've got to tell them to wait.
For a thorough explanation of how dynamic rendering works, give Dynamic pages a quick read-through and check out the examples.
Different events and elements can be waited for using the various waitFor... methods offered.
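To make the idea of waiting concrete, here is a minimal sketch of what a waitFor-style method does conceptually: it keeps checking a condition until the condition is met or a timeout expires. The waitForCondition helper and its option names are hypothetical. They are not part of Puppeteer or Playwright, whose real implementations are more sophisticated.

```javascript
// Hypothetical helper, for illustration only (not a Puppeteer or Playwright API).
// Polls `check` until it returns a truthy value or `timeout` milliseconds pass.
async function waitForCondition(check, { timeout = 5000, interval = 100 } = {}) {
    const deadline = Date.now() + timeout;
    while (true) {
        const result = await check();
        if (result) return result;
        if (Date.now() >= deadline) {
            throw new Error(`Condition not met within ${timeout} ms`);
        }
        // Pause briefly before checking again
        await new Promise((resolve) => setTimeout(resolve, interval));
    }
}

// Simulate content that only "appears" after 300 ms
let content = null;
setTimeout(() => { content = 'Loaded!'; }, 300);

// Resolves with the content once it exists, instead of failing immediately
waitForCondition(() => content).then((value) => console.log(value));
```

Acting on the content straight away would find null. Waiting first gives the page time to produce it.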
Elements
In the previous lesson, we ran into an error with Puppeteer due to the fact that we weren't waiting for the .g a selector to be present on the page before clicking it. The same error didn't occur in Playwright, because page.click() automatically waits for the element to be visible on the page before clicking it.
Elements with specific selectors can be waited for by using the page.waitForSelector() function. Let's use this knowledge to wait for the first result to be present on the page prior to clicking on it:
// This example is relevant for Puppeteer only!
import puppeteer from 'puppeteer';
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto('https://www.google.com/');
await page.click('button + button');
await page.type('textarea[title]', 'hello world');
await page.keyboard.press('Enter');
// Wait for the element to be present on the page prior to clicking it
await page.waitForSelector('.g a');
await page.click('.g a');
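// Keep the browser open for 10 seconds before closing it
// (page.waitForTimeout() was removed in newer Puppeteer versions)
await new Promise((resolve) => setTimeout(resolve, 10000));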
await browser.close();
Now, we won't see the error message anymore, and the first result will be successfully clicked by Puppeteer.
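If the element never appears, page.waitForSelector() rejects once its timeout elapses (30 seconds by default). The sketch below shows one way to handle that for optional elements, such as a cookie banner that is not always displayed. The clickIfPresent helper is hypothetical, and the check on error.name assumes the library throws an error named TimeoutError, as recent versions of Puppeteer and Playwright do.

```javascript
// Assumes `page` is a Puppeteer (or Playwright) Page created as in the example above.
// `clickIfPresent` is a hypothetical helper name used only for illustration.
async function clickIfPresent(page, selector, timeout = 5000) {
    try {
        // Wait up to `timeout` ms for the element to be present
        await page.waitForSelector(selector, { timeout });
    } catch (error) {
        // Assumption: a timed-out wait throws an error named "TimeoutError"
        if (error.name === 'TimeoutError') return false;
        throw error; // anything else is unexpected, so let it surface
    }
    await page.click(selector);
    return true;
}
```

With this in place, a call such as await clickIfPresent(page, 'button + button') would click the button when it shows up and return false when it does not.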
Playwright also has a page.waitForSelector() function. It's useful in scenarios other than clicking, or when you need more granular control over the waiting process.
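As one sketch of that more granular control in Playwright, the state option of page.waitForSelector() selects what to wait for: 'attached', 'detached', 'visible' (the default), or 'hidden'. The two helper names below are hypothetical, and the 10-second timeout is an arbitrary choice.

```javascript
// Assumes `page` is a Playwright Page. Both helper names are hypothetical.

// Wait until the element is in the DOM and visible, then read its text
async function getTextWhenVisible(page, selector, timeout = 10000) {
    const element = await page.waitForSelector(selector, { state: 'visible', timeout });
    return element.textContent();
}

// Wait until the element is hidden or removed from the DOM
// (for example, a loading spinner that disappears once data has arrived)
async function waitUntilGone(page, selector, timeout = 10000) {
    await page.waitForSelector(selector, { state: 'hidden', timeout });
}
```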
Navigation
As you may recall, after clicking the first result, we want to log the title of the result's page to the console and save a screenshot to the filesystem. To grab a solid screenshot of the loaded page, though, we should wait for the navigation to finish before snapping the image. This can be done with page.waitForNavigation().
A navigation is when a new page load happens. First, the domcontentloaded event is fired, then the load event. page.waitForNavigation() will wait for the load event to fire.
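The event to wait for can also be chosen explicitly through the waitUntil option, which both page.goto() and page.waitForNavigation() accept. The sketch below uses page.goto() to keep things simple. The openAndWait helper is a hypothetical name, not part of either library.

```javascript
// Assumes `page` is a Puppeteer (or Playwright) Page created as in the other examples.
// `openAndWait` is a hypothetical helper name used only for illustration.
async function openAndWait(page, url, waitUntil = 'load') {
    // 'domcontentloaded' resolves once the HTML has been parsed;
    // 'load' also waits for resources such as images and stylesheets
    await page.goto(url, { waitUntil });
    return page.title();
}
```

Passing 'domcontentloaded' will usually return sooner than 'load', at the risk of images not being loaded yet, which matters when taking a screenshot.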
Naively, you might immediately think that this is the way we should wait for navigation after clicking the first result:
await page.click('.g a');
await page.waitForNavigation();
Though in theory this is correct, it can result in a race condition in which the page navigates before the page.waitForNavigation() function is ever run. Once it is finally called, it will hang, waiting for a load event that has already fired, until it times out. To solve this, we can put the waiting logic and the clicking logic into a Promise.all() call (placing page.waitForNavigation() first).
await Promise.all([page.waitForNavigation(), page.click('.g a')]);
Though the line of code above is also valid in Playwright, it is recommended to use page.waitForLoadState('load') instead of page.waitForNavigation(), as it automatically handles the issues being solved by using Promise.all().
await page.click('.g a');
await page.waitForLoadState('load');
The Promise.all() implementation will do the following:
- Begin waiting for the page to navigate without blocking the page.click() function
- Click the element, firing off a navigation event
- Resolve once the page has navigated, allowing further code to run
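The race condition described earlier can be reproduced without a browser. The sketch below uses a hypothetical FakePage class built on the standard EventTarget as a stand-in for a real page: its click() triggers a very fast "navigation", and its waitForNavigation() only notices load events fired after it starts listening. None of this is Puppeteer's actual implementation, and the short timeout is chosen only to keep the demo quick.

```javascript
// Hypothetical stand-in for a browser page, for illustration only
class FakePage extends EventTarget {
    // Resolves on the next "load" event fired AFTER this call, or rejects on timeout
    waitForNavigation(timeout = 200) {
        return new Promise((resolve, reject) => {
            const timer = setTimeout(() => reject(new Error('Navigation timeout')), timeout);
            this.addEventListener('load', () => {
                clearTimeout(timer);
                resolve('loaded');
            }, { once: true });
        });
    }

    // Simulates a click that navigates quickly: "load" fires before click() resolves
    async click() {
        await new Promise((resolve) => setTimeout(resolve, 10));
        this.dispatchEvent(new Event('load'));
    }
}

// Naive order: by the time we start listening, "load" has already fired
async function naive(page) {
    await page.click();
    return page.waitForNavigation();
}

// Safe order: the listener is registered before the click happens
async function safe(page) {
    const [result] = await Promise.all([page.waitForNavigation(), page.click()]);
    return result;
}

naive(new FakePage()).catch((error) => console.log('naive:', error.message));
safe(new FakePage()).then((result) => console.log('safe:', result));
```

Here the naive version rejects with the timeout error, while the Promise.all() version resolves. With a real page the naive version fails only some of the time, which is what makes the bug hard to spot.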
Our code so far
Here's what our project's code looks like so far, first for Playwright and then for Puppeteer:
import * as fs from 'fs/promises';
import { chromium } from 'playwright';
const browser = await chromium.launch({ headless: false });
// Create a page and visit Google
const page = await browser.newPage();
await page.goto('https://google.com');
// Agree to the cookies policy
await page.click('button:has-text("Accept all")');
// Type the query and visit the results page
await page.type('textarea[title]', 'hello world');
await page.keyboard.press('Enter');
// Click on the first result
await page.click('.g a');
await page.waitForLoadState('load');
// Our title extraction and screenshotting logic
// will go here
await page.waitForTimeout(10000);
await browser.close();
import * as fs from 'fs/promises';
import puppeteer from 'puppeteer';
const browser = await puppeteer.launch({ headless: false });
// Create a page and visit Google
const page = await browser.newPage();
await page.goto('https://google.com');
// Agree to the cookies policy
await page.click('button + button');
// Type the query and visit the results page
await page.type('textarea[title]', 'hello world');
await page.keyboard.press('Enter');
// Wait for the first result to appear on the page,
// then click on it
await page.waitForSelector('.g a');
await Promise.all([page.waitForNavigation(), page.click('.g a')]);
// Our title extraction and screenshotting logic
// will go here
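// Keep the browser open for 10 seconds before closing it
// (page.waitForTimeout() was removed in newer Puppeteer versions)
await new Promise((resolve) => setTimeout(resolve, 10000));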
await browser.close();
Next up
In the final lesson of the Opening & controlling a page section of this course, we'll learn about various methods on Page that aren't directly related to interacting with a page or waiting for things, and we'll add the finishing touches to our mini-project (grabbing the page title and taking a screenshot).