Version: 3.4

Browser AI agents with Browser Use

In this guide, you'll learn how to use the Browser Use library to drive a browser with an LLM agent in your Apify Actors.

Introduction

Browser Use is a Python library that lets an LLM control a real web browser. Instead of writing selectors and navigation steps by hand, you give an agent a natural-language task, such as "find the top post on Hacker News and return its title and URL". The agent then decides which pages to open, what to click, and what to read until the task is done.

Browser Use fits well into Apify Actors for several reasons:

  • Describe what you want in plain English and the agent figures out the steps. This is especially useful with pages whose structure changes often or is hard to target with fixed selectors.
  • Browser Use ships wrappers for many providers, for example ChatOpenAI, ChatAnthropic, or ChatGoogle. You can pick the model that fits your task and budget.
  • Pass a Pydantic model as the output schema and the agent returns a validated object that maps onto an Apify dataset.
  • The agent drives a real Chromium over the Chrome DevTools Protocol, so JavaScript-heavy pages render just like they would for a human.
  • The agent's run method is asynchronous, which integrates naturally with the asyncio-based Apify SDK.

Browser Use needs only the browser-use package. To install it, use:

pip install browser-use

Configuring the LLM

Browser Use needs an LLM to drive the agent. You choose a provider wrapper, give it a model name, and supply the provider's API key:

  • ChatOpenAI - OpenAI models such as gpt-4.1-mini or gpt-5-mini. Reads the key from OPENAI_API_KEY, or accepts it via the api_key argument.
  • ChatAnthropic - Anthropic Claude models such as claude-sonnet-4-5 or claude-haiku-4-5. Reads the key from ANTHROPIC_API_KEY.
  • ChatGoogle - Google Gemini models such as gemini-2.5-flash. Reads the key from GOOGLE_API_KEY.

The example Actor in this guide uses ChatOpenAI, but switching providers is a one-line change in run_agent_task. More capable models generally complete tasks in fewer steps and more reliably, while smaller models are cheaper per step.
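As a minimal sketch of keeping the provider choice in one place, the helper below maps a provider name to the environment variable listed above and reads the key from it. The `PROVIDER_ENV_VARS` table and the `resolve_api_key` function are hypothetical names used for illustration; they are not part of Browser Use or the Apify SDK.

```python
import os

# Environment variable that each provider wrapper reads, as listed above.
PROVIDER_ENV_VARS = {
    'openai': 'OPENAI_API_KEY',
    'anthropic': 'ANTHROPIC_API_KEY',
    'google': 'GOOGLE_API_KEY',
}


def resolve_api_key(provider, env=None):
    """Return the API key for `provider`, or raise if it is unknown or unset."""
    env = os.environ if env is None else env
    try:
        var_name = PROVIDER_ENV_VARS[provider]
    except KeyError:
        raise ValueError(f'Unsupported provider: {provider!r}') from None
    api_key = env.get(var_name)
    if not api_key:
        raise RuntimeError(f'The {var_name} environment variable is not set.')
    return api_key
```

With a helper like this, switching providers means changing the provider name and the wrapper class together, while the key lookup stays the same.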

Keep the API key out of the Actor input and source code. The example reads it from an environment variable (for example OPENAI_API_KEY). On the Apify platform, set it as a secret environment variable; locally, export it in your shell.

Example Actor

The following Actor runs a Browser Use agent for a single task and stores its structured result in the default dataset. By default, it opens Hacker News and returns the title and URL of the top five posts, but the task, model, and step limit are all configurable through the Actor input.

The whole Actor fits in a single file. A run_agent_task helper holds the Browser Use-specific logic: it defines the output schema and builds the LLM, browser, and agent. The main coroutine handles the Actor lifecycle, reads the input, sets up Apify Proxy, runs the agent, and stores the result:

import asyncio
import os
from urllib.parse import urlsplit

from browser_use import Agent, Browser, ChatOpenAI
from browser_use.browser import ProxySettings
from pydantic import BaseModel

from apify import Actor

# Default task, aligned with the `Posts` schema below.
DEFAULT_TASK = (
    'Open https://news.ycombinator.com and return the title and URL '
    'of the top 5 posts on the front page.'
)


class Post(BaseModel):
    """A single item the agent is asked to extract."""

    title: str
    url: str


class Posts(BaseModel):
    """The structured result returned by the agent."""

    posts: list[Post]


def to_browser_use_proxy(proxy_url: str) -> ProxySettings:
    """Convert an Apify Proxy URL into Browser Use `ProxySettings`."""
    parts = urlsplit(proxy_url)
    return ProxySettings(
        server=f'{parts.scheme}://{parts.hostname}:{parts.port}',
        username=parts.username,
        password=parts.password,
    )


async def run_agent_task(
    task: str,
    *,
    model: str,
    llm_api_key: str,
    max_steps: int,
    headless: bool = True,
    proxy_url: str | None = None,
) -> Posts | None:
    """Run a Browser Use agent for one task and return its structured output."""
    # Configure the LLM. Swap `ChatOpenAI` for another provider if needed.
    llm = ChatOpenAI(model=model, api_key=llm_api_key)

    # Configure the browser, optionally routed through a proxy.
    browser = Browser(
        headless=headless,
        proxy=to_browser_use_proxy(proxy_url) if proxy_url else None,
    )

    # `output_model_schema` returns a validated `Posts`; signals stay with the Actor.
    agent = Agent(
        task=task,
        llm=llm,
        browser=browser,
        output_model_schema=Posts,
        enable_signal_handler=False,
    )

    history = await agent.run(max_steps=max_steps)
    return history.structured_output


async def main() -> None:
    async with Actor:
        # Read the Actor input.
        actor_input = await Actor.get_input() or {}
        task = actor_input.get('task', DEFAULT_TASK)
        model = actor_input.get('model', 'gpt-4.1-mini')
        max_steps = actor_input.get('maxSteps', 25)

        # Read the LLM API key from the environment (set it as a secret on Apify).
        llm_api_key = os.environ.get('OPENAI_API_KEY')
        if not llm_api_key:
            raise RuntimeError('The OPENAI_API_KEY environment variable is not set.')

        # Route the browser through Apify Proxy.
        proxy_configuration = await Actor.create_proxy_configuration()
        proxy_url = await proxy_configuration.new_url() if proxy_configuration else None

        Actor.log.info(f'Running the agent (model={model}) for task: {task}')

        result = await run_agent_task(
            task,
            model=model,
            llm_api_key=llm_api_key,
            max_steps=max_steps,
            headless=Actor.configuration.headless,
            proxy_url=proxy_url,
        )

        if result is None:
            Actor.log.warning('The agent did not return any structured output.')
            return

        # Store each extracted item as a dataset row.
        Actor.log.info(f'The agent returned {len(result.posts)} post(s); storing them.')
        for post in result.posts:
            Actor.log.info(f'Storing post: {post.title!r} ({post.url})')
            await Actor.push_data(post.model_dump())


if __name__ == '__main__':
    asyncio.run(main())

Note that:

  • Keeping the agent setup in run_agent_task separates the Browser Use-specific code from the Actor's orchestration logic. main only decides what to read from the input and what to store.
  • Passing output_model_schema=Posts makes the agent return a validated Posts instance via history.structured_output, so main can push each item straight to the dataset. Adapt the task and the Post/Posts models together to fit your own use case.
  • enable_signal_handler=False leaves signal handling to the Actor, which manages the run's lifecycle. Without it, Browser Use would install its own handlers and interfere with a clean shutdown.
  • headless=Actor.configuration.headless takes the headless setting from the Actor configuration, so on the platform the browser runs without a visible window.
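The input handling in main can be made stricter if needed. The sketch below assumes the same input keys as the example (task, model, maxSteps) and the same defaults; `AgentInput` and `parse_actor_input` are hypothetical names, not part of the Apify SDK.

```python
from dataclasses import dataclass

DEFAULT_MODEL = 'gpt-4.1-mini'
DEFAULT_MAX_STEPS = 25


@dataclass(frozen=True)
class AgentInput:
    """Validated settings for a single agent run."""

    task: str
    model: str
    max_steps: int


def parse_actor_input(actor_input, default_task):
    """Apply defaults and reject a step limit that is not a positive integer."""
    task = actor_input.get('task') or default_task
    model = actor_input.get('model') or DEFAULT_MODEL
    max_steps = actor_input.get('maxSteps', DEFAULT_MAX_STEPS)
    # `bool` is a subclass of `int`, so exclude it explicitly.
    if isinstance(max_steps, bool) or not isinstance(max_steps, int) or max_steps < 1:
        raise ValueError(f'maxSteps must be a positive integer, got {max_steps!r}')
    return AgentInput(task=task, model=model, max_steps=max_steps)
```

In main, a call such as parse_actor_input(actor_input, DEFAULT_TASK) would replace the three actor_input.get lines, so an invalid step limit fails before the browser starts.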

The example runs one agent per Actor run, so each browser profile stays isolated. If you parallelize tasks within a single Actor, give every agent its own Browser instance with its own user_data_dir. Several concurrent agents sharing one profile can corrupt it.
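One way to keep profiles separate is to create a fresh directory per agent before constructing the browsers. The following is a minimal sketch using only the standard library; `make_profile_dirs` is a hypothetical helper, and passing the path to Browser as user_data_dir follows the option named above rather than a verified signature.

```python
import tempfile
from pathlib import Path


def make_profile_dirs(count, base_dir=None):
    """Create one fresh, empty profile directory per agent."""
    return [
        Path(tempfile.mkdtemp(prefix=f'agent-{index}-', dir=base_dir))
        for index in range(count)
    ]


profile_dirs = make_profile_dirs(3)

# Assumed usage, based on the `user_data_dir` option mentioned above:
#   browsers = [Browser(headless=True, user_data_dir=str(path)) for path in profile_dirs]
# Each agent then receives its own `Browser`, and the tasks can run concurrently.
```

Because tempfile.mkdtemp always creates a new directory, no two agents end up pointing at the same profile.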

Using Apify Proxy

Running on the Apify platform gives your agent access to Apify Proxy, which rotates IP addresses to avoid rate limiting and blocking. In the example above, main creates a proxy configuration with Actor.create_proxy_configuration and passes a fresh proxy URL to run_agent_task.

Browser Use expects the proxy as a ProxySettings object with separate server, username, and password fields, whereas ProxyConfiguration.new_url returns a single URL string (for example http://user:pass@proxy.apify.com:8000). The to_browser_use_proxy helper splits that URL into the fields Browser Use expects. To select specific proxy groups or a country, pass the relevant arguments to Actor.create_proxy_configuration. For details, see Proxy management.
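The split itself can be illustrated with the standard library alone. The sketch below mirrors what to_browser_use_proxy does in the example Actor, but returns a plain dictionary instead of ProxySettings so that it runs without Browser Use installed; `split_proxy_url` is a hypothetical name.

```python
from urllib.parse import urlsplit


def split_proxy_url(proxy_url):
    """Split a proxy URL into server, username, and password fields."""
    parts = urlsplit(proxy_url)
    return {
        'server': f'{parts.scheme}://{parts.hostname}:{parts.port}',
        'username': parts.username,
        'password': parts.password,
    }


fields = split_proxy_url('http://user:pass@proxy.apify.com:8000')
print(fields)
# {'server': 'http://proxy.apify.com:8000', 'username': 'user', 'password': 'pass'}
```

Note that urlsplit does not percent-decode credentials, so a username or password containing encoded characters would need urllib.parse.unquote applied before use.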

Running on the Apify platform

Browser Use drives a real Chromium over CDP, so the Actor needs a browser binary available at runtime. The simplest way to provide one is to build on top of the Apify Playwright base image, which already ships a browser together with all of its system-level dependencies. Browser Use discovers that browser automatically, so no extra install step is needed in the image.

To disable Browser Use's telemetry and cloud sync inside the Actor, set the ANONYMIZED_TELEMETRY=false and BROWSER_USE_CLOUD_SYNC=false environment variables in your Dockerfile.
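If editing the Dockerfile is not an option, one possible alternative is to set the same two variables from Python before browser_use is imported. This is a sketch under the assumption that Browser Use reads them from the process environment; the Dockerfile approach above remains the documented one, and `disable_telemetry` is a hypothetical helper.

```python
import os


def disable_telemetry(env=None):
    """Set the opt-out variables unless they are already defined."""
    env = os.environ if env is None else env
    # `setdefault` keeps any value already set in the Dockerfile or on the platform.
    env.setdefault('ANONYMIZED_TELEMETRY', 'false')
    env.setdefault('BROWSER_USE_CLOUD_SYNC', 'false')


disable_telemetry()

# Assumption: import `browser_use` only after this point, so that it sees the values.
```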

When running the Actor locally, install the browser once with the browser-use install command, which downloads a Chromium build together with its dependencies:

browser-use install

Remember to provide the LLM API key in both environments: as a secret environment variable on the platform, and exported in your shell when running locally.

Conclusion

In this guide, you learned how to use Browser Use in your Apify Actors. You can now drive a real browser with an LLM agent, return its results as a validated Pydantic model, route the browser through Apify Proxy, and run the whole thing on the Apify platform. To get started with your own automation tasks, see the Actor templates. If you have questions or need assistance, feel free to reach out on our GitHub or join our Discord community. Happy automating!

Additional resources