Datacenter proxy

Learn how to reduce blocking when web scraping using IP address rotation. See proxy parameters and learn to implement Apify Proxy in an application.


Datacenter proxies are a cheap, fast, and stable way to mask your identity online. When you access a website using a datacenter proxy, the site sees only the datacenter's IP address, not yours.

Datacenter proxies allow you to mask and rotate your IP address during web scraping and automation jobs, reducing the chance that your requests are blocked. For each HTTP/S request, the proxy takes the list of all available IP addresses and selects the one that was used least recently for the specific hostname.
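The rotation rule described above can be sketched in a few lines. This is only an illustration of per-hostname least-recently-used selection, not Apify Proxy's actual implementation; the class name `HostnameRotator` and the IP addresses in the usage note are hypothetical.

```javascript
// Illustrative sketch only: per-hostname least-recently-used IP selection.
class HostnameRotator {
    constructor(ipAddresses) {
        this.ipAddresses = [...ipAddresses];
        // hostname -> Map(ip -> logical time of last use for that hostname)
        this.lastUsed = new Map();
        // Logical clock; avoids ties that real timestamps could produce.
        this.tick = 0;
    }

    // Returns the IP that was used least recently for the given hostname.
    pick(hostname) {
        if (this.ipAddresses.length === 0) {
            throw new Error('No proxy IP addresses available');
        }
        if (!this.lastUsed.has(hostname)) {
            this.lastUsed.set(hostname, new Map());
        }
        const usage = this.lastUsed.get(hostname);

        let best = this.ipAddresses[0];
        let bestTick = usage.get(best) ?? -1;
        for (const ip of this.ipAddresses) {
            // IPs never used for this hostname count as oldest.
            const lastTick = usage.get(ip) ?? -1;
            if (lastTick < bestTick) {
                best = ip;
                bestTick = lastTick;
            }
        }

        usage.set(best, this.tick++);
        return best;
    }
}
```

With a pool such as `['10.0.0.1', '10.0.0.2', '10.0.0.3']`, repeated calls to `pick('example.com')` cycle through all three addresses before reusing the first, while a different hostname starts its own cycle independently.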

You can refer to our blog post for tips on how to make the most out of datacenter proxies.

Features

  • Periodic health checks of proxies in the pool so requests are not forwarded via dead proxies.
  • Intelligent rotation of IP addresses so target hosts are accessed via the proxies that accessed them least recently, to reduce the chance of blocking.
  • Periodically checks whether proxies are banned by selected target websites. If they are, stops forwarding traffic to them to get the proxies unbanned as soon as possible.
  • Ensures proxies are located in specific countries using IP geolocation.
  • Allows selection of groups of proxy servers with specific characteristics.
  • Supports persistent sessions that enable you to keep the same IP address for certain parts of your crawls.
  • Measures statistics of traffic for specific users and hostnames.
  • Allows selection of proxy servers by country.

Datacenter proxy types

When using Apify's datacenter proxies, you can either select a proxy group or use the auto mode. Proxy groups are either shared across multiple customers or dedicated to a single customer.

Shared proxy groups

Each user has access to a selected number of proxy servers from a shared pool. These servers are spread into groups (called proxy groups). Each group shares a common feature (location, provider, speed, etc.).

For a full list of plans and the number of proxy servers allocated to each plan, see our pricing. To get access to more servers, you can upgrade your plan in the subscription settings.

Dedicated proxy groups

When you purchase access to dedicated proxy groups, they are assigned to you, and only you can use them. You gain access to a range of static IP addresses from these groups.

This feature is also useful if you have your own pool of proxy servers and still want to benefit from the features of Apify Proxy (like IP address rotation, persistent sessions, and health checking). If you do not have your own pool, the customer support team can set up a dedicated group for you based on your needs and requirements.

Prices for dedicated proxy servers are mainly based on the number of proxy servers, their type, and location. Contact us for more information.

Connecting to datacenter proxies

By default, each proxied HTTP request may be sent via a different target proxy server, which adds overhead and can cause problems for websites that save cookies based on IP address.

If you want to pick an IP address and pass all subsequent connections via that same IP address, you can use the session parameter.

Username parameters

The username field enables you to pass various parameters, such as groups, session and country, for your proxy connection.

This parameter is optional. By default, the proxy uses all available proxy servers from all groups you have access to.

If you do not want to specify either groups or session parameters and therefore use the default behavior for both, set the username to auto.
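As a rough illustration of how such a username might be assembled, the helper below joins the parameters into one string and falls back to `auto` when nothing is specified. The function name `buildProxyUsername` is hypothetical. The comma separator between parameters and the `+` separator between multiple groups are assumptions, so check them against the proxy documentation before relying on them.

```javascript
// Hypothetical helper: builds the proxy username from optional parameters.
// Assumptions: parameters are comma-separated, multiple groups are joined
// with "+", and "auto" is used when no parameter is given.
function buildProxyUsername({ groups = [], session, country } = {}) {
    const parts = [];
    if (groups.length > 0) parts.push(`groups-${groups.join('+')}`);
    if (session !== undefined) parts.push(`session-${session}`);
    if (country !== undefined) parts.push(`country-${country}`);
    // No parameters at all means default behavior, i.e. "auto".
    return parts.length > 0 ? parts.join(',') : 'auto';
}
```

For example, `buildProxyUsername({ groups: ['BUYPROXIES94952'], session: '123' })` produces `groups-BUYPROXIES94952,session-123`, and `buildProxyUsername()` produces `auto`.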

Examples

import { Actor } from 'apify';
import { PuppeteerCrawler } from 'crawlee';

await Actor.init();

// With no options, the proxy configuration uses the default (auto) settings.
const proxyConfiguration = await Actor.createProxyConfiguration();

const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    async requestHandler({ page }) {
        // Print the page content, which reports the IP address the proxy used.
        console.log(await page.content());
    },
});

await crawler.run(['https://proxy.apify.com/?format=json']);

await Actor.exit();

Session persistence

When you use a datacenter proxy with the session parameter set in the username field, a single IP address is assigned to that session ID when you make the first request.

Session IDs represent IP addresses. Therefore, you can manage the IP addresses you use by managing sessions.

This IP/session ID combination is persisted and expires 26 hours later. Each additional request resets the expiration time to 26 hours.

If you use the session at least once a day, it will never expire, with two possible exceptions:

  • The proxy server stops responding and is marked as dead during a health check.
  • The proxy server is part of a proxy group that is refreshed monthly and is rotated out.

If the session is discarded for one of the reasons above, it is assigned a new IP address.
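The expiration behavior described above (a 26-hour lifetime that restarts with every request, and a new IP address once a session is discarded) can be modeled as follows. This is a simplified sketch for reasoning about the rules, not the proxy's real implementation; `SessionTable` and the `allocateIp` callback are hypothetical names.

```javascript
// 26 hours in milliseconds, the session lifetime stated in the text.
const SESSION_TTL_MS = 26 * 60 * 60 * 1000;

// Simplified model of session-to-IP bookkeeping (illustration only).
class SessionTable {
    constructor(allocateIp) {
        // Hypothetical callback that returns an IP address for a new session.
        this.allocateIp = allocateIp;
        // sessionId -> { ip, expiresAt }
        this.sessions = new Map();
    }

    // Returns the IP for a session and resets its expiration to 26 hours.
    resolve(sessionId, now = Date.now()) {
        let entry = this.sessions.get(sessionId);
        if (!entry || entry.expiresAt <= now) {
            // Unknown or expired session: assign a new IP address.
            entry = { ip: this.allocateIp(), expiresAt: 0 };
            this.sessions.set(sessionId, entry);
        }
        entry.expiresAt = now + SESSION_TTL_MS;
        return entry.ip;
    }

    // Models a dead or rotated-out proxy: the next request gets a new IP.
    discard(sessionId) {
        this.sessions.delete(sessionId);
    }
}
```

In this model, a session that is used at least once every 26 hours keeps its IP address indefinitely, which matches the "at least once a day" guidance, and calling `discard` stands in for the two exceptions listed above.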

To learn more about sessions and IP address rotation, see the proxy overview page.

Examples using sessions

import { Actor } from 'apify';
import { PuppeteerCrawler } from 'crawlee';

await Actor.init();

const proxyConfiguration = await Actor.createProxyConfiguration();

const crawler = new PuppeteerCrawler({
    proxyConfiguration,
    // Limit the session pool to a single session so both requests reuse it.
    sessionPoolOptions: { maxPoolSize: 1 },
    async requestHandler({ page }) {
        console.log(await page.content());
    },
});

await crawler.run([
    'https://proxy.apify.com/?format=json',
    'https://proxy.apify.com',
]);

await Actor.exit();

Examples using standard libraries and languages

You can find your proxy password on the Proxy page of the Apify Console.

The username field is not your Apify username.
Instead, you specify proxy settings (e.g. groups-BUYPROXIES94952, session-123).
Use auto for default settings.

For examples using PHP, you need to have the cURL extension enabled in your PHP installation. See installation instructions for more information.

Examples in Python 2 use the six library. Run pip install six to enable it.

import axios from 'axios';

const proxy = {
    protocol: 'http',
    host: 'proxy.apify.com',
    port: 8000,
    // Replace <YOUR_PROXY_PASSWORD> below with your password
    // found at https://console.apify.com/proxy
    auth: { username: 'auto', password: '<YOUR_PROXY_PASSWORD>' },
};

const url = 'http://proxy.apify.com/?format=json';

const { data } = await axios.get(url, { proxy });

console.log(data);
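Some HTTP clients and tools expect the proxy as a single URL string rather than an object. The sketch below builds such a URL from the host and port used in the example above; the helper name `buildProxyUrl` is hypothetical. It relies on the standard `URL` class, which percent-encodes reserved characters such as `@` in the credentials.

```javascript
// Hypothetical helper: builds a proxy URL string for clients that accept one.
// Host and port are taken from the axios example above.
function buildProxyUrl(username, password) {
    const url = new URL('http://proxy.apify.com:8000');
    // The URL setters percent-encode reserved characters in the credentials.
    url.username = username;
    url.password = password;
    return url.href;
}
```

For instance, `buildProxyUrl('auto', '<YOUR_PROXY_PASSWORD>')` is meant to be called with your real proxy password in place of the placeholder; whether a given client accepts the resulting string depends on that client's proxy options.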