
Hell World Blog · 11 min read

Proxy Stack for AI Agents — browser-use, Computer Use, MCP, Operator

What proxy infrastructure actually works for LLM-driven web agents. Why agentic browser traffic looks suspicious to anti-bot, sticky vs rotating session strategy, residential vs ISP, and a sample wiring for browser-use and Anthropic Computer Use.

Maya Chen · #ai-agent #browser-use #mcp #automation #residential

We’ve been seeing more agent traffic from our customer base over the last twelve months than the previous five years combined. The shape changed too. Where the typical 2023 customer ran Playwright workers scraping the same five domains, the typical 2026 customer is running an LLM-driven agent that goes wherever its plan tells it to — price comparison across a dozen retailers, research that hops between SEC filings and earnings call transcripts, checkout automation against a long tail of small sites. The proxy decisions for that workload are not the decisions you’d make for a 2023 scraper, and a lot of agent builders are getting them wrong.

This post is the practical version of what we tell customers when they come in saying their agent works locally but breaks in production. We’ll cover why agent traffic looks suspicious to anti-bot stacks, what session strategy actually works, how to pick the pool by target tier, and a few wiring patterns we see customers settle on for browser-use, Anthropic Computer Use, OpenAI Operator, and MCP-style tool servers. None of this is theory — it’s a digest of patterns from our support queue, with references back to the residential, ISP static, mobile, and datacenter products where they fit.

Why agent traffic is anti-bot bait

Drop a fresh LLM agent on a Cloudflare-protected site with a clean residential IP and watch it fail. The IP is fine. The agent is not. The reason is the same reason early Selenium scripts failed in the 2010s: the traffic does not look like a human even when the IP does.

Four patterns trip agents up reliably.

Timing is too consistent. A human has variable dwell time — scan for two seconds, click, scan for eight, scroll, get distracted, come back. An LLM-driven agent has a deterministic loop: receive screenshot or DOM, think for some milliseconds, click, wait for page load, repeat. The intervals between actions cluster tightly around the model’s response latency. Behavioral models pick this up over a session — after ten interactions, the timing histogram doesn’t look human.
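One common mitigation is to randomize the inter-action delay instead of letting it collapse onto the model's response latency. A minimal sketch (the distribution parameters are illustrative, not tuned against any particular anti-bot vendor):

```python
import random

def human_delay(base_s: float = 1.0) -> float:
    """Return a randomized dwell time instead of a fixed inter-action gap.

    Mixes a log-normal 'reading' delay with an occasional long pause,
    so the timing histogram doesn't cluster around model latency.
    """
    delay = random.lognormvariate(0, 0.5) * base_s  # most delays ~0.5-2s
    if random.random() < 0.1:                       # occasional distraction
        delay += random.uniform(4, 12)
    return delay

# Between agent actions:
# time.sleep(human_delay())
```

This doesn't make the agent human; it just keeps the first ten interactions from producing a histogram that only a script could generate.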

The viewport gives it away. browser-use, the Playwright-based library that’s become the de facto open-source agent stack, defaults to headless Chromium with a fixed viewport. Anthropic Computer Use takes screenshots from a virtual display that’s typically 1024x768. OpenAI Operator runs on remote infrastructure with OpenAI’s fingerprint. None match the distribution of viewports anti-bot sees from real users.

Mouse trails are linear. When a Playwright agent clicks a coordinate, the cursor jumps straight to it. Real human cursors arc, overshoot, correct, jitter. PerimeterX and the Akamai sensor layer especially collect mouse movement and score linear teleports badly.
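One common workaround is to synthesize a curved, jittered path and feed its points to the driver's mouse-move call instead of clicking the target coordinate directly. A sketch using a quadratic Bezier with a random control point (the step count and jitter magnitudes are illustrative):

```python
import random

def mouse_path(start, end, steps=25):
    """Quadratic Bezier from start to end with a random control point,
    plus per-step jitter, instead of a straight-line teleport."""
    (x0, y0), (x1, y1) = start, end
    # Random control point bends the path off the straight line
    cx = (x0 + x1) / 2 + random.uniform(-100, 100)
    cy = (y0 + y1) / 2 + random.uniform(-100, 100)
    points = []
    for i in range(steps + 1):
        t = i / steps
        x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * cx + t ** 2 * x1
        y = (1 - t) ** 2 * y0 + 2 * (1 - t) * t * cy + t ** 2 * y1
        points.append((x + random.uniform(-1, 1), y + random.uniform(-1, 1)))
    # Pin the endpoints so the click lands exactly on target
    points[0], points[-1] = (x0, y0), (x1, y1)
    return points
```

In a Playwright-based agent, each point would go through `page.mouse.move(x, y)` before the final click, with a small delay between steps.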

The user-agent is the framework default. browser-use ships with the Playwright UA. Computer Use has its own default. Agents that don’t override these share a fingerprint with thousands of other agents on the same threat-intel feeds. By the time you point one at a real site, the default UA is already on a deny list.

The failure modes split across two layers. A premium residential IP will not save an agent whose timing and viewport scream automation; a tuned agent with proper stealth still needs a clean IP. The layers compound, and builders who invest in only one wonder why their success rate is mediocre.

The three-layer detection problem

The proxy is one of three layers, and customers who only think about the proxy are mis-allocating budget.

Layer 1: TLS fingerprint. JA3 and JA4 are hashes of the TLS ClientHello — cipher suite list, extension order, supported groups. Real Chrome produces a specific JA4. Python requests with default urllib3 produces a different one that maps cleanly onto every other Python script in the world. A TLS fingerprint that doesn’t match the user-agent header is a strong automation signal. Proxies don’t change your TLS fingerprint — the proxy is transparent, your client still produces the ClientHello — so this is the agent’s job.

Layer 2: IP class. This is where the proxy lives. Datacenter ASNs (AWS, GCP, the major colos) are flagged elevated risk by default. Residential ASNs from real ISPs score higher. Mobile carriers score highest because the false-positive cost of blocking a carrier IP is too expensive for the site. ISP-class proxies — datacenter-hosted on residential ASNs — sit between.

Layer 3: Behavior. Once the request is in, the session is being scored over time. Mouse movement, click cadence, scroll patterns, time on page. Agents are bad at this layer by default.

The trap: fix only one layer and the other two flag you. Premium residential with a Python TLS fingerprint will fail. Tuned headless Chrome on an AWS IP will fail. The proxy is part of the answer, not the whole answer.

Sticky vs rotating: the agent session problem

This is the biggest configuration mistake we see. Customers come from a scraping background where rotating-per-request is the default, configure their agent that way, and immediately break statefulness.

Agents are stateful. An agent doing a checkout has a cart, a session cookie, a CSRF token, and UI state that all assume the same browser is talking to the server across multiple requests. Rotate the IP between “add to cart” and “go to checkout” and the site invalidates the session — sometimes silently, sometimes by flagging IP-rotate as a hijack attempt. Either way the agent loses progress and retries, which burns bandwidth and ramps anti-bot suspicion.

The right setting for almost every agent workload is sticky sessions. On Hell World residential, sticky is configured via the username — a session token suffix like username-session-abc123 — and holds up to roughly 30 minutes before the upstream IP rotates. ISP static is sticky by definition. Mobile sticky is usually 10-30 minutes depending on pool.

The window matters. Sticky too short and you lose state mid-flow. Sticky too long on residential and the underlying IP rotates upstream regardless, because residential IPs are real ISP connections that cycle. Customers who succeed pick a window slightly longer than their longest task — for a 5-minute checkout, a 10-minute sticky is comfortable. For multi-hour research sessions, ISP static is safer because nothing rotates underneath.

A second pattern works for longer-running agents: rotate per-task, sticky within-task. The agent picks up a task from a queue, gets a fresh sticky session, runs to completion, drops the session. Next task gets a new one. This is the right shape for agent farms where tasks are independent — research, price-checks, monitoring.
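In code, rotate-per-task / sticky-within-task amounts to minting a fresh session token when a task is picked up and holding it for the task's lifetime. A sketch (gateway host and username syntax follow the session-suffix format described above; verify against your dashboard):

```python
import uuid

GATEWAY = "http://gate.hellworld.io:7777"  # assumed Hell World gateway

def proxy_for_task(account: str, password: str) -> dict:
    """Fresh sticky session per task: a new token means a new exit IP;
    reusing the token keeps the same exit for the life of the task."""
    token = uuid.uuid4().hex[:8]
    return {
        "server": GATEWAY,
        "username": f"{account}-session-{token}",
        "password": password,
    }

# Worker loop shape: one sticky session per queued task
# for task in queue:
#     proxy = proxy_for_task("your_account", "your_password")
#     run_agent(task, proxy=proxy)   # hold this proxy for the whole task
```

Per-task isolation also limits blast radius: if one exit gets flagged mid-task, the next task starts clean.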

Picking pool by target tier

Not every target needs the same proxy. The cost difference between datacenter and mobile is roughly 30x at retail, and customers who default to residential for everything are over-spending on low-friction targets.

| Target tier | Examples | Recommended pool | Sticky window |
|---|---|---|---|
| Tier 1 — low anti-bot | Public docs, news sites, search results pages, government open data, Wikipedia | Datacenter or residential-lite | Short or per-request |
| Tier 2 — Cloudflare-protected SaaS, B2B | Most SaaS dashboards, B2B portals, mid-tier e-commerce, classifieds | Residential | 5-10 min |
| Tier 3 — high-friction, high-value | Banks, ticketing, sneaker sites, social platforms, premium streaming, account-heavy workflows | ISP static dedicated, or 4G mobile | Full session lifetime |

Tier 1 is where agent builders waste money. If your agent is researching public information from sites with no anti-bot, a datacenter proxy is fine and an order of magnitude cheaper. The only reason to spend up is geo coverage datacenter doesn’t have.

Tier 2 is the bulk of agent traffic in our queue. Most B2B SaaS uses Cloudflare Bot Management at moderate sensitivity. Residential sticky for 5-10 minutes is the standard answer. Budget pools fail here even when premium pools succeed — we covered why in the anti-bot landscape post: cheap residential recycles IPs fast and the recent abuse history sticks.

Tier 3 is where agents succeed expensively or fail cheaply. Account-heavy flows — anything logged in, anything with a fraud team — punish IP changes brutally. ISP static dedicated is the go-to. You lease a specific IP for the month, the site sees your account always coming from the same address, and you don’t trip “unusual login location” alerts. For sneaker drops or ticketing the pattern is mobile sticky, covered in detail in the sneaker proxy guide.

Wiring patterns

Three configurations we see customers actually deploy, in rough form.

browser-use + residential sticky

browser-use takes a BrowserConfig that accepts a Playwright-style proxy dict:

# Sticky session: the "-session-task42" suffix pins one exit IP for the task
proxy_user = "your_account-session-task42"
proxy = {
    "server": "http://gate.hellworld.io:7777",  # residential gateway
    "username": proxy_user,
    "password": "your_password",
}

The session token in the username (-session-task42) holds the sticky window. Pick the token per task, hold it for the task duration, drop it when the task ends. The agent’s planner doesn’t need to know about the proxy — it sees the page Playwright loads, which is the proxy-routed page. Combine with a Chrome UA and a tuned viewport and you have a working tier-2 stack.

Anthropic Computer Use + ISP dedicated

Computer Use runs in a container with a virtual display. Outbound traffic goes through whatever HTTP/HTTPS proxy is configured at the OS level, typically via HTTP_PROXY and HTTPS_PROXY env vars passed in. For account-heavy workflows — an agent managing a SaaS dashboard, a banking flow — point those env vars at an ISP static proxy you’ve leased for the month.
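In container terms that's a couple of environment variables at launch. A sketch, assuming a Docker-based Computer Use deployment; the hostname, port, and credentials are illustrative placeholders for your leased ISP static endpoint:

```shell
# Route the Computer Use container through a leased ISP static IP.
docker run \
  -e HTTP_PROXY="http://user:pass@isp-203-0-113-7.hellworld.io:8080" \
  -e HTTPS_PROXY="http://user:pass@isp-203-0-113-7.hellworld.io:8080" \
  -e NO_PROXY="localhost,127.0.0.1" \
  your-computer-use-image
```

`NO_PROXY` keeps loopback traffic (the virtual display, local tooling) from being routed through the proxy.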

The advantage over residential is twofold. The IP doesn’t rotate, so the session is stable for as long as the agent runs. And the same IP across multiple runs against the same target builds positive reputation rather than starting cold every time. After a few weeks of legitimate traffic from one ISP IP into a SaaS account, the anti-bot stack has learned that IP plus that account is a normal pair.

MCP server with per-task sticky residential

MCP (Model Context Protocol) is the Anthropic standard for tool servers — agents call tools like fetch_url or search_web exposed by an MCP server. When those tools make outbound HTTP requests, the MCP server holds the proxy config; the agent never sees it.

The pattern: the server takes a task_id from the agent, derives a session token, and routes fetches through username-session-${task_id} on residential. Each task gets its own sticky exit; when the task finishes, the session drops. This is the cleanest separation we see in production stacks — the agent plans, the MCP layer handles network identity, and per-task isolation limits cross-contamination if one IP gets flagged.
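The session derivation inside the tool server is a few lines. A sketch in requests-style form (the gateway host is illustrative; `task_proxies` is a hypothetical helper, not part of the MCP spec):

```python
def task_proxies(account: str, password: str, task_id: str) -> dict:
    """Derive the per-task sticky proxy config an MCP tool server
    would route its outbound fetches through."""
    proxy_url = (
        f"http://{account}-session-{task_id}:{password}"
        f"@gate.hellworld.io:7777"
    )
    return {"http": proxy_url, "https": proxy_url}

# Inside a fetch_url tool handler (requests-style):
# resp = requests.get(url, proxies=task_proxies(ACCOUNT, PASSWORD, task_id))
```

The agent only ever passes `task_id`; credentials and network identity never appear in the model's context.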

Geo matters more than agent builders think

A surprising number of agent builders forget that “research current prices in Germany” requires an exit IP in Germany. They run the agent from a US harness on a US residential IP, the agent searches Google, Google returns google.com results in English with US prices, and the agent dutifully reports those as German prices. The agent doesn’t know any better. The user gets bad output and blames the model.

Hell World residential covers 210 countries with country, state, and city-level targeting. The geo selector goes in the username — username-country-de-session-task42 for a German residential exit. For agents doing localized research, build geo-awareness into the planner: when the task says “in Germany,” the tool call specifies the geo. The 14-brand geo coverage breakdown is in the comparison page if you need to know what we tested.
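Composing the selectors programmatically keeps the planner's geo decision out of string-formatting code scattered through the agent. A sketch following the username syntax shown above (verify the exact selector format against your dashboard):

```python
from typing import Optional

def proxy_username(account: str, country: Optional[str] = None,
                   session: Optional[str] = None) -> str:
    """Compose a username with optional geo and session selectors,
    e.g. 'acct-country-de-session-task42'."""
    parts = [account]
    if country:
        parts += ["country", country.lower()]
    if session:
        parts += ["session", session]
    return "-".join(parts)
```

When the task says "in Germany," the planner passes `country="de"` and the tool layer does the rest.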

The cost math

A typical agent task burns 50-300 MB of bandwidth per hour, depending on whether it’s screenshot-heavy (Computer Use) or DOM-heavy (browser-use). At our residential rate of $0.23/GB, an hour of agent work costs single-digit cents. That’s negligible against LLM inference cost.

The number that ramps fast is retries. An agent that hits a CAPTCHA, fails, replans, retries, fails again can easily 5-10x its bandwidth on a single task. Challenge interstitials load images and JS, all of which gets billed. The cost optimization that matters is avoid retries — pick the right pool the first time, hold sticky sessions correctly, tune the agent’s stealth so it doesn’t trip the challenge. Saving $0.05/GB on a budget pool is dwarfed by losing 30% of tasks to retries.
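The arithmetic above is worth making concrete, because the retry multiplier dominates. A quick worked sketch at the $0.23/GB rate quoted above:

```python
RATE_PER_GB = 0.23  # residential rate from the text, $/GB

def task_cost(mb_per_hour: float, hours: float = 1.0,
              retry_multiplier: float = 1.0) -> float:
    """Bandwidth cost of an agent task, in dollars."""
    return mb_per_hour / 1024 * hours * RATE_PER_GB * retry_multiplier

clean = task_cost(300)                        # screenshot-heavy hour, ~$0.07
flaky = task_cost(300, retry_multiplier=8)    # CAPTCHA retry loop, ~$0.54
```

A clean hour costs pennies; a retry loop turns the same task into tens of cents, which is why pool quality beats per-GB price.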

What to monitor

The metrics that catch problems early, in priority order:

Per-target success rate. Bucket tasks by destination domain. A drop in one bucket means that target tightened or your pool’s IPs hit its deny list. Localized signal beats overall success rate every time.

Session lifetime before flag. How many requests does a sticky session handle before getting challenged? If this drops over time, your pool is degrading or your behavior is leaking. If it varies by target, you have target-specific tuning to do.

Exit IP class composition. Sample exits and check ASN distribution. If you bought residential and 20% of exits show datacenter ASNs, your pool is mixing classes — talk to your provider.
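The first metric above is simple to wire in. A minimal per-domain bucketing sketch (the class and method names are illustrative, not from any particular monitoring library):

```python
from collections import defaultdict
from urllib.parse import urlsplit

class TargetStats:
    """Bucket task outcomes by destination domain, so a single target
    tightening its anti-bot shows up even when the overall rate looks fine."""
    def __init__(self):
        self.ok = defaultdict(int)
        self.total = defaultdict(int)

    def record(self, url: str, success: bool):
        domain = urlsplit(url).netloc
        self.total[domain] += 1
        self.ok[domain] += success

    def success_rate(self, domain: str) -> float:
        return self.ok[domain] / self.total[domain] if self.total[domain] else 0.0
```

Alert on per-domain drops, not the blended number, and you catch a tightened target before it eats your retry budget.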

What proxies cannot fix

We sell proxies, so we’ll be direct: no proxy fixes a poorly-engineered agent. If the TLS fingerprint is wrong, the IP doesn’t matter. If timing is robotic, the IP doesn’t matter. If the viewport is the framework default, the IP doesn’t matter. Stealth and IP reputation are independent layers, and both have to be right.

We’ve seen budget pools work for tuned agents and premium pools fail for sloppy ones. Spend the first iteration on the agent — get the user-agent, viewport, TLS fingerprint, and timing right. Then add proxies. Order matters because debugging is much faster when you know the agent itself is clean.

Once that’s true, the proxy layer becomes a meaningful lever. Tier 1 work moves to datacenter and saves money. Tier 2 picks the right residential brand and stops failing on Cloudflare. Tier 3 goes to ISP or mobile and stops getting flagged on logged-in flows. The comparison page is the starting point for picking a pool; the residential, ISP, mobile, and datacenter pages have the specifics. Pick by tier, configure sticky correctly, geo-target where the task demands it, monitor production. That’s the whole stack.

One wallet, the full Hell World lineup

14 residential brands, 3 4G mobile pools, Static ISP, and 2 unlimited tiers. Top up $5, route some traffic, form your own opinion. Bandwidth never expires.

What our customers say

5.0/5 · 34 verified reviews, sourced from Discord #feedback

★★★★★ Awesome support and great product
Was having issues setting up proxies from a couple pools. Support responded quickly and was very helpful. Everything running smoothly in no time.
— terdleman

★★★★★ GEOFAST AFFORDABLE AND RELIABLE RESIS
Best resis for brokies. Don't sleep !!!! AFFORDABLE - .36/GB RELIABLE - ATLEAST 1 CHECKOUT PER DROP
— Wucooking

★★★★★ BEST PROXIES
HELLWORLD HAS ALL THE PROXIES STOP GETTING SCAMMED BY RESELLERS BUY HELLWORLD
— ryanskickz

★★★★★ Great service and helpful staff
— DaBoiiEffy

★★★★★ paypal issue resolved quickly
Had a dashboard error and 0ms took care of it quickly and has great customer service
— Coye

★★★★★ Great customer service
had paypal issue. was fixed fast with a friendly manner!
— Titanic