Scraping agents

Your scraper works.
Your bill doesn't.

Apify retries, SerpAPI 429 storms, Firecrawl pagination loops — ProceedGate stops them before they drain your quota.

loop blocked — 429
// Apify actor: same URL retried 11 times in 60s
{
  "decision": "block",
  "reason": "loop_detected",
  "zone": "storm",
  "iteration_count": 11,
  "human_reason": "Same URL retried 11× in 60s. Stopping.",
  "retry_after": 60
}

The problem

Three patterns that burn budgets

Scraping agents fail in predictable ways. The common thread: a transient error triggers a retry that triggers another retry.

Apify

Actor retry loop

A 403 on page load triggers actor.retry(). The next attempt hits the same 403. Default config retries 8 times per request — across 100 URLs that's 800 wasted calls.

SerpAPI

429 storm

Rate limit hit. Agent implements exponential backoff but doesn't track the total number of attempts. After 20 minutes it's still retrying the same query — each attempt counting against your monthly quota.
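The missing piece in that storm is a hard cap on total attempts. A minimal sketch, assuming a generic retry wrapper (all names here are illustrative, not part of SerpAPI or ProceedGate):

```typescript
// Exponential backoff with a hard cap on total attempts, so a persistent
// 429 can never burn more than maxAttempts calls against the quota.
async function fetchWithCap<T>(
  attempt: () => Promise<T>,
  isRateLimited: (err: unknown) => boolean,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms)),
): Promise<T> {
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      // Give up immediately on non-rate-limit errors, and on the final attempt.
      if (!isRateLimited(err) || i === maxAttempts - 1) throw err;
      await sleep(2 ** i * 1000); // 1s, 2s, 4s, ...
    }
  }
  throw new Error('unreachable');
}
```

Without the cap, backoff only slows the bleed; with it, a query that keeps returning 429 costs a bounded number of calls.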

Firecrawl

Pagination loop

next_page keeps returning the same URL. Agent treats it as new content and scrapes indefinitely. No built-in deduplication means it runs until the budget or the timeout hits.
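The missing deduplication can be sketched client-side. The generator below is illustrative (the `nextOf` callback stands in for whatever returns `next_page` in your stack; it is not a Firecrawl API) and stops the moment pagination points at a URL it has already visited:

```typescript
// Walk a pagination chain, refusing to revisit any URL.
// Terminates on the first repeat instead of scraping indefinitely.
function* followPages(
  firstUrl: string,
  nextOf: (url: string) => string | null, // maps a page URL to its next_page
): Generator<string> {
  const seen = new Set<string>();
  let url: string | null = firstUrl;
  while (url && !seen.has(url)) {
    seen.add(url);
    yield url;
    url = nextOf(url);
  }
}
```

If `next_page` keeps returning the same URL, the loop yields it once and exits.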

How ProceedGate stops it

One check before every request

Before your scraper fetches a URL, it calls /check with the action and a hash of the target. ProceedGate tracks how many times that exact hash has been seen in the last 60 seconds.

At ≤5 repeats it's normal operation — allow. At 6–10 it's a gray zone — allowed, but flagged as suspicious. At >10 it's a storm — hard block, 429, retry-after 60.

The whole check runs at the Cloudflare edge in under 50ms. Your scraper doesn't slow down. It just can't loop forever.
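The counting described above can be modeled as a sliding-window counter. This is an illustrative sketch of the behavior, under the thresholds stated here (≤5 safe, 6–10 gray, >10 storm), not ProceedGate's actual edge implementation:

```typescript
type Zone = 'safe' | 'gray' | 'storm';

// Counts how often a task hash has been seen inside a sliding window
// and maps the count to a zone.
class LoopDetector {
  private seen = new Map<string, number[]>(); // taskHash -> timestamps (ms)

  constructor(private windowMs = 60_000) {}

  check(taskHash: string, now = Date.now()): { zone: Zone; count: number } {
    const cutoff = now - this.windowMs;
    // Drop sightings older than the window, then record this one.
    const hits = (this.seen.get(taskHash) ?? []).filter(t => t > cutoff);
    hits.push(now);
    this.seen.set(taskHash, hits);
    const count = hits.length;
    const zone: Zone = count <= 5 ? 'safe' : count <= 10 ? 'gray' : 'storm';
    return { zone, count };
  }
}
```

The eleventh sighting of the same hash inside 60 seconds crosses the storm threshold, which is exactly the blocked response shown below.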

first 5 requests — allowed
{
  "decision": "allow",
  "proceed_token": "eyJhbGci...",
  "zone": "safe",
  "iteration_count": 3,
  "credits_remaining": 1994
}

// ... iteration 11
{
  "decision": "block",
  "reason": "loop_detected",
  "iteration_count": 11,
  "retry_after": 60
}

Integration

Three lines in your scraper loop

Works with any scraping framework. The check throws on block — no extra error handling needed if you already handle exceptions.

apify-actor.ts
import { createHash } from 'node:crypto';

const PG_KEY = process.env.PG_KEY;

const sha256 = (s: string) =>
  createHash('sha256').update(s).digest('hex');

async function pgCheck(agentId: string, taskHash: string) {
  const res = await fetch('https://governor.proceedgate.dev/v1/check', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${PG_KEY}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      agent_id:  agentId,
      task_hash: taskHash,
      action:    'tool_call',
    }),
  });
  if (res.status === 429) throw new Error('loop_detected');
  return res.json(); // { decision, zone, proceed_token, ... }
}

// inside your actor loop:
for (const url of urls) {
  await pgCheck('apify-scraper', sha256(url)); // throws on block
  const html = await (await fetch(url)).text();
}
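When a check does come back blocked, the response carries `retry_after`. A minimal sketch of honoring it, assuming you'd rather pause once than crash the whole run (the wrapper and its parameter names are illustrative, not a ProceedGate SDK):

```typescript
type CheckResult = { decision: 'allow' | 'block'; retry_after?: number };

// On block, wait out the advertised retry_after once, then re-check.
// If the second check still blocks, surface that result to the caller.
async function withRetryAfter(
  check: () => Promise<CheckResult>,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms)),
): Promise<CheckResult> {
  const first = await check();
  if (first.decision !== 'block' || first.retry_after === undefined) return first;
  await sleep(first.retry_after * 1000);
  return check();
}
```

One deliberate choice: the wrapper retries exactly once. If the loop is real, the second check blocks again and your scraper stops instead of turning the governor itself into a retry target.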

Works with

Any scraping stack

ProceedGate is framework-agnostic. If it can make an HTTP request, it can call /check.

🕷️ Apify
🔍 SerpAPI
🔥 Firecrawl
🦀 Crawlee
🎭 Playwright
💡 BrightData

Comparison

vs generic rate limiting

Rate limiting counts requests per IP or per second. ProceedGate tracks behavioral patterns — the same action repeated against the same target.

Capability                                 Rate limiting   ProceedGate
Limits requests per second                 ✓               ✓
Detects same URL retried 11× in 60s        ✗               ✓
AI evaluation for ambiguous patterns       ✗               ✓
Per-agent budget cap                       ✗               ✓
Signed proceed token as proof              ✗               ✓
Per-agent reputation & identity tracking   ✗               ✓
Webhook alert on loop detection            ✗               ✓
Works across distributed scrapers          ✗               ✓
On-chain immutable audit trail (BSC)       ✗               ✓

Start with 5,000 free checks

No card required. Drop three lines into your scraper and see the first blocked loop in your dashboard.

Get your API key →