Scraping agents

Your scraper works.
Your bill doesn't.

Apify retries, SerpAPI 429 storms, Firecrawl pagination loops — ProceedGate stops them before they drain your quota.

loop blocked — 429
// Apify actor: same URL retried 11 times in 60s
{
  "decision": "block",
  "reason": "loop_detected",
  "zone": "storm",
  "iteration_count": 11,
  "human_reason": "Same URL retried 11× in 60s. Stopping.",
  "retry_after": 60
}

The problem

Three patterns that burn budgets

Scraping agents fail in predictable ways. The common thread: a transient error triggers a retry that triggers another retry.

Apify

Actor retry loop

A 403 on page load triggers actor.retry(). The next attempt hits the same 403. Default config retries 8 times per request — across 100 URLs that's 800 wasted calls.

SerpAPI

429 storm

Rate limit hit. Agent implements exponential backoff but doesn't track the total number of attempts. After 20 minutes it's still retrying the same query — each attempt counting against your monthly quota.
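The missing piece in that storm is a hard cap on total attempts. A minimal sketch, assuming a generic retry wrapper (all names here are illustrative, not part of SerpAPI or ProceedGate):

```typescript
// Exponential backoff with a hard cap on total attempts, so a persistent
// 429 can never burn more than maxAttempts calls against the quota.
async function fetchWithCap<T>(
  attempt: () => Promise<T>,
  isRateLimited: (err: unknown) => boolean,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms)),
): Promise<T> {
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return await attempt();
    } catch (err) {
      // Give up immediately on non-rate-limit errors, and on the final attempt.
      if (!isRateLimited(err) || i === maxAttempts - 1) throw err;
      await sleep(2 ** i * 1000); // 1s, 2s, 4s, ...
    }
  }
  throw new Error('unreachable');
}
```

Without the cap, backoff only slows the bleed; with it, a query that keeps returning 429 costs a bounded number of calls.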

Firecrawl

Pagination loop

next_page keeps returning the same URL. Agent treats it as new content and scrapes indefinitely. No built-in deduplication means it runs until the budget or the timeout hits.
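The missing deduplication can be sketched client-side. The generator below is illustrative (the `nextOf` callback stands in for whatever returns `next_page` in your stack; it is not a Firecrawl API) and stops the moment pagination points at a URL it has already visited:

```typescript
// Walk a pagination chain, refusing to revisit any URL.
// Terminates on the first repeat instead of scraping indefinitely.
function* followPages(
  firstUrl: string,
  nextOf: (url: string) => string | null, // maps a page URL to its next_page
): Generator<string> {
  const seen = new Set<string>();
  let url: string | null = firstUrl;
  while (url && !seen.has(url)) {
    seen.add(url);
    yield url;
    url = nextOf(url);
  }
}
```

If `next_page` keeps returning the same URL, the loop yields it once and exits.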

How ProceedGate stops it

One check before every request

Before your scraper fetches a URL, it calls /check with the action and a hash of the target. ProceedGate tracks how many times that exact hash has been seen in the last 60 seconds.

At ≤5 repeats it's normal operation — allow. At 6–10 it's a gray zone — allowed, but flagged as suspicious. At >10 it's a storm — hard block, 429, retry-after 60.

The whole check runs at the Cloudflare edge in under 50ms. Your scraper doesn't slow down. It just can't loop forever.
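The counting described above can be modeled as a sliding-window counter. This is an illustrative sketch of the behavior, under the thresholds stated here (≤5 safe, 6–10 gray, >10 storm), not ProceedGate's actual edge implementation:

```typescript
type Zone = 'safe' | 'gray' | 'storm';

// Counts how often a task hash has been seen inside a sliding window
// and maps the count to a zone.
class LoopDetector {
  private seen = new Map<string, number[]>(); // taskHash -> timestamps (ms)

  constructor(private windowMs = 60_000) {}

  check(taskHash: string, now = Date.now()): { zone: Zone; count: number } {
    const cutoff = now - this.windowMs;
    // Drop sightings older than the window, then record this one.
    const hits = (this.seen.get(taskHash) ?? []).filter(t => t > cutoff);
    hits.push(now);
    this.seen.set(taskHash, hits);
    const count = hits.length;
    const zone: Zone = count <= 5 ? 'safe' : count <= 10 ? 'gray' : 'storm';
    return { zone, count };
  }
}
```

The eleventh sighting of the same hash inside 60 seconds crosses the storm threshold, which is exactly the blocked response shown below.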

first 5 requests — allowed
{
  "decision": "allow",
  "proceed_token": "eyJhbGci...",
  "zone": "safe",
  "iteration_count": 3,
  "credits_remaining": 1994
}

// ... iteration 11
{
  "decision": "block",
  "reason": "loop_detected",
  "iteration_count": 11,
  "retry_after": 60
}

Integration

Three lines in your scraper loop

Works with any scraping framework. The check throws on block — no extra error handling needed if you already handle exceptions.

apify-actor.ts
import { createHash } from 'node:crypto';

const PG_KEY = process.env.PG_KEY;

const sha256 = (s: string) =>
  createHash('sha256').update(s).digest('hex');

async function pgCheck(agentId: string, taskHash: string) {
  const res = await fetch('https://governor.proceedgate.dev/v1/check', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${PG_KEY}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      agent_id:  agentId,
      task_hash: taskHash,
      action:    'tool_call',
    }),
  });
  if (res.status === 429) throw new Error('loop_detected');
  return res.json(); // { decision, zone, proceed_token, ... }
}

// inside your actor loop:
for (const url of urls) {
  await pgCheck('apify-scraper', sha256(url)); // throws on block
  const html = await (await fetch(url)).text();
}
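When a check does come back blocked, the response carries `retry_after`. A minimal sketch of honoring it, assuming you'd rather pause once than crash the whole run (the wrapper and its parameter names are illustrative, not a ProceedGate SDK):

```typescript
type CheckResult = { decision: 'allow' | 'block'; retry_after?: number };

// On block, wait out the advertised retry_after once, then re-check.
// If the second check still blocks, surface that result to the caller.
async function withRetryAfter(
  check: () => Promise<CheckResult>,
  sleep: (ms: number) => Promise<void> = ms => new Promise(r => setTimeout(r, ms)),
): Promise<CheckResult> {
  const first = await check();
  if (first.decision !== 'block' || first.retry_after === undefined) return first;
  await sleep(first.retry_after * 1000);
  return check();
}
```

One deliberate choice: the wrapper retries exactly once. If the loop is real, the second check blocks again and your scraper stops instead of turning the governor itself into a retry target.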

Works with

Any scraping stack

ProceedGate is framework-agnostic. If it can make an HTTP request, it can call /check.

🕷️ Apify
🔍 SerpAPI
🔥 Firecrawl
🦀 Crawlee
🎭 Playwright
💡 BrightData

Comparison

vs generic rate limiting

Rate limiting counts requests per IP or per second. ProceedGate tracks behavioral patterns — the same action repeated against the same target.

Capability                                 Rate limiting   ProceedGate
Limits requests per second                 ✓               ✓
Detects same URL retried 11× in 60s        ✗               ✓
AI evaluation for ambiguous patterns       ✗               ✓
Per-agent budget cap                       ✗               ✓
Signed proceed token as proof              ✗               ✓
Per-agent reputation & identity tracking   ✗               ✓
Webhook alert on loop detection            ✗               ✓
Works across distributed scrapers          ✗               ✓
On-chain immutable audit trail (BSC)       ✗               ✓

Start with 5,000 free checks

No card required. Drop three lines into your scraper and see the first blocked loop in your dashboard.

Get your API key →