screenshot-x402

Screenshots for AI agents

Pay per request with x402. No API keys. No subscriptions.

Built on Cloudflare Browser Rendering, R2, and optional vision analysis (OpenRouter or OpenAI).

Integrate via MCP over HTTP — see the screenshot API reference (GET /docs). Network: base

Quick install (CLI)

The official screenshot-x402-cli uses the same MCP endpoint and x402 flow as agents, and writes PNG/JPEG files and reports to disk.

npm install -g screenshot-x402-cli

# Set up a wallet with a funded key (never commit it)
export X402_PRIVATE_KEY=0x...

# Tool 1: capture a screenshot
screenshot-x402 screenshot --page https://example.com

# Tool 2: analyze a screenshot
screenshot-x402 analyze --page https://example.com

Also: npm · GitHub · skill.md · discovery.json

Example flow: Agent → take_screenshot (https://example.com, paid via x402) → PNG

What it does

screenshot-x402 is a pay-per-request screenshot API for agents: real browser rendering, optional vision, and x402 settlement. Unlike many subscription screenshot API products, you pay per successful tool call; see pricing and the API reference.

  • Real browser rendering (Cloudflare Browser Rendering / Puppeteer), not a static fetch of HTML.
  • Full-page or viewport screenshots; PNG or JPEG; optional dark-mode emulation, HiDPI scale, hide elements by CSS.
  • Optional vision analysis on the captured JPEG with your prompt (OpenRouter or OpenAI on the server).
  • Cached delivery via R2 when you set cacheTtl, which suits repeat screenshot API calls on the same URL.

How it works

  1. The agent calls take_screenshot or analyze_screenshot on the screenshot-x402 MCP endpoint.
  2. Paid tools return HTTP 402 until the request carries x402 payment authorization; the wallet pays per request.
  3. The result returns the PNG or JPEG, plus vision-model text for analyze_screenshot.
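
The steps above can be sketched as a small retry loop. This is only an illustration of the pay-then-retry idea, not the service's client: ToolResponse, sendRequest, authorizePayment, and maxPriceUsd are hypothetical names, and the real x402 payment payload is richer than a single price. In practice a wrapper such as withX402Client is expected to handle this for you.

```typescript
// Hypothetical shapes for illustration only; not the real x402 wire format.
type ToolResponse =
  | { status: 402; priceUsd: number } // payment required
  | { status: 200; image: string; text?: string }; // paid result

type SendRequest = (paymentProof?: string) => Promise<ToolResponse>;
type AuthorizePayment = (priceUsd: number) => Promise<string>;

// Call a paid tool: if the first attempt returns 402, pay and retry once.
async function callPaidTool(
  sendRequest: SendRequest,
  authorizePayment: AuthorizePayment,
  maxPriceUsd: number,
): Promise<{ image: string; text?: string }> {
  const first = await sendRequest();
  if (first.status === 200) {
    return first; // free tool or already authorized
  }
  if (first.priceUsd > maxPriceUsd) {
    throw new Error(`Price ${first.priceUsd} exceeds limit ${maxPriceUsd}`);
  }
  const proof = await authorizePayment(first.priceUsd);
  const second = await sendRequest(proof);
  if (second.status !== 200) {
    throw new Error("Payment was not accepted");
  }
  return second;
}
```

The price limit mirrors the role of maxPaymentValue in the agent integration example: the client refuses to pay more than it was configured for.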

Pricing

Per-request list prices are below, with no monthly minimum. Unlike subscription screenshot API plans, cost tracks your actual x402 screenshot usage. The same numbers appear in discovery.json. Request parameters are listed under each tool and in the full API reference.

health

Free

Smoke test the Worker, MCP, and x402 network config — no payment.

  • Returns JSON with ok, service name, and x402 network.

Request parameters

Pass an empty JSON object: {}. No fields.

Return data

Path             Type    Notes
content[0].type  "text"  MCP content block discriminator
content[0].text  string  JSON string: { ok: boolean, name: string, x402Network: string }
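
Because content[0].text is a JSON string rather than an object, a caller has to parse it before checking ok. A minimal sketch, assuming the result has exactly the shape in the table above; ContentBlock, HealthStatus, and parseHealth are illustrative names, not part of the service.

```typescript
// Fields documented in the health "Return data" table.
interface HealthStatus {
  ok: boolean;
  name: string;
  x402Network: string;
}

// Loose content block type (illustrative).
interface ContentBlock {
  type: string;
  text?: string;
}

// Parse content[0].text from a health result and check the documented fields.
function parseHealth(content: ContentBlock[]): HealthStatus {
  const first = content[0];
  if (!first || first.type !== "text" || typeof first.text !== "string") {
    throw new Error("Expected a text content block at content[0]");
  }
  const parsed: unknown = JSON.parse(first.text);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("health payload is not a JSON object");
  }
  const { ok, name, x402Network } = parsed as Record<string, unknown>;
  if (typeof ok !== "boolean" || typeof name !== "string" || typeof x402Network !== "string") {
    throw new Error("health payload is missing ok, name, or x402Network");
  }
  return { ok, name, x402Network };
}
```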

take_screenshot

$0.01

Render a real browser screenshot (PNG or JPEG) of any public https URL with a configurable viewport.

  • Viewport width/height; full-page capture; PNG or JPEG.
  • Optional R2 cache (cacheTtl) for repeat URLs.
  • colorScheme (light / dark / no-preference), deviceScaleFactor 1–3 (Retina-style sharpness).
  • hideSelectors: hide elements by CSS selector before capture.
  • delay (ms) after load for late-rendered UI.

Request parameters

Field              Type                                Default          Notes
url                string (URL)                        (none)           Must be absolute https://…
width              number                              1920             Viewport width, 100–3840
height             number                              1080             Viewport height, 100–2160
fullPage           boolean                             false            Full scrollable page
delay              number                              0                Extra wait after load (ms), max 30000
cacheTtl           number                              86400            Seconds to treat R2 entry as fresh; 0 skips cache reads
format             "png" | "jpeg"                      "png"            Output image type
colorScheme        "light" | "dark" | "no-preference"  "no-preference"  Emulates prefers-color-scheme before navigation
deviceScaleFactor  number                              1                Viewport pixel ratio, 1–3 (Retina / sharpness)
hideSelectors      string[]                            []               Up to 40 CSS selectors (each ≤500 chars); hidden after load
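
Checking arguments against these limits before calling the tool avoids sending a request that is likely to be rejected. The sketch below copies the ranges from the table; the function and type names are illustrative, the lower bound of 0 for delay is an assumption, and the server's own validation may differ.

```typescript
// Request fields copied from the take_screenshot parameter table.
interface TakeScreenshotArgs {
  url: string;
  width?: number;
  height?: number;
  fullPage?: boolean;
  delay?: number;
  cacheTtl?: number;
  format?: "png" | "jpeg";
  colorScheme?: "light" | "dark" | "no-preference";
  deviceScaleFactor?: number;
  hideSelectors?: string[];
}

// Omitted (undefined) values pass, since the server applies its default.
function inRange(value: number | undefined, min: number, max: number): boolean {
  return value === undefined || (value >= min && value <= max);
}

// Return a list of problems; an empty list means the arguments look valid.
function validateTakeScreenshot(args: TakeScreenshotArgs): string[] {
  const errors: string[] = [];
  if (!args.url.startsWith("https://")) errors.push("url must be absolute https://");
  if (!inRange(args.width, 100, 3840)) errors.push("width must be 100-3840");
  if (!inRange(args.height, 100, 2160)) errors.push("height must be 100-2160");
  // Minimum of 0 is assumed; the table only states the maximum.
  if (!inRange(args.delay, 0, 30000)) errors.push("delay must be 0-30000 ms");
  if (!inRange(args.deviceScaleFactor, 1, 3)) errors.push("deviceScaleFactor must be 1-3");
  const selectors = args.hideSelectors ?? [];
  if (selectors.length > 40) errors.push("hideSelectors allows at most 40 entries");
  if (selectors.some((s) => s.length > 500)) errors.push("each selector must be at most 500 chars");
  return errors;
}
```

For example, validateTakeScreenshot({ url: "https://example.com", fullPage: true }) returns an empty list, so the arguments can be passed to take_screenshot as they are.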

Return data

Path                 Type                        Notes
content[0].type      "image"                     MCP image content block
content[0].data      string                      Base64-encoded PNG or JPEG bytes
content[0].mimeType  "image/png" | "image/jpeg"  Matches requested format
_meta.cached         boolean | undefined         true when served from R2 cache
_meta.renderTimeMs   number | undefined          Browser capture time when not cached
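
The image arrives as base64 text, so it has to be decoded before it can be written to disk or passed to an image library. A minimal sketch, assuming a result shaped like the table above; ImageBlock, DecodedImage, and decodeImageBlock are illustrative names.

```typescript
// Field names follow the take_screenshot "Return data" table.
interface ImageBlock {
  type: "image";
  data: string; // base64-encoded image bytes
  mimeType: "image/png" | "image/jpeg";
}

interface DecodedImage {
  bytes: Uint8Array;
  extension: "png" | "jpg";
}

// Decode content[0] into raw bytes and pick a file extension from mimeType.
function decodeImageBlock(block: ImageBlock): DecodedImage {
  const binary = atob(block.data); // base64 -> binary string, one char per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { bytes, extension: block.mimeType === "image/png" ? "png" : "jpg" };
}
```

In Node, Buffer.from(block.data, "base64") produces the same bytes and can be handed to fs.writeFile directly.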

analyze_screenshot

$0.03

Same capture pipeline as take_screenshot (JPEG), then multimodal vision — describe layout, text, or follow your prompt.

  • Same render options as take_screenshot (viewport, full page, colorScheme, deviceScaleFactor, hideSelectors).
  • Custom prompt; returns model text plus the JPEG image in the MCP result.

Request parameters

Field              Type                                Default          Notes
url                string (URL)                        (none)           Page to capture
prompt             string                              (short default)  Instruction for the vision model
width              number                              1920             Viewport width
height             number                              1080             Viewport height
fullPage           boolean                             false            Full page capture
colorScheme        "light" | "dark" | "no-preference"  "no-preference"  Same as take_screenshot
deviceScaleFactor  number                              1                Same as take_screenshot (1–3)
hideSelectors      string[]                            []               Same as take_screenshot

Return data

Path                 Type                Notes
content[0].type      "image"             JPEG screenshot (capture is always JPEG for vision)
content[0].data      string              Base64-encoded JPEG bytes
content[0].mimeType  "image/jpeg"        Fixed for this tool
content[1].type      "text"              Vision model output
content[1].text      string              Model answer to your prompt
_meta.renderTimeMs   number | undefined  Browser capture time
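
Since analyze_screenshot returns two content blocks, callers usually want to separate the model's answer from the image. A minimal sketch, assuming the block order and fields in the table above; AnalyzeBlock and splitAnalyzeResult are illustrative names. It looks blocks up by type rather than by index, so it still works if the order ever changes.

```typescript
// Content blocks documented for analyze_screenshot.
type AnalyzeBlock =
  | { type: "image"; data: string; mimeType: string }
  | { type: "text"; text: string };

// Pull the JPEG (base64) and the vision model's answer out of the result content.
function splitAnalyzeResult(content: AnalyzeBlock[]): { jpegBase64: string; answer: string } {
  const image = content.find((block) => block.type === "image");
  const text = content.find((block) => block.type === "text");
  if (!image || image.type !== "image") {
    throw new Error("Result has no image block");
  }
  if (!text || text.type !== "text") {
    throw new Error("Result has no text block");
  }
  return { jpegBase64: image.data, answer: text.text };
}
```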

FAQ

What is x402?

x402 is an open payment protocol for HTTP: paid tools return 402 Payment Required until the client satisfies a machine-readable payment requirement (for example, a stablecoin payment). screenshot-x402 uses it for pay-per-request billing.

Do I need an API key?

Callers do not use a screenshot-x402 API key. You pay with a wallet via x402. You still configure your agent (e.g. AGENT_PRIVATE_KEY) — see integration docs.

Is MCP required?

The supported integration is MCP over Streamable HTTP at https://screenshotx402.com/mcp — the primary path for an MCP screenshot tool workflow. You connect with an MCP client to that URL.

How is this different from ScreenshotOne?

ScreenshotOne and similar services are typically API-key based, with subscription or credit billing. screenshot-x402 is a pay-per-request screenshot API for agents using x402: there is no per-caller API key, and pricing is per successful tool invocation.

What formats are supported?

take_screenshot returns PNG or JPEG (see format in the API reference). analyze_screenshot uses JPEG for the vision step.

Code

Agent integration
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";
import { withX402Client } from "agents/x402";
import { privateKeyToAccount } from "viem/accounts";

// Connect to the screenshot-x402 MCP endpoint over Streamable HTTP.
const transport = new StreamableHTTPClientTransport(new URL("https://screenshotx402.com/mcp"));
const mcp = new Client({ name: "agent", version: "1.0.0" });

// Wallet that pays for tool calls; read the key from the environment, never from source.
const account = privateKeyToAccount(process.env.AGENT_PRIVATE_KEY as `0x${string}`);

// Wrap the MCP client so paid tools are settled through x402.
const paidClient = withX402Client(mcp, {
  account,
  network: "base",
  // Upper bound per payment; likely USDC atomic units (6 decimals), i.e. 0.10 USDC.
  maxPaymentValue: 100_000n,
});

await paidClient.connect(transport);

// Call the paid tool; the x402 wrapper handles the payment step.
const result = await paidClient.callTool(null, {
  name: "take_screenshot",
  arguments: { url: "https://example.com", fullPage: false },
});

TypeScript: withX402Client from agents/x402 with a viem wallet (USDC on base); set AGENT_PRIVATE_KEY.
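
To relate the dollar prices to a payment cap such as maxPaymentValue, it helps to convert them to integer units. The sketch below assumes the cap is expressed in USDC atomic units; USDC uses 6 decimals, so $0.01 is 10000 units. usdToAtomic, listPrices, and fitsUnderCap are illustrative helpers, not part of agents/x402.

```typescript
const USDC_DECIMALS = 6; // USDC uses 6 decimal places

// Convert a dollar price string such as "0.03" to atomic units without floating point.
function usdToAtomic(price: string): bigint {
  const [whole, fraction = ""] = price.split(".");
  if (fraction.length > USDC_DECIMALS) {
    throw new Error(`More than ${USDC_DECIMALS} decimal places: ${price}`);
  }
  return BigInt(whole + fraction.padEnd(USDC_DECIMALS, "0"));
}

// List prices from the Pricing section.
const listPrices = {
  take_screenshot: usdToAtomic("0.01"),
  analyze_screenshot: usdToAtomic("0.03"),
};

// True when a single call to the tool fits under the configured cap.
function fitsUnderCap(tool: keyof typeof listPrices, maxPaymentValue: bigint): boolean {
  return listPrices[tool] <= maxPaymentValue;
}
```

Under that assumption, a cap of 100000 units (0.10 USDC) covers a single call to either tool, while a cap of 20000 units would allow take_screenshot but not analyze_screenshot.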

Other languages: MCP Streamable HTTP client + attach x402 payment headers when paid tools return HTTP 402 — see integration docs.

MCP endpoint
https://screenshotx402.com/mcp
Streamable HTTP. Tools are listed on connect. Paid tools return HTTP 402 until payment is satisfied, following the x402 flow. The health tool is free.

Discovery
https://screenshotx402.com/discovery.json
JSON document listing tools, prices, and links to skill.md, docs, and pricing.
