
How unsurf works

Every SPA is an API client. Behind every web form, dashboard, and data table are HTTP calls to a backend. But these APIs are undocumented, not exposed for direct access, and invisible to agents.

Agents today interact with websites by launching browsers, finding DOM elements, clicking buttons, and scraping text. This is slow (10-45 seconds per action), fragile (breaks when CSS changes), and expensive (full browser per operation).

If you watch the network traffic while a browser uses a site, you see the real API. Every fetch() call, every XHR request — that is the typed, structured interface the frontend already uses. The UI is just a wrapper.

unsurf captures that traffic and turns it into a typed API definition.

The scout launches a headless browser via Cloudflare Browser Rendering, navigates to a target URL, and enables Chrome DevTools Protocol (CDP) network capture.

Every network request is intercepted:

  • Network.requestWillBeSent — captures method, URL, headers, body
  • Network.responseReceived + Network.getResponseBody — captures status, headers, response body
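Conceptually, the capture step pairs those two events by requestId. The event shapes below are trimmed-down assumptions (real CDP payloads carry far more fields), not unsurf's actual types:

```typescript
// Simplified shapes of the two CDP events (real payloads are much richer).
interface RequestWillBeSent {
	requestId: string;
	request: { method: string; url: string; headers: Record<string, string>; postData?: string };
}
interface ResponseReceived {
	requestId: string;
	response: { status: number; headers: Record<string, string> };
}

export interface CapturedCall {
	method: string;
	url: string;
	status: number;
	requestHeaders: Record<string, string>;
	responseHeaders: Record<string, string>;
	body?: string;
}

// Pair the two events by requestId as they arrive, emitting a complete
// record once both halves of a call have been seen.
export function makePairer() {
	const pending = new Map<string, RequestWillBeSent>();
	const completed: CapturedCall[] = [];

	return {
		onRequest(e: RequestWillBeSent) {
			pending.set(e.requestId, e);
		},
		onResponse(e: ResponseReceived, body?: string) {
			const req = pending.get(e.requestId);
			if (!req) return; // response for a request we never saw
			pending.delete(e.requestId);
			completed.push({
				method: req.request.method,
				url: req.request.url,
				status: e.response.status,
				requestHeaders: req.request.headers,
				responseHeaders: e.response.headers,
				body,
			});
		},
		completed,
	};
}
```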

The scout then:

  1. Filters to XHR/fetch requests (ignores images, CSS, scripts)
  2. Normalizes URL patterns (/users/123 → /users/:id)
  3. Groups requests by normalized pattern
  4. Infers a JSON Schema from response bodies per group
  5. Generates an OpenAPI 3.1 specification from all captured endpoints
  6. Saves endpoints to D1 (via Drizzle), HAR logs and screenshots to R2
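Steps 2 through 4 can be sketched as pure functions. Everything here is illustrative: the :id placeholder, the grouping key, and the single-sample schema inference are assumptions, not unsurf's real implementation:

```typescript
// Collapse concrete path segments (numeric ids, UUIDs) into placeholders
// so /users/123 and /users/456 group under the same /users/:id pattern.
const UUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

export function normalizePath(path: string): string {
	return path
		.split("/")
		.map((seg) => (/^\d+$/.test(seg) || UUID.test(seg) ? ":id" : seg))
		.join("/");
}

// Infer a minimal JSON Schema fragment from one response body. A real
// inferrer would merge schemas across every sample in a group.
export function inferSchema(value: unknown): Record<string, unknown> {
	if (value === null) return { type: "null" };
	if (Array.isArray(value)) {
		return { type: "array", items: value.length ? inferSchema(value[0]) : {} };
	}
	switch (typeof value) {
		case "string": return { type: "string" };
		case "number": return { type: "number" };
		case "boolean": return { type: "boolean" };
		case "object": {
			const properties: Record<string, unknown> = {};
			for (const [k, v] of Object.entries(value as object)) properties[k] = inferSchema(v);
			return { type: "object", properties };
		}
		default: return {};
	}
}

// Group captured calls under a "METHOD normalized-path" key.
export function groupCalls(calls: { method: string; url: string }[]) {
	const groups = new Map<string, { method: string; url: string }[]>();
	for (const c of calls) {
		const key = `${c.method} ${normalizePath(new URL(c.url).pathname)}`;
		const bucket = groups.get(key) ?? [];
		bucket.push(c);
		groups.set(key, bucket);
	}
	return groups;
}
```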

The worker executes a previously scouted path. It has two strategies:

  • Fast path: If the scout captured a direct API endpoint, the worker replays the HTTP call using Effect’s HttpClient. No browser. Milliseconds.
  • Slow path: If the form requires JavaScript execution, the worker launches a browser and steps through the saved navigation path.

The worker always validates the response against the stored schema.
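A minimal structural check against a stored schema fragment might look like the following. It is a hand-rolled stand-in for a real JSON Schema validator, and it treats every listed property as required, which a full validator would not:

```typescript
// Tiny structural check against a stored schema fragment. Not a full
// JSON Schema implementation: all declared properties are required.
type SchemaNode = {
	type?: string;
	properties?: Record<string, SchemaNode>;
	items?: SchemaNode;
};

export function matchesSchema(value: unknown, schema: SchemaNode): boolean {
	if (schema.type === undefined) return true;
	if (schema.type === "null") return value === null;
	if (schema.type === "array") {
		return Array.isArray(value) &&
			(schema.items === undefined || value.every((v) => matchesSchema(v, schema.items!)));
	}
	if (schema.type === "object") {
		if (typeof value !== "object" || value === null || Array.isArray(value)) return false;
		for (const [k, sub] of Object.entries(schema.properties ?? {})) {
			if (!matchesSchema((value as Record<string, unknown>)[k], sub)) return false;
		}
		return true;
	}
	return typeof value === schema.type; // string | number | boolean
}
```

A failed check is what distinguishes "the site changed its response shape" from a transient error, which is exactly the signal the heal system needs.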

Websites change. When the worker fails:

  1. Retry with exponential backoff (handles transient failures)
  2. If retries exhaust, re-scout the same URL with the same task
  3. Diff old endpoints against new ones
  4. Update the stored path
  5. Re-execute with the patched path
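Stripped of Effect's Schedule combinators, step 1 amounts to exponential backoff that gives up immediately on permanent errors so the re-scout step can take over. The delays, attempt count, and isTransient heuristic below are illustrative assumptions, not unsurf's real policy:

```typescript
// Crude transience heuristic, purely for illustration.
const isTransient = (e: unknown) =>
	e instanceof Error && (e.message.includes("timeout") || e.message.includes("503"));

// Retry an async operation with exponential backoff, but only while the
// failure looks transient; permanent failures escalate immediately.
export async function withBackoff<T>(
	op: () => Promise<T>,
	attempts = 4,
	baseMs = 100,
): Promise<T> {
	for (let i = 0; ; i++) {
		try {
			return await op();
		} catch (e) {
			if (i + 1 >= attempts || !isTransient(e)) throw e; // exhausted or permanent
			await new Promise((r) => setTimeout(r, baseMs * 2 ** i)); // 100, 200, 400, ...
		}
	}
}
```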

unsurf is built entirely on Effect, a TypeScript library for building reliable applications. Every operation is an Effect<Success, Error, Dependencies>.

This matters because:

Typed errors. A scout can fail with BrowserError, NetworkError, or StoreError. These are not strings — they are typed values in the error channel:

Error types:
import { Schema } from "effect";

export class NetworkError extends Schema.TaggedError<NetworkError>()("NetworkError", {
	url: Schema.String,
	status: Schema.optional(Schema.Number),
	message: Schema.String,
}) {}

export class BrowserError extends Schema.TaggedError<BrowserError>()("BrowserError", {
	message: Schema.String,
	screenshot: Schema.optional(Schema.String),
}) {}

export class PathBrokenError extends Schema.TaggedError<PathBrokenError>()("PathBrokenError", {
	pathId: Schema.String,
	step: Schema.optional(Schema.Number),
	reason: Schema.String,
}) {}

export class StoreError extends Schema.TaggedError<StoreError>()("StoreError", {
	message: Schema.String,
}) {}

export class NotFoundError extends Schema.TaggedError<NotFoundError>()("NotFoundError", {
	id: Schema.String,
	resource: Schema.optional(Schema.String),
}) {}

The heal system uses catchTag to route each error type to the right recovery strategy.
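catchTag dispatches on the `_tag` discriminant that Schema.TaggedError attaches to each class. Stripped of the Effect machinery, the routing is a discriminated-union switch; the strategy names below are placeholders, not the real heal logic:

```typescript
// Plain discriminated unions standing in for the Schema.TaggedError
// classes above; catchTag keys on the same `_tag` field.
type ScoutError =
	| { _tag: "NetworkError"; url: string; status?: number; message: string }
	| { _tag: "BrowserError"; message: string; screenshot?: string }
	| { _tag: "StoreError"; message: string };

// Route each error tag to a recovery strategy, the shape of what a chain
// of Effect.catchTag handlers does. Strategy names are hypothetical.
export function recoveryFor(e: ScoutError): "retry" | "relaunch-browser" | "fail" {
	switch (e._tag) {
		case "NetworkError": return "retry";
		case "BrowserError": return "relaunch-browser";
		case "StoreError": return "fail";
	}
}
```

Because the union is closed, the compiler rejects a switch that misses a tag, which is the practical payoff of typed errors over thrown strings.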

Resource safety. Browser containers are expensive. Effect’s Scope and acquireRelease guarantee cleanup even if the scout crashes mid-operation. No zombie browsers.

Dependency injection. Every service is defined as a Context.Tag and injected via Layer. Here’s the Browser service interface:

Browser service:
import { Context, Effect, Layer, Stream } from "effect";
import type { BrowserError } from "../domain/Errors.js";
import type { NetworkEvent } from "../domain/NetworkEvent.js";

export interface BrowserService {
	readonly navigate: (url: string) => Effect.Effect<void, BrowserError>;
	readonly captureNetwork: () => Effect.Effect<
		Stream.Stream<NetworkEvent, BrowserError>,
		BrowserError
	>;
	readonly screenshot: () => Effect.Effect<Uint8Array, BrowserError>;
	readonly evaluate: <T>(fn: () => T) => Effect.Effect<T, BrowserError>;
	readonly close: () => Effect.Effect<void>;
}

export class Browser extends Context.Tag("Browser")<Browser, BrowserService>() {}

export const BrowserStub = Layer.succeed(Browser, {
	navigate: () => Effect.void,
	captureNetwork: () => Effect.succeed(Stream.empty),
	screenshot: () => Effect.succeed(new Uint8Array()),
	evaluate: () => Effect.succeed(undefined as never),
	close: () => Effect.void,
});

In production, Browser.Live connects to Cloudflare’s browser. In tests, Browser.TestLive replays recorded fixtures. The business logic is identical.

Streams. CDP emits hundreds of network events during a session. Effect’s Stream processes them as they arrive — filter, group, transform — without buffering everything in memory.
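The same filter-as-you-go idea, expressed with a plain async generator instead of Effect's Stream (the event shape and resource-type strings are simplified assumptions):

```typescript
// Process network events incrementally: filter to XHR/fetch as events
// arrive, yielding matches without buffering the whole session in memory.
interface NetworkEvent { resourceType: string; url: string }

export async function* apiCalls(events: AsyncIterable<NetworkEvent>) {
	for await (const e of events) {
		if (e.resourceType === "XHR" || e.resourceType === "Fetch") yield e.url;
	}
}
```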

Retry policies. The heal system uses Schedule.exponential for backoff and Schedule.whileInput to retry only on transient errors. This is declarative, composable, and testable.

unsurf runs entirely on Cloudflare:

| Component    | Cloudflare Service | Purpose                       |
| ------------ | ------------------ | ----------------------------- |
| Compute      | Workers            | API server, MCP tools         |
| Browser      | Browser Rendering  | Headless Chrome for scouting  |
| Database     | D1 (SQLite)        | Endpoints, paths, run history |
| Blob storage | R2                 | HAR logs, screenshots         |
| Infra config | Alchemy            | TypeScript, not YAML          |

The web was designed for humans to browse. Agents should not have to pretend to be humans.

Every website already has a typed API underneath its UI. unsurf surfaces it.

surf the web → unsurf it