
Ecosystem

WebSketch is not a single tool. It is a family of packages built on a shared grammar and a common intermediate representation (IR). Every tool in the family produces the same WebSketchCapture output, so captures are interchangeable between pipelines.

| Package | What it does |
| --- | --- |
| websketch-ir | The core IR library. Defines the grammar, validation, rendering, diffing, and fingerprinting. Every other package depends on this. |
| websketch-vscode (this repo) | VS Code extension. Capture pages from your editor, browse four views, copy or export for LLMs. |
| websketch-cli | Command-line capture and rendering. Scriptable, CI-friendly, works in pipelines. |
| websketch-extension | Chrome browser extension. Capture the page you are looking at without leaving the browser. |
| websketch-mcp | MCP server for LLM agent integration. Gives AI agents the ability to capture and reason about web pages as part of their tool chain. |

All five packages work with the same WebSketchCapture object. This means:

  • A capture from the VS Code extension can be opened in the CLI.
  • A capture from the Chrome extension can be fed to the MCP server.
  • A capture from the CLI can be diffed against a capture from any other tool.
  • JSON exports are identical regardless of which tool produced them.

The IR includes the semantic tree, bounding boxes, content hashes, viewport metadata, and timing information. The websketch-ir package provides the schema, validators, and renderers that all other packages use.
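To make the contents of a capture concrete, here is a minimal TypeScript sketch of what such an object might look like. Every name in it (`BoundingBox`, `SketchNode`, `CaptureSketch`, `collectHashes`, `sameContent`, `makeCapture`) is a hypothetical stand-in chosen for illustration, not the actual websketch-ir schema or API. It only shows how a semantic tree with bounding boxes, content hashes, viewport metadata, and timing could fit together, and how content hashes would let two captures be compared no matter which tool produced them.

```typescript
// Hypothetical shapes for illustration only; the actual schema lives in websketch-ir.
interface BoundingBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface SketchNode {
  role: string;          // semantic role, e.g. "main", "nav", "heading"
  text?: string;         // quoted text content, if any
  interactive: boolean;  // true for links, buttons, inputs
  box: BoundingBox;      // position within the viewport
  contentHash: string;   // hash of this node's content
  children: SketchNode[];
}

interface CaptureSketch {
  url: string;
  viewport: { width: number; height: number };
  timing: { capturedAtMs: number; durationMs: number };
  root: SketchNode;
}

// Walk the semantic tree depth-first and collect content hashes in document order.
function collectHashes(node: SketchNode, out: string[] = []): string[] {
  out.push(node.contentHash);
  for (const child of node.children) {
    collectHashes(child, out);
  }
  return out;
}

// Two captures describe the same content when their hash sequences match,
// regardless of which tool produced them or when they were taken.
function sameContent(a: CaptureSketch, b: CaptureSketch): boolean {
  const ha = collectHashes(a.root);
  const hb = collectHashes(b.root);
  return ha.length === hb.length && ha.every((h, i) => h === hb[i]);
}

// Build a tiny example capture: a <main> landmark containing one heading.
function makeCapture(capturedAtMs: number, headingHash: string): CaptureSketch {
  return {
    url: "https://example.com/",
    viewport: { width: 1280, height: 720 },
    timing: { capturedAtMs, durationMs: 120 },
    root: {
      role: "main",
      interactive: false,
      box: { x: 0, y: 0, width: 1280, height: 720 },
      contentHash: "main-1",
      children: [
        {
          role: "heading",
          text: "Welcome",
          interactive: false,
          box: { x: 40, y: 40, width: 600, height: 48 },
          contentHash: headingHash,
          children: [],
        },
      ],
    },
  };
}

const fromCli = makeCapture(1000, "h-abc");
const fromExtension = makeCapture(2000, "h-abc"); // same page, captured later
const afterEdit = makeCapture(3000, "h-xyz");     // heading text changed

console.log(sameContent(fromCli, fromExtension)); // true
console.log(sameContent(fromCli, afterEdit));     // false
```

The comparison deliberately ignores timing, so two captures of an unchanged page taken at different moments still count as the same content. For real diffing and fingerprinting, use what websketch-ir provides rather than this sketch.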

| Metric | WebSketch | Raw HTML | Screenshot |
| --- | --- | --- | --- |
| Tokens | 200-800 | 50,000+ | 1,000+ (vision) |
| Structure | Full semantic tree | Nested div chaos | Pixel grid |
| Text content | Quoted and labeled | Buried in markup | OCR-dependent |
| Interactive elements | Marked with * | Hidden in attributes | Invisible |
| Heading hierarchy | <h1> through <h6> preserved | Lost in class names | Guessed from font size |
| Landmarks | <main>, <nav>, <search> | Requires DOM expertise | Not available |
| Works with | Any text LLM | Nothing useful at that token count | Vision models only |

The key advantage is density. WebSketch delivers the same semantic information in roughly two orders of magnitude fewer tokens than raw HTML, and it works with text models that have no vision capability at all.
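The size of that gap can be checked with a little arithmetic on the token figures from the comparison. The sketch below is only a worked calculation. `reductionFactor` is an illustrative helper, and 50,000 is used as the raw HTML figure even though the comparison lists it as a lower bound, so the real ratios may be higher.

```typescript
// Token figures taken from the comparison; reductionFactor is an illustrative helper.
function reductionFactor(baselineTokens: number, sketchTokens: number): number {
  return baselineTokens / sketchTokens;
}

const rawHtmlTokens = 50000;                  // listed as "50,000+", so a lower bound
const sketchTokens = { low: 200, high: 800 }; // WebSketch range

const bestCase = reductionFactor(rawHtmlTokens, sketchTokens.low);   // 250
const worstCase = reductionFactor(rawHtmlTokens, sketchTokens.high); // 62.5

console.log(bestCase, worstCase);
// log10 of each ratio gives the order of magnitude of the reduction.
console.log(Math.log10(bestCase).toFixed(2), Math.log10(worstCase).toFixed(2));
```

Both ratios fall between 10^1.8 and 10^2.4, which is where the two-orders-of-magnitude figure for raw HTML comes from. Against a screenshot at 1,000+ tokens the saving is much smaller. There the benefit is mainly that no vision model is needed.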

Which package to use depends on the workflow:

  • Interactive exploration: Use the VS Code extension or Chrome extension. Both give you a visual panel with four views.
  • CI/CD and scripts: Use the CLI. It accepts URLs on stdin, outputs to stdout, and returns proper exit codes.
  • AI agent pipelines: Use the MCP server. It exposes capture as a tool that any MCP-compatible agent can call.
  • Building your own tools: Use the IR library directly. It gives you the grammar, validators, renderers, and differ without any capture dependencies.
