> pinchtab
Control a headless or headed Chrome browser via Pinchtab's HTTP API for web automation, scraping, form filling, navigation, screenshots, and extraction with stable accessibility refs.
curl "https://skillshub.wtf/mxyhi/ok-skills/pinchtab?format=md"Pinchtab
Fast, lightweight browser control for AI agents via HTTP + accessibility tree.
Security Note: Pinchtab runs entirely locally. It does not contact external services, send telemetry, or exfiltrate data. However, it controls a real Chrome instance — if pointed at a profile with saved logins, agents can access authenticated sites. Always use a dedicated empty profile and set BRIDGE_TOKEN when exposing the API. See TRUST.md for the full security model.
Quick Start (Agent Workflow)
The 30-second pattern for browser tasks:
# 1. Start Pinchtab (runs forever, local on :9867)
pinchtab &
# 2. In your agent, follow this loop:
# a) Navigate to a URL
# b) Snapshot the page (get refs like e0, e5, e12)
# c) Act on a ref (click e5, type e12 "search text")
# d) Snapshot again to see the result
# e) Repeat step c-d until done
That's it. Refs are stable—you don't need to re-snapshot before every action. Only snapshot when the page changes significantly.
Recommended Secure Setup
# Best practice for AI agents
BRIDGE_BIND=127.0.0.1 \
BRIDGE_TOKEN="your-strong-secret" \
BRIDGE_PROFILE=~/.pinchtab/automation-profile \
pinchtab &
Never expose to 0.0.0.0 without a token. Never point at your daily Chrome profile.
Setup
# Headless (default) — no visible window
pinchtab &
# Headed — visible Chrome window for human debugging
BRIDGE_HEADLESS=false pinchtab &
# With auth token
BRIDGE_TOKEN="your-secret-token" pinchtab &
# Custom port
BRIDGE_PORT=8080 pinchtab &
Default: port 9867, no auth required (local). Set BRIDGE_TOKEN for remote access.
For advanced setup, see references/profiles.md and references/env.md.
What a Snapshot Looks Like
After calling /snapshot, you get the page's accessibility tree as JSON—flat list of elements with refs:
{
"refs": [
{"id": "e0", "role": "link", "text": "Sign In", "selector": "a[href='/login']"},
{"id": "e1", "role": "textbox", "label": "Email", "selector": "input[name='email']"},
{"id": "e2", "role": "button", "text": "Submit", "selector": "button[type='submit']"}
],
"text": "... readable text version of page ...",
"title": "Login Page"
}
Then you act on refs: click e0, type e1 "user@pinchtab.com", press e2 Enter.
Core Workflow
The typical agent loop:
- Navigate to a URL
- Snapshot the accessibility tree (get refs)
- Act on refs (click, type, press)
- Snapshot again to see results
Refs (e.g. e0, e5, e12) are cached per tab after each snapshot — no need to re-snapshot before every action unless the page changed significantly.
Quick examples
pinchtab nav https://pinchtab.com
pinchtab snap -i -c # interactive + compact
pinchtab click e5
pinchtab type e12 hello world
pinchtab press Enter
pinchtab text # readable text (~1K tokens)
pinchtab text | jq .text # pipe to jq
pinchtab ss -o page.jpg # screenshot
pinchtab eval "document.title" # run JavaScript
pinchtab pdf --tab TAB_ID -o page.pdf # export PDF
For the full HTTP API (curl examples, download, upload, cookies, stealth, batch actions, PDF export with full parameter control), see references/api.md.
Token Cost Guide
| Method | Typical tokens | When to use |
|---|---|---|
/text | ~800 | Reading page content |
/snapshot?filter=interactive | ~3,600 | Finding buttons/links to click |
/snapshot?diff=true | varies | Multi-step workflows (only changes) |
/snapshot?format=compact | ~56-64% less | One-line-per-node, best efficiency |
/snapshot | ~10,500 | Full page understanding |
/screenshot | ~2K (vision) | Visual verification |
/tabs/{id}/pdf | 0 (binary) | Export page as PDF (no token cost) |
Strategy: Start with ?filter=interactive&format=compact. Use ?diff=true on subsequent snapshots. Use /text when you only need readable content. Full /snapshot only when needed.
Agent Optimization
Validated Feb 2026: Testing with AI agents revealed a critical pattern for reliable, token-efficient scraping.
See the full guide: docs/agent-optimization.md
Quick Summary
The 3-second pattern — wait after navigate before snapshot:
curl -X POST http://localhost:9867/navigate \
-H "Content-Type: application/json" \
-d '{"url": "https://pinchtab.com"}' && \
sleep 3 && \
curl http://localhost:9867/snapshot | jq '.nodes[] | select(.name | length > 15) | .name'
Token savings: 93% reduction (3,842 → 272 tokens) when using prescriptive instructions vs. exploratory agent approach.
For detailed findings, system prompt templates, and site-specific notes, see docs/agent-optimization.md.
Tips
- Always pass
tabIdexplicitly when working with multiple tabs - Refs are stable between snapshot and actions — no need to re-snapshot before clicking
- After navigation or major page changes, take a new snapshot for fresh refs
- Pinchtab persists sessions — tabs survive restarts (disable with
BRIDGE_NO_RESTORE=true) - Chrome profile is persistent — cookies/logins carry over between runs
- Use
BRIDGE_BLOCK_IMAGES=trueor"blockImages": trueon navigate for read-heavy tasks - Wait 3+ seconds after navigate before snapshot — Chrome needs time to render 2000+ accessibility tree nodes
> related_skills --same-repo
> yeet
Use only when the user explicitly asks to stage, commit, push, and open a GitHub pull request in one flow using the GitHub CLI (`gh`).
> xlsx
Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my
> vercel-react-best-practices
React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.
> test-driven-development
Use when implementing any feature or bugfix, before writing implementation code