Browser Extension
Implementation spec for the Chrome Manifest V3 extension. Built with WXT.
Decisions that constrain this: ADR-001 (programmatic injection only), ADR-002 (WXT), ADR-006 (jittered polling), ADR-007 (three write methods), ADR-013 (MAIN world on demand), ADR-014 (shadow-DOM targeting), ADR-015 (MAIN-world hygiene), ADR-016 (WAR default-deny).
Project layout
```
extension/
├── package.json
├── tsconfig.json
├── vitest.config.ts
├── wxt.config.ts
├── README.md
├── scripts/
│   ├── assert-build.ts
│   └── smoke/
│       ├── command.ts
│       ├── daemon.ts
│       ├── fixture-server.ts
│       ├── fixture.html
│       └── workflow.ts
└── src/
    ├── entrypoints/
    │   ├── background.ts
    │   ├── content.ts
    │   └── popup/
    │       ├── index.html
    │       ├── main.ts
    │       └── pairing.ts
    ├── background/
    │   ├── browser-action-interstitials.ts
    │   ├── browser-action-support.ts
    │   ├── browser-actions.ts
    │   ├── dedupe.ts
    │   ├── dispatcher.ts
    │   ├── forwarded-actions.ts
    │   ├── forwarded-params.ts
    │   ├── forwarded-request.ts
    │   ├── injection.ts
    │   ├── main-world*.ts
    │   ├── responses.ts
    │   ├── storage*.ts
    │   ├── tabs*.ts
    │   ├── trace.ts
    │   └── ws-client.ts
    ├── content/
    │   ├── actions/
    │   │   ├── fill.ts
    │   │   ├── interactions.ts
    │   │   ├── reads.ts
    │   │   ├── scroll-wait.ts
    │   │   └── select.ts
    │   ├── discovery.ts
    │   ├── dom-helpers.ts
    │   ├── events.ts
    │   ├── page-state.ts
    │   ├── polling.ts
    │   ├── read-tree.ts
    │   ├── rpc.ts
    │   └── targeting.ts
    └── test/
        ├── fakes/
        ├── fixtures/
        └── setup-chrome-storage.ts
```

Deliberate WXT layout deviations
- `srcDir: "src"` is enabled so dependency-cruiser and knip scan the real extension sources under `extension/src/**`.
- The popup uses a directory entrypoint: `src/entrypoints/popup/index.html` + `main.ts`. WXT 0.20 rejects flat `popup.html` + `popup.ts` siblings with the same basename, so the source layout differs from the original phase sketch even though the emitted manifest still points at `popup.html`.
- The runtime content script is bundled by WXT but not declared in the manifest. The background service worker injects it programmatically on first use.
Output: extension/.output/chrome-mv3/.
WXT configuration
`extension/wxt.config.ts` is part of the security contract, not just build plumbing.

```typescript
import { defineConfig } from "wxt";

export default defineConfig({
  srcDir: "src",
  vite: () => ({
    build: {
      sourcemap: true,
      modulePreload: false,
    },
  }),
  manifest: {
    name: "bproxy",
    description: "Browser proxy companion extension for bproxy daemon.",
    permissions: ["tabs", "scripting", "webNavigation", "alarms", "storage"],
    host_permissions: ["<all_urls>"],
    action: {
      default_title: "bproxy",
      default_popup: "popup.html",
    },
  },
  hooks: {
    "build:manifestGenerated": (_wxt, manifest) => {
      if (Array.isArray(manifest.content_scripts) && manifest.content_scripts.length === 0) {
        delete manifest.content_scripts;
      }
      if (
        Array.isArray(manifest.web_accessible_resources) &&
        manifest.web_accessible_resources.length === 0
      ) {
        delete manifest.web_accessible_resources;
      }
    },
  },
});
```

Locked implications:
- no declarative `content_scripts` (ADR-001);
- no `web_accessible_resources` by default (ADR-016);
- no `debugger` permission in the shipped manifest; debugger-backed screenshots remain future opt-in only;
- source maps are preserved in production output to keep service-worker and content-script failures diagnosable;
- Vite's modulepreload polyfill is disabled because it injects `MutationObserver`, which would violate ADR-006.
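The manifest-level implications above can be checked mechanically against the emitted `manifest.json`. The following is a minimal sketch only: `checkManifest` and `ManifestLike` are hypothetical names, and the real `scripts/assert-build.ts` may be structured quite differently.

```typescript
// Hypothetical shape: only the manifest fields this check cares about.
interface ManifestLike {
  permissions?: string[];
  content_scripts?: unknown[];
  web_accessible_resources?: unknown[];
}

// Hypothetical helper: returns one message per violated implication.
function checkManifest(manifest: ManifestLike): string[] {
  const violations: string[] = [];
  if (manifest.content_scripts !== undefined) {
    violations.push("content_scripts must be absent (ADR-001)");
  }
  if (manifest.web_accessible_resources !== undefined) {
    violations.push("web_accessible_resources must be absent (ADR-016)");
  }
  if (manifest.permissions?.includes("debugger")) {
    violations.push("debugger permission must not ship");
  }
  return violations;
}
```

An empty result means the manifest satisfies the three manifest-level implications; source-map and `MutationObserver` checks would need to inspect the bundled files instead.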
Runtime shape
Background service worker
Entrypoint: `src/entrypoints/background.ts`
Owns:
- bootstrap storage lookup and validation;
- daemon WebSocket connection, reconnect, heartbeat, badge state, and top-level navigation push messages;
- forwarded-request parsing (`BproxyForwardedRequest`);
- exactly-once execution via dedupe + replay-safe responses;
- extension trace ring buffer for `debug.log`;
- browser-API actions (`navigate`, `screenshot`, `require-human`, `tab.*`);
- programmatic content-script injection and DOM-action RPC;
- one-shot MAIN-world execution for `fill(method="runtime-api")`.
Authentication uses WebSocket subprotocols:
- `bproxy.v1`
- `auth.{base64url(extensionToken)}`
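As an illustration, the subprotocol list could be assembled as below. This is a sketch under assumptions: it reads the entries as two separate subprotocols (`bproxy.v1` and `auth.<token>`), uses unpadded base64url, and the helper names are hypothetical.

```typescript
// Encode a UTF-8 string as unpadded base64url (RFC 4648, section 5).
function base64url(input: string): string {
  const bytes = new TextEncoder().encode(input);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

// Hypothetical helper: the list passed as the second argument to `new WebSocket(url, protocols)`.
function authSubprotocols(extensionToken: string): string[] {
  return ["bproxy.v1", `auth.${base64url(extensionToken)}`];
}
```

Base64url without padding matters here because subprotocol names must be HTTP tokens, and `+`, `/`, and `=` are not valid token characters.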
Connection state is surfaced via the action badge:
- empty / transparent = disconnected or connected idle;
- `…` = connecting;
- `!` = startup or transport error.
Popup pairing UI
Entrypoint: `src/entrypoints/popup/index.html` + `main.ts`
Flow:
- user enters the one-time pairing code;
- popup `POST`s `{ code }` to `http://127.0.0.1:9615/pair/claim`;
- popup validates the bootstrap payload shape:
  - `extensionToken` non-empty string
  - `wsUrl` loopback `ws://`
  - `protocolVersion === 1`
  - `expiresAt > Date.now()`
  - `nonce` present
- popup stores the bootstrap payload as one atomic record in `chrome.storage.local`;
- popup sends `pair.complete` to the background worker so reconnect happens immediately.
Validation failures surface distinct popup-side error codes (INVALID_PAYLOAD_SHAPE, INVALID_WS_URL, UNSUPPORTED_PROTOCOL_VERSION, BOOTSTRAP_EXPIRED, MISSING_NONCE, PAIR_TRANSPORT_ERROR, PAIR_NOTIFY_FAILED) in addition to daemon pass-throughs (PAIRING_CODE_INVALID, PAIRING_CODE_EXPIRED, PAIRING_CODE_CONSUMED, PAIRING_RATE_LIMITED).
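The payload checks can be sketched as a single pure function. This is illustrative only: `validateBootstrap` and `isLoopbackWs` are hypothetical names, the mapping from each check to an error code is an assumption, and so is the set of hosts treated as loopback.

```typescript
type BootstrapErrorCode =
  | "INVALID_PAYLOAD_SHAPE"
  | "INVALID_WS_URL"
  | "UNSUPPORTED_PROTOCOL_VERSION"
  | "BOOTSTRAP_EXPIRED"
  | "MISSING_NONCE";

// Assumption: these three hosts count as loopback; the real list may differ.
function isLoopbackWs(raw: string): boolean {
  try {
    const url = new URL(raw);
    return url.protocol === "ws:" && ["127.0.0.1", "localhost", "[::1]"].includes(url.hostname);
  } catch {
    return false; // not parseable as a URL at all
  }
}

// Hypothetical validator: returns null when every check passes.
function validateBootstrap(payload: unknown, now: number = Date.now()): BootstrapErrorCode | null {
  if (typeof payload !== "object" || payload === null) return "INVALID_PAYLOAD_SHAPE";
  const p = payload as Record<string, unknown>;
  if (typeof p.extensionToken !== "string" || p.extensionToken.length === 0) {
    return "INVALID_PAYLOAD_SHAPE";
  }
  if (typeof p.wsUrl !== "string" || !isLoopbackWs(p.wsUrl)) return "INVALID_WS_URL";
  if (p.protocolVersion !== 1) return "UNSUPPORTED_PROTOCOL_VERSION";
  if (typeof p.expiresAt !== "number" || p.expiresAt <= now) return "BOOTSTRAP_EXPIRED";
  if (typeof p.nonce !== "string" || p.nonce.length === 0) return "MISSING_NONCE";
  return null;
}
```

Taking `now` as a parameter keeps the expiry check deterministic under test.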
Runtime content script
Entrypoint: `src/entrypoints/content.ts`
The content script is registered with:
```typescript
export default defineContentScript({
  registration: "runtime",
  matches: ["<all_urls>"],
  runAt: "document_idle",
  world: "ISOLATED",
});
```

The service worker injects it with `chrome.scripting.executeScript` on first command per tab. The content side keeps a single `chrome.runtime.onMessage` listener and returns typed success/error envelopes plus page-state snapshots.
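A rough sketch of inject-on-first-use follows. Several things here are assumptions: `ensureInjected` and `ScriptingLike` are hypothetical names, the bundle path `content-scripts/content.js` is a guess at where WXT emits the script, and an in-memory `Set` stands in for the `session:injectedTabs` storage item.

```typescript
// Minimal slice of chrome.scripting, so the sketch can run against a fake.
interface ScriptingLike {
  executeScript(injection: { target: { tabId: number }; files: string[] }): Promise<unknown>;
}

// Assumption: path of the bundled runtime content script in the build output.
const CONTENT_SCRIPT_FILE = "content-scripts/content.js";

// Hypothetical helper: injects at most once per tab; resolves true if it injected now.
async function ensureInjected(
  scripting: ScriptingLike,
  injectedTabs: Set<number>,
  tabId: number,
): Promise<boolean> {
  if (injectedTabs.has(tabId)) return false; // already injected for this tab
  await scripting.executeScript({ target: { tabId }, files: [CONTENT_SCRIPT_FILE] });
  injectedTabs.add(tabId); // record only after the injection succeeded
  return true;
}
```

In the service worker, `chrome.scripting` itself would be passed as the first argument.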
Storage schema
`src/background/storage.ts` defines the typed storage items.
| Key | Scope | Purpose |
|---|---|---|
| `local:bootstrap` | local | Pairing bootstrap payload `{ extensionToken, wsUrl, protocolVersion, issuedAt, expiresAt, nonce }` |
| `local:configFlags` | local | Future opt-in flags such as `debuggerScreenshot` |
| `session:pins` | session | Reserved tab-pin map storage seam |
| `session:dedupe` | session | Request-id → cached response + timestamp |
| `session:injectedTabs` | session | Tabs already injected with the runtime content script |
| `session:trace` | session | Bounded extension trace ring buffer |
Important contract: bootstrapItem is written and read as one record, never as separate token/url/version keys.
Wire contract with the daemon
The extension parses forwarded daemon messages:

```typescript
type BproxyForwardedRequest<A extends Action = Action> = BproxyRequest<A> & {
  target: { tabId: number };
};
```

Implications:
- the daemon remains the source of truth for `session → tabId`;
- the extension does not re-resolve session state;
- `session.*`, `debug.last`, and `debug.status` stay daemon-local and must not have extension handlers;
- `debug.log` is forwarded and served from the extension ring buffer.
Responses are the normal shared BproxyResponse envelope; successful responses include page state and replay. In the other direction, the background worker may push { type: "navigation", tabId, url, cause: "committed" | "history_state" } over the existing WS connection whenever a top-level navigation event is observed.
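The navigation push can be sketched as a small mapper from a navigation event to the message shape quoted above. `toNavigationPush` and `NavDetails` are hypothetical names; treating `frameId === 0` as "top-level" follows the `chrome.webNavigation` convention, and which event feeds which `cause` is an assumption.

```typescript
type NavigationCause = "committed" | "history_state";

interface NavigationPush {
  type: "navigation";
  tabId: number;
  url: string;
  cause: NavigationCause;
}

// Subset of the details object delivered by chrome.webNavigation events.
interface NavDetails {
  tabId: number;
  frameId: number;
  url: string;
}

// Hypothetical mapper: sub-frame navigations produce no push message.
function toNavigationPush(details: NavDetails, cause: NavigationCause): NavigationPush | null {
  if (details.frameId !== 0) return null;
  return { type: "navigation", tabId: details.tabId, url: details.url, cause };
}
```

A plausible wiring, not confirmed by this spec, is `onCommitted` for `"committed"` and `onHistoryStateUpdated` for `"history_state"`.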
Action handling
DOM actions in ISOLATED world
Handled through `src/content/**` and routed via background/content RPC.
| Action | Notes |
|---|---|
| `text`, `links`, `images`, `elements`, `outline`, `dom` | Read-only DOM extraction; `links` returns structured URLs, traverses open shadow roots, and can filter to visible/in-viewport anchors |
| `inspect` | Computed-style and layout inspection for specific selectors (rect, display, descendants, scroll info) |
| `snapshot` | Accessible DOM tree serialization (text-based, depth-limited, optional interactive-only mode) |
| `scroll`, `wait` | Jittered polling only; no `MutationObserver`. `scroll` targets only the viewport/document by default or an explicit agent-supplied `ElementTarget`; it never infers scroll containers. |
| `click` | Explicit target-only activation. Focuses when possible, dispatches honest click-shaped activation, and reports disappearance/stability. |
| `hover` | Explicit target-only hover primitive. Dispatches honest hover-shaped events at the element center and reports completion/stability. |
| `fill(method="direct")` | Native DOM state write, no events |
| `fill(method="paste")` | Dispatches `beforeinput`/`input` with `inputType: "insertFromPaste"` plus `change`; no synthetic key events |
| `fill-form` | Multi-field isolated-world writes with hidden-field guard and read-back verification |
| `select` | Trigger + poll + option click + verification |
MAIN-world one-shot actions
Handled in `src/background/main-world*.ts`.
| Action | Notes |
|---|---|
| `fill(method="runtime-api", world="main")` | Exactly one `chrome.scripting.executeScript({ world: "MAIN" })` call per request |
MAIN-world injected functions must:
- resolve only the provided target/route;
- catch and normalize errors inside the injected function;
- return plain data only;
- contain no identifying literals such as extension ids, `chrome-extension`, package names, or bproxy branding;
- install no persistent listeners or globals.
Browser-API actions in the background
Handled in `src/background/browser-actions.ts`.
| Action | Notes |
|---|---|
| `navigate` | `chrome.tabs.update` + wait for top-level load + interstitial detection → `HUMAN_REQUIRED` |
| `screenshot` | `chrome.tabs.captureVisibleTab` normal path |
| `screenshot(debugger=true)` | Currently returns `DEBUGGER_DISABLED` unless a future explicit opt-in ships with permission + flag wiring |
| `tab.list` | Not forwarded; the daemon resolves it from the session tab registry without extension involvement |
| `tab.open`, `tab.close`, `tab.pin`, `tab.unpin` | Chrome tabs API only; does not take ownership of daemon session state |
| `require-human` | Returns structured `HUMAN_REQUIRED` for daemon pause handling |
Targeting and discovery
`src/content/targeting.ts` and `src/content/discovery.ts` implement the route-based targeting contract from ADR-014.

```typescript
interface ElementRoute {
  hosts: Array<{ selector: string; index?: number }>;
  target: string;
}

type ElementTarget =
  | { selector: string; route?: never }
  | { selector?: never; route: ElementRoute };
```

Discovery rules:
- open shadow roots are supported;
- closed shadow roots are out of scope and return honest target errors;
- probing is intent-scoped (active element chain, dialogs/popovers, viewport/hit-test roots, scoped subtree);
- runtime editor handles are probed only inside the candidate root, never via whole-page recursive scans.
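To make the route contract concrete, here is a minimal sketch of resolving an `ElementRoute` through open shadow roots. It is not the actual `targeting.ts` logic: `resolveRoute`, the structural `QueryRoot`/`HostLike` stand-ins, and the error strings are all hypothetical, chosen so the sketch runs without a DOM.

```typescript
interface ElementRoute {
  hosts: Array<{ selector: string; index?: number }>;
  target: string;
}

// Structural stand-ins for Document / ShadowRoot and Element.
interface QueryRoot {
  querySelectorAll(selector: string): ArrayLike<HostLike>;
}
interface HostLike {
  shadowRoot?: QueryRoot | null;
}

type RouteResult = { ok: true; element: HostLike } | { ok: false; error: string };

// Hypothetical resolver: walks hosts in order, descending only into open shadow roots.
function resolveRoute(root: QueryRoot, route: ElementRoute): RouteResult {
  let scope: QueryRoot = root;
  for (const host of route.hosts) {
    const match = scope.querySelectorAll(host.selector)[host.index ?? 0];
    if (!match) return { ok: false, error: "HOST_NOT_FOUND" };
    // Element.shadowRoot is null for closed roots, so those fail here honestly.
    if (!match.shadowRoot) return { ok: false, error: "SHADOW_ROOT_UNAVAILABLE" };
    scope = match.shadowRoot;
  }
  const target = scope.querySelectorAll(route.target)[0];
  return target ? { ok: true, element: target } : { ok: false, error: "TARGET_NOT_FOUND" };
}
```

Because the resolver follows only the hosts named in the route, it never scans the whole page for shadow roots.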
Polling and page state
`src/content/polling.ts` provides the shared wait/stability primitive.
Rules:
- jittered intervals, not fixed cadence;
- bounded timeout;
- visibility-aware bail-out for destructive actions (`TAB_NOT_VISIBLE` when hidden);
- no `MutationObserver` in source or built output.
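The rules above could be realized roughly as follows. This is a sketch, not the contents of `polling.ts`: `jitteredDelay`, `pollUntil`, the option names, and the `"timeout"` result string are hypothetical, and the symmetric jitter formula is one choice among several.

```typescript
// Next delay: base interval scaled by a random factor in [1 - jitter, 1 + jitter].
function jitteredDelay(baseMs: number, jitter: number, rand: () => number = Math.random): number {
  return baseMs * (1 - jitter + 2 * jitter * rand());
}

interface PollOptions {
  timeoutMs: number;
  baseMs: number;
  jitter: number;
  isHidden?: () => boolean; // e.g. () => document.hidden in a content script
}

type PollResult = "ok" | "timeout" | "TAB_NOT_VISIBLE";

// Hypothetical primitive: re-checks a condition at jittered intervals until a deadline.
async function pollUntil(check: () => boolean, opts: PollOptions): Promise<PollResult> {
  const deadline = Date.now() + opts.timeoutMs;
  for (;;) {
    if (opts.isHidden?.()) return "TAB_NOT_VISIBLE"; // bail out while the tab is hidden
    if (check()) return "ok";
    const remaining = deadline - Date.now();
    if (remaining <= 0) return "timeout";
    const delay = Math.min(remaining, jitteredDelay(opts.baseMs, opts.jitter));
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
  }
}
```

Injecting `rand` and `isHidden` keeps the primitive testable without timers tied to a real page.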
src/content/page-state.ts normalizes page snapshots into the shared PageState envelope:
```typescript
interface PageState {
  url: string;
  title: string;
  state: "loading" | "ready" | "error";
  busy: boolean;
}
```

The busy heuristic checks for `[aria-busy="true"]`, active `<progress>`, and pending navigations, but requires that matched elements are visible (via `checkVisibility()`). Hidden or off-screen busy indicators (common on Google SERPs) do not trigger false-positive `busy: true`.
Dedupe and observability
Dedupe
`src/background/dedupe.ts` caches prior responses by request id:

```typescript
interface DedupeEntry {
  response: BproxyResponse;
  ts: number;
}
```

The store is bounded by size and TTL so daemon replay-on-reconnect does not re-run destructive requests.
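A size- and TTL-bounded store can be sketched as below. This is illustrative only: `DedupeStore` is a hypothetical in-memory class, the response type is left generic instead of using `BproxyResponse`, and the eviction policy (lazy expiry on read, oldest-first on overflow) is an assumption.

```typescript
interface CachedResponse<R> {
  response: R;
  ts: number;
}

// Hypothetical bounded cache keyed by request id.
class DedupeStore<R> {
  private readonly entries = new Map<string, CachedResponse<R>>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries: number, ttlMs: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  // Returns the cached response, or undefined if unknown or older than the TTL.
  get(requestId: string, now: number = Date.now()): R | undefined {
    const entry = this.entries.get(requestId);
    if (!entry) return undefined;
    if (now - entry.ts > this.ttlMs) {
      this.entries.delete(requestId); // expire lazily on read
      return undefined;
    }
    return entry.response;
  }

  set(requestId: string, response: R, now: number = Date.now()): void {
    this.entries.delete(requestId); // re-insert so Map order tracks write recency
    this.entries.set(requestId, { response, ts: now });
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest); // evict the oldest write first
    }
  }
}
```

A dispatcher would call `get` before executing a request and `set` after producing its response, so a replayed request id is answered from the cache.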
Trace ring buffer
`src/background/trace.ts` stores bounded trace entries for `debug.log`.

```typescript
interface TraceEntry {
  id: string;
  action: string;
  tab: number;
  timestamp: number;
  elapsed: number;
  result: "ok" | "error";
  errorCode?: string;
  replay: boolean;
  extensionVersion: string;
}
```

The `extensionVersion` stamp makes stale-build traces visible after extension reloads.
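For readers unfamiliar with the pattern, a fixed-capacity ring buffer can be sketched as follows. `TraceRing` is a hypothetical generic class, not the code in `trace.ts`, and it assumes a capacity of at least 1.

```typescript
// Hypothetical ring buffer: once full, each push overwrites the oldest entry.
class TraceRing<T> {
  private readonly slots: T[] = [];
  private readonly capacity: number; // assumed to be >= 1
  private next = 0; // slot to overwrite on the next push once the buffer is full

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(entry: T): void {
    if (this.slots.length < this.capacity) {
      this.slots.push(entry); // still filling up
      return;
    }
    this.slots[this.next] = entry;
    this.next = (this.next + 1) % this.capacity;
  }

  // Entries in insertion order, oldest first.
  snapshot(): T[] {
    return [...this.slots.slice(this.next), ...this.slots.slice(0, this.next)];
  }
}
```

Memory stays constant regardless of how many requests are traced, which is what "bounded" buys in a long-lived session.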
Security and exposure hygiene
- Programmatic injection only. No default content script presence.
- ISOLATED world by default. Reads plus `scroll`/`click`/`hover` and `direct`/`paste` writes stay out of MAIN world.
- MAIN world is one-shot. `runtime-api` fill executes through a single `chrome.scripting.executeScript({ world: "MAIN" })` call.
- Default-deny WAR. No `web_accessible_resources` are shipped.
- No default debugger surface. The manifest omits the `debugger` permission.
- No `MutationObserver`. The extension uses jittered polling instead.
- Bootstrap secrecy. Long-lived auth material is kept in `chrome.storage.local`; per-session caches live in `chrome.storage.session`.
Testing and local verification
Automated coverage lives under:
- `src/background/__tests__`
- `src/content/__tests__`
- `src/entrypoints/popup/__tests__`
Locked design assertions include:
- no `MutationObserver` in the production bundle;
- manifest contains no declarative `content_scripts` or default `web_accessible_resources`;
- paste writes dispatch the expected paste-flavored events and no key events;
- MAIN-world actions use `executeScript({ world: "MAIN" })` only on demand;
- duplicate request ids reply from cache rather than executing twice;
- production artifacts preserve source maps and useful startup crash labels.
Local smoke helpers live under scripts/smoke/ and exercise the real daemon + real Chrome pairing flow on localhost.
Development
```sh
pnpm --filter @bproxy/extension dev
pnpm --filter @bproxy/extension build
pnpm --filter @bproxy/extension typecheck
pnpm --filter @bproxy/extension test
```

See `extension/README.md` for the end-to-end smoke workflow and local loading instructions.