Session Model

Why a session model

BUA is an internal tool. Sessions are bookkeeping for the debugger-attach lifecycle and the audit log, not consent tokens. There is no TTL, no approval step, and no “Trust this site” flow. A session exists on a tab from the moment the agent first acts there and ends when the lifecycle says so (tab closed, user idle or screen locked, user-stopped, or domain blocklisted).

The extension still enforces one hard boundary: a domain blocklist, checked before every action. Everything else is recorded (audit log) but not gated.

Session shape

interface ActiveSession {
  tabId: number; // primary key
  domain: string; // current domain (updated silently on navigation)
  startedAt: number; // epoch ms of creation
  lastActionAt: number; // epoch ms of most recent action dispatch
  actionCount: number;
}

Sessions are persisted in chrome.storage.local under the activeSessions key. Writes are serialized through an in-memory promise chain (withSessionWriteLock): chrome.storage.local has no compare-and-swap (CAS), so concurrent read-modify-write mutations would clobber each other without the lock.
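A minimal sketch of the promise-chain lock described above. Only withSessionWriteLock, the activeSessions key, and the ActiveSession fields come from this document; the KeyValueStore interface, createMemoryStore, and the signature given to recordAction are assumptions made so the example runs outside the extension.

```typescript
// Hypothetical stand-in for the async get/set surface of chrome.storage.local,
// so the sketch runs without the extension runtime.
interface KeyValueStore {
  get(key: string): Promise<unknown>;
  set(key: string, value: unknown): Promise<void>;
}

interface ActiveSession {
  tabId: number;
  domain: string;
  startedAt: number;
  lastActionAt: number;
  actionCount: number;
}

type SessionMap = Record<string, ActiveSession>;

// Tail of the chain. It never rejects, so one failed write cannot wedge later writers.
let writeChain: Promise<void> = Promise.resolve();

function withSessionWriteLock<T>(task: () => Promise<T>): Promise<T> {
  const run = writeChain.then(() => task());
  writeChain = run.then(() => undefined, () => undefined);
  return run;
}

// Read-modify-write under the lock: bump actionCount and lastActionAt.
function recordAction(store: KeyValueStore, tabId: number, now: number): Promise<void> {
  return withSessionWriteLock(async () => {
    const sessions = ((await store.get("activeSessions")) ?? {}) as SessionMap;
    const session = sessions[String(tabId)];
    if (!session) return; // no session on this tab, nothing to record
    sessions[String(tabId)] = {
      ...session,
      actionCount: session.actionCount + 1,
      lastActionAt: now,
    };
    await store.set("activeSessions", sessions);
  });
}

// In-memory store that copies values on read and write, so lost updates would show up.
function createMemoryStore(initial: Record<string, unknown>): KeyValueStore {
  const data = new Map<string, string>();
  for (const [key, value] of Object.entries(initial)) data.set(key, JSON.stringify(value));
  return {
    async get(key) {
      const raw = data.get(key);
      return raw === undefined ? undefined : JSON.parse(raw);
    },
    async set(key, value) {
      data.set(key, JSON.stringify(value));
    },
  };
}
```

With the lock, several recordAction calls issued back to back against the same tab each see the previous call's write. Without it, every call would read the same snapshot before any of them wrote, and all but one increment would be lost.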

Session identity

Sessions are keyed by tabId, not by domain. A single domain can host multiple concurrent sessions when the site is open on more than one tab. This is intentional: it lets the agent work on its own isolated tab (in the BUA window) while the user continues browsing the same domain on their own tab.

CDP commands target a specific tabId, so actions on one session never bleed into the other. The popup lists every active session with its tabId so the user can stop one without affecting siblings on the same domain.

Disambiguation when the agent omits `tabId`

Most actions (click, type, extract, …) accept an optional tabId. If it’s omitted, the extension picks in this order:

Any BUA-owned session (the agent’s own tabs are always safe targets).
A user-owned session only if it’s the one and only candidate.
Otherwise, the extension fails with session_not_found and lists the candidates so the agent can retry with an explicit tabId.

Prompt guidance tells the agent to track tabId from open_tab’s return value and pass it on every follow-up. Omitting tabId is a convenience for the single-session case, not a guessing game.
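The selection order can be sketched as a pure function. The SessionCandidate shape, the owner tag, and the name resolveTargetTab are hypothetical. Taking the first BUA-owned session when there are several is also an assumption, since the rule above only says "any".

```typescript
// Assumed tag: the document only distinguishes BUA-owned from user-owned sessions.
type Owner = "bua" | "user";

interface SessionCandidate {
  tabId: number;
  owner: Owner;
}

type Resolution =
  | { ok: true; tabId: number }
  | { ok: false; error: "session_not_found"; candidates: number[] };

function resolveTargetTab(
  explicitTabId: number | undefined,
  sessions: SessionCandidate[],
): Resolution {
  // An explicit tabId always wins; the dispatcher validates it afterwards.
  if (explicitTabId !== undefined) return { ok: true, tabId: explicitTabId };

  // 1. Any BUA-owned session (this sketch takes the first one).
  const buaOwned = sessions.find((s) => s.owner === "bua");
  if (buaOwned) return { ok: true, tabId: buaOwned.tabId };

  // 2. A user-owned session, but only when it is the sole candidate.
  const [first] = sessions;
  if (first && sessions.length === 1) return { ok: true, tabId: first.tabId };

  // 3. Refuse to guess; hand back the candidates so the agent can retry explicitly.
  return {
    ok: false,
    error: "session_not_found",
    candidates: sessions.map((s) => s.tabId),
  };
}
```

Returning the candidate tabIds in the error gives the agent what it needs to retry with an explicit tabId without an extra get_tabs round trip.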

How sessions are created

Two paths, both without user approval:

Path	How	Landing tab
Auto — first action	Agent sends any action for a tab that has no session yet	That tab (wherever it lives)
Explicit — `open_tab`	Browser subagent calls `open_tab(url)`	New tab in the BUA window by default

Auto-creation on first action

When the agent dispatches an action for tabId=T on domain=D, the action-dispatcher flow is:

Resolve tabId + domain (from the action payload + live tab state).
Check isDomainBlocked(domain) — if hit, return domain_blocked (terminal), fire domain_blocked WS event, no session created.
Call ensureSession(tabId, domain):
- No session yet → startSession + SessionEvent { type: "start" } → fires session_started WS event.
- Session exists for same domain → return existing.
- Session exists for different domain (tab navigated mid-session) → silently update session.domain to the new value, no event, no WS churn. Audit log records every action with current URL anyway.
Acquire per-tab lock.
Attach chrome.debugger (if not already).
Execute CDP action.
recordAction(tabId) — bump actionCount + lastActionAt.
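The blocklist check and the three ensureSession branches above could look roughly like this. It is a sketch under stated assumptions: the blocklist is matched by exact hostname, sessions live in a plain Map rather than chrome.storage.local, lastActionAt is initialised to the creation time, and precheckAction is a hypothetical name. The per-tab lock, debugger attach, and CDP execution steps are left out.

```typescript
interface ActiveSession {
  tabId: number;
  domain: string;
  startedAt: number;
  lastActionAt: number;
  actionCount: number;
}

type SessionEvent = { type: "start"; tabId: number; domain: string };

type PrecheckResult =
  | { ok: true; session: ActiveSession }
  | { ok: false; error: "domain_blocked" };

// Assumption: exact hostname match. The real matching rule is not specified here.
function isDomainBlocked(domain: string, blocklist: ReadonlySet<string>): boolean {
  return blocklist.has(domain);
}

function ensureSession(
  sessions: Map<number, ActiveSession>,
  tabId: number,
  domain: string,
  now: number,
  emit: (event: SessionEvent) => void,
): ActiveSession {
  const existing = sessions.get(tabId);
  if (!existing) {
    // No session yet: create it and announce the start.
    // lastActionAt starting at `now` is an assumption of this sketch.
    const created: ActiveSession = {
      tabId,
      domain,
      startedAt: now,
      lastActionAt: now,
      actionCount: 0,
    };
    sessions.set(tabId, created);
    emit({ type: "start", tabId, domain });
    return created;
  }
  // Tab navigated mid-session: update the domain silently, no event.
  if (existing.domain !== domain) existing.domain = domain;
  return existing;
}

// Hypothetical wrapper covering the blocklist check plus ensureSession.
function precheckAction(
  sessions: Map<number, ActiveSession>,
  blocklist: ReadonlySet<string>,
  tabId: number,
  domain: string,
  now: number,
  emit: (event: SessionEvent) => void,
): PrecheckResult {
  if (isDomainBlocked(domain, blocklist)) return { ok: false, error: "domain_blocked" };
  return { ok: true, session: ensureSession(sessions, tabId, domain, now, emit) };
}
```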

The whole “consent” step from v3 is gone. Authorization is implicit; accountability is in the audit log.

`open_tab` (agent-initiated)

The agent calls browser({ action: { type: "open_tab", url: "https://…" } }). The dispatcher:

Checks the domain blocklist (same rule as auto-create).
Creates a new tab in the BUA window (default: minimized, unfocused, hidden from the user). Passing focus: true overrides this and opens the tab in the user’s focused window; use it sparingly, only when the user explicitly asked to “show me this”.
Calls startSession(domain, newTabId) → fires session_started WS event.
Returns { tabId, windowId, domain } — subagent keeps tabId for all follow-up actions.

The BUA window

The BUA window is a dedicated Chrome window the extension creates for agent-initiated browsing. It exists so the agent can act on logged-in pages without disturbing the user’s main browser.

Invariants:

Hidden by default — created with chrome.windows.create({ focused: false, state: "minimized" }). Never steals focus; the user opts in to watching via the popup’s Show button.
Lazy creation — only spun up on the first agent-initiated open_tab. If the agent never opens its own tab, no BUA window exists.
Auto-collapse — when the last BUA-owned tab closes, the window is removed so abandoned minimized windows don’t pile up.
Id persisted — the window id lives in chrome.storage.local (buaWindowId) to survive service-worker restarts; a chrome.windows.onRemoved listener forgets it if the user manually closes the window.
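The lazy-creation and auto-collapse invariants can be sketched with the window API injected as a small interface. WindowsApi and BuaWindowManager are hypothetical names; in the extension the two methods would presumably wrap chrome.windows.create and chrome.windows.remove. Persisting buaWindowId and handling creation failures are omitted here.

```typescript
// Hypothetical slice of the window API, injected so the sketch runs without Chrome.
interface WindowsApi {
  create(options: { focused: boolean; state: "minimized" }): Promise<{ id: number }>;
  remove(windowId: number): Promise<void>;
}

class BuaWindowManager {
  private readonly api: WindowsApi;
  private windowId: Promise<number> | undefined;
  private readonly ownedTabs = new Set<number>();

  constructor(api: WindowsApi) {
    this.api = api;
  }

  // Lazy creation: the window is created on first use, minimized and unfocused.
  // Caching the promise keeps two concurrent open_tab calls from creating two windows.
  ensureWindow(): Promise<number> {
    if (!this.windowId) {
      this.windowId = this.api
        .create({ focused: false, state: "minimized" })
        .then((created) => created.id);
    }
    return this.windowId;
  }

  registerTab(tabId: number): void {
    this.ownedTabs.add(tabId);
  }

  // Auto-collapse: remove the window once its last BUA-owned tab is gone.
  async onTabClosed(tabId: number): Promise<void> {
    if (!this.ownedTabs.delete(tabId)) return; // not one of ours
    if (this.ownedTabs.size > 0 || !this.windowId) return;
    const pending = this.windowId;
    this.windowId = undefined;
    await this.api.remove(await pending);
  }

  // The user closed the window manually: forget the id and its tabs.
  async onWindowRemoved(windowId: number): Promise<void> {
    if (this.windowId && (await this.windowId) === windowId) {
      this.windowId = undefined;
      this.ownedTabs.clear();
    }
  }
}
```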

Routing rules:

Agent open_tab → BUA window, tab opened with active: false, session auto-created.
Agent open_tab({ focus: true }) → user’s focused window with active: true. Use only when the user explicitly asked to see the result.

How sessions end

Seven termination reasons, one cleanup path. All paths converge on endSessionByTab(tabId, reason) or endAllSessions(reason), which fires a SessionEvent { type: "end" } that is auto-forwarded as a session_ended WS event.

Reason	Trigger
`tab_closed`	`chrome.tabs.onRemoved` fires — the normal one-off task path
`user_stopped`	Popup per-session “Stop now”, OR Chrome yellow-bar “Cancel” → `onDetach`
`system_idle`	`chrome.idle` reports 30 min of no input machine-wide
`screen_locked`	`chrome.idle` reports the screen was explicitly locked
`domain_blocked`	User added this domain to the blocklist → `endAllSessionsForDomain` fans out
`global_stop`	Popup red “Stop all” → `endAllSessions` fans out
`extension_reload`	SW was unloaded / extension updated; not usually surfaced — sessions reconcile

Every termination reason shares the same cleanup side-effects: detach chrome.debugger (idempotent), close any BUA-owned host tab so abandoned minimized tabs don’t pile up, collapse the BUA window if it is now empty, and reject any in-flight actions on that tab with session_not_found.

The two user-presence reasons — system_idle and screen_locked — are split deliberately rather than collapsed into one. They represent different security postures and deserve distinct audit entries:

screen_locked: the user actively secured the machine. Agent control that continued after the lock would run under a different trust context than the one in which the agent was started, so the extension revokes immediately when chrome.idle.onStateChanged reports "locked".
system_idle: no input observed on the whole machine for the detection interval (30 min by default). This is a softer signal: the user may have stepped away, or may be reading something without typing. Revocation bounds the “forgotten session” risk without being aggressive about short pauses.

Both fire via the same idle-session-guard module; both take the same cleanup path as any other termination; the audit log preserves which one fired so oncall / UI can surface the specific remediation hint.
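A sketch of how the idle-session-guard might map idle states to the two reasons. The three state strings are the ones chrome.idle.onStateChanged delivers, and chrome.idle.setDetectionInterval takes its interval in seconds. The function names, and the choice to end all sessions at once, are assumptions of this example.

```typescript
// States delivered to a chrome.idle.onStateChanged listener.
type IdleState = "active" | "idle" | "locked";

type PresenceReason = "system_idle" | "screen_locked";

// 30 minutes, expressed in seconds as chrome.idle.setDetectionInterval expects.
const IDLE_DETECTION_SECONDS = 30 * 60;

// Keep the two signals distinct so the audit log records which one fired.
function presenceReasonFor(state: IdleState): PresenceReason | undefined {
  if (state === "locked") return "screen_locked";
  if (state === "idle") return "system_idle";
  return undefined; // "active": nothing to revoke
}

// Hypothetical handler body; ending every session at once is an assumption.
function handleIdleStateChange(
  state: IdleState,
  endAllSessions: (reason: PresenceReason) => void,
): void {
  const reason = presenceReasonFor(state);
  if (reason) endAllSessions(reason);
}
```

In the extension, the handler would be registered as the chrome.idle.onStateChanged listener after calling chrome.idle.setDetectionInterval(IDLE_DETECTION_SECONDS).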

Debugger-detach bug class, now closed

chrome.debugger.onDetach can fire for reasons outside extension control (user clicks yellow-bar Cancel, Chrome revokes the attachment, DevTools takes over). In v3, handleDebuggerDetached called endSessionByTab but did NOT clear the extension’s in-memory attached: Set<number>. A subsequent attach() attempt would see the stale set, early-return as “already attached”, and all CDP commands would silently hang. The v4 implementation always calls debuggerController.detach(tabId) first on this path — detach() is idempotent, so double-calling is safe.
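The fix can be illustrated with a small controller whose detach() clears local state before doing anything else. DebuggerApi is a hypothetical injected interface standing in for chrome.debugger attach and detach. The shape of DebuggerController and the handler signature are assumptions, not the actual v4 code.

```typescript
// Hypothetical slice of the debugger API, injected so the sketch runs without Chrome.
interface DebuggerApi {
  attach(tabId: number): Promise<void>;
  detach(tabId: number): Promise<void>;
}

class DebuggerController {
  private readonly api: DebuggerApi;
  private readonly attached = new Set<number>();

  constructor(api: DebuggerApi) {
    this.api = api;
  }

  async attach(tabId: number): Promise<void> {
    if (this.attached.has(tabId)) return; // early return is only safe if the set is accurate
    await this.api.attach(tabId);
    this.attached.add(tabId);
  }

  // Idempotent: clear local state first, then tolerate "not attached" from the browser.
  async detach(tabId: number): Promise<void> {
    if (!this.attached.delete(tabId)) return;
    try {
      await this.api.detach(tabId);
    } catch {
      // The browser already detached (for example after the yellow-bar Cancel).
    }
  }

  isAttached(tabId: number): boolean {
    return this.attached.has(tabId);
  }
}

// v4 ordering: clear the stale entry first, then end the session.
async function handleDebuggerDetached(
  controller: DebuggerController,
  tabId: number,
  endSessionByTab: (tabId: number, reason: string) => void,
): Promise<void> {
  await controller.detach(tabId);
  endSessionByTab(tabId, "user_stopped");
}
```

Because the set is cleared on the onDetach path, a later attach() on the same tab goes back to the browser instead of returning early on stale state.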

User-visible surfaces

Two surfaces keep the user in control at all times: Chrome’s native debugger bar (system-level, not suppressible by the extension) and the extension popup (activity monitor + global kill). Both trigger the same cleanup path in the background.

The popup’s red “Stop all” button ends every active session on every tab at once and fires a global_stop WS event so the backend can short-circuit any in-flight requests even before the per-session session_ended events arrive.

Per-session Stop buttons remain available in the popup’s session list for surgical revocation. Both paths go through sessionManager.endSessionByTab / endAllSessions → same SessionEvent bus → same WS forwarding.

No confirm step — the user already decided.

Chrome’s debugger bar

Whenever the extension attaches chrome.debugger to a tab, Chrome injects a yellow banner that reads “Zapvol Browser Bridge is debugging this browser” with a Cancel button. The extension cannot suppress it: Chrome intentionally makes it loud to warn users that a debugger is reading and writing inputs.

Clicking Cancel triggers chrome.debugger.onDetach, which the extension handles by first clearing the in-memory attached state and then ending the session with reason: "user_stopped" — same cleanup path as popup’s Stop.

Any tool that attaches chrome.debugger to the user’s own profile, extension-based automation in particular, deals with the same banner. It is the cost of doing real-input automation on the user’s own profile. For internal deployments, IT can hide it by launching Chrome with the --silent-debugger-extension-api flag; this is a deployment concern, not an extension one.

Error-handling contract (agent side)

The agent treats each of the following as terminal for the current action. It never retries silently; it retries only where the table below says so, and only as described there:

Error code	What it means	What the agent should do
`domain_blocked`	Target domain is on the blocklist	Stop this plan; do not retry on a different tab or route through a different action
`session_not_found`	No active session on resolved target, or multi-session ambiguity	Pass `tabId` explicitly; otherwise the extension refuses to guess
`tab_not_found`	Tab was closed	Stop or try another tab (`get_tabs` first)
`element_not_found`	`selector` or live `uid` matched nothing on the current DOM	Re-`extract` the page; pick a different target; don’t retry the same one
`element_stale`	Supplied `uid` no longer cached (page navigated, cache cleared)	Call `extract` again; retry with a fresh uid
`timeout`	Action did not complete within the action-specific window	If transient, retry once after `wait_for`; otherwise stop
`debugger_attach_failed`	DevTools / another debugger busy	Ask user to close DevTools on the tab; don’t retry
`invalid_action`	Schema / unknown action, OR JS error thrown inside `evaluate`	Inspect `message`, fix the request; don’t retry blindly
`internal_error`	Unexpected failure	Stop; report the error message

This contract is encoded in the browser tool’s prompt (see packages/backend/src/agent/tools/browser.tool.ts), so the model sees these handling rules every time the tool is loaded.
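For illustration, the same contract could be expressed as a typed lookup on the agent side. The error codes are taken from the table above; the NextStep labels and nextStepFor are hypothetical and are not the contents of browser.tool.ts.

```typescript
type BrowserErrorCode =
  | "domain_blocked"
  | "session_not_found"
  | "tab_not_found"
  | "element_not_found"
  | "element_stale"
  | "timeout"
  | "debugger_attach_failed"
  | "invalid_action"
  | "internal_error";

// Hypothetical labels summarising the "what the agent should do" column.
type NextStep =
  | "stop_plan"
  | "retry_with_explicit_tab_id"
  | "list_tabs_or_stop"
  | "re_extract_and_pick_new_target"
  | "re_extract_and_retry_with_fresh_uid"
  | "wait_then_retry_once"
  | "ask_user_to_close_devtools"
  | "fix_request"
  | "stop_and_report";

const NEXT_STEP: Record<BrowserErrorCode, NextStep> = {
  domain_blocked: "stop_plan",
  session_not_found: "retry_with_explicit_tab_id",
  tab_not_found: "list_tabs_or_stop",
  element_not_found: "re_extract_and_pick_new_target",
  element_stale: "re_extract_and_retry_with_fresh_uid",
  timeout: "wait_then_retry_once",
  debugger_attach_failed: "ask_user_to_close_devtools",
  invalid_action: "fix_request",
  internal_error: "stop_and_report",
};

// A timeout is retried at most once; after that the plan stops.
function nextStepFor(code: BrowserErrorCode, timeoutRetriesUsed: number): NextStep {
  if (code === "timeout" && timeoutRetriesUsed >= 1) return "stop_plan";
  return NEXT_STEP[code];
}
```

Typing the table as Record<BrowserErrorCode, NextStep> makes the compiler reject a new error code that has no handling rule.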

Audit log

Every session start and end is appended to chrome.storage.local under the sessionHistory key, capped at the most recent 1000 entries. The Options page renders the log as a reverse-chronological table with:

When — timestamp
Event — START or END (color-coded)
Domain + tabId
Detail — for END: reason, duration, action count

Users can clear the log with a single button (with a confirm dialog). The log is stored locally, but the session_started / session_ended / domain_blocked / global_stop / tab_closed WS events also flow to the backend for operational observability.

Design note: the audit module (session-history.ts) subscribes to a session-manager event bus (onSessionEvent) rather than being called explicitly from every endpoint. This keeps session-manager.ts free of audit concerns — if the log format changes, only one module is touched. Every termination path flows through the same bus, so the log captures every path automatically, and a single WS-forwarder subscription in background.ts keeps the server’s view consistent too.
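The bus-plus-subscribers arrangement can be sketched as follows. onSessionEvent and the 1000-entry cap come from this document. The event payload fields, emitSessionEvent, appendCapped, and the in-memory arrays standing in for chrome.storage.local and the WebSocket are assumptions.

```typescript
// Assumed payload shape; the document only fixes the "start" / "end" types.
type SessionEvent =
  | { type: "start"; tabId: number; domain: string; at: number }
  | {
      type: "end";
      tabId: number;
      domain: string;
      at: number;
      reason: string;
      durationMs: number;
      actionCount: number;
    };

type Listener = (event: SessionEvent) => void;

const listeners = new Set<Listener>();

// Subscribe to session events; returns an unsubscribe function.
function onSessionEvent(listener: Listener): () => void {
  listeners.add(listener);
  return () => {
    listeners.delete(listener);
  };
}

// Hypothetical emit side, called from the session manager's start/end paths.
function emitSessionEvent(event: SessionEvent): void {
  for (const listener of listeners) listener(event);
}

const MAX_HISTORY_ENTRIES = 1000;

// Append and keep only the most recent `max` entries.
function appendCapped<T>(log: readonly T[], entry: T, max: number = MAX_HISTORY_ENTRIES): T[] {
  const next = [...log, entry];
  return next.length > max ? next.slice(next.length - max) : next;
}

// Subscriber 1: the audit log (an array stands in for the sessionHistory key).
let sessionHistory: SessionEvent[] = [];
onSessionEvent((event) => {
  sessionHistory = appendCapped(sessionHistory, event);
});

// Subscriber 2: the WS forwarder (an array stands in for the socket).
const forwardedToBackend: string[] = [];
onSessionEvent((event) => {
  forwardedToBackend.push(event.type === "start" ? "session_started" : "session_ended");
});
```

Neither subscriber knows about the other, and the emitting side knows about neither, which is the decoupling the design note describes.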