Session Model
UX-first session model for an internal tool — bookkeeping, not consent. Auto-creation on first action, tabId as primary key, multi-session semantics, and the lifecycle events that terminate a session.
Why a session model
BUA is an internal tool. Sessions are bookkeeping for debugger-attach lifecycle + audit, not consent tokens. There is no TTL, no approval step, no “Trust this site” flow. A session exists on a tab from the moment the agent first acts there, and ends when the lifecycle says so (tab closed, user idle, user-stopped, blocklisted).
The extension still enforces one hard boundary: a domain blocklist, checked before every action. Everything else is recorded (audit log) but not gated.
Session shape
interface ActiveSession {
  tabId: number; // primary key
  domain: string; // current domain (updated silently on navigation)
  startedAt: number; // epoch ms of creation
  lastActionAt: number; // epoch ms of most recent action dispatch
  actionCount: number;
}
Persisted in chrome.storage.local under activeSessions. Writes are serialized via an in-memory promise chain
(withSessionWriteLock) — chrome.storage.local has no CAS, so concurrent mutations would clobber each other without
the lock.
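A minimal sketch of the promise-chain lock described above. The name withSessionWriteLock comes from this document; `fakeStore` and `bumpActionCount` are hypothetical stand-ins so the example runs outside an extension, and the real helper may differ.

```typescript
type Sessions = Record<number, { tabId: number; actionCount: number }>;

// Hypothetical stand-in for chrome.storage.local: async get/set, no compare-and-swap.
const fakeStore = {
  data: {} as Sessions,
  async get(): Promise<Sessions> {
    return { ...this.data };
  },
  async set(next: Sessions): Promise<void> {
    this.data = next;
  },
};

// Tail of the chain: every writer starts only after the previous one has settled.
let chain: Promise<unknown> = Promise.resolve();

function withSessionWriteLock<T>(fn: () => Promise<T>): Promise<T> {
  const run = chain.then(fn);
  // Swallow rejections on the chain itself so one failed write does not block later writers.
  chain = run.catch(() => undefined);
  return run;
}

// Read-modify-write cycle that would lose updates if two copies interleaved.
async function bumpActionCount(tabId: number): Promise<void> {
  await withSessionWriteLock(async () => {
    const sessions = await fakeStore.get();
    const current = sessions[tabId] ?? { tabId, actionCount: 0 };
    sessions[tabId] = { ...current, actionCount: current.actionCount + 1 };
    await fakeStore.set(sessions);
  });
}
```

Calling `bumpActionCount` several times concurrently for the same tab applies each increment in turn, because every read-modify-write cycle waits for the previous one to finish.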
Session identity
Sessions are keyed by tabId, not by domain. A single domain can host multiple concurrent sessions when the site
is open on more than one tab. This is intentional: it lets the agent work on its own isolated tab (in the BUA window)
while the user continues browsing the same domain on their own tab.
CDP commands target a specific tabId, so actions on one session never bleed into the other. The popup lists every active session with its tabId so the user can stop one without affecting siblings on the same domain.
Disambiguation when the agent omits tabId
Most actions (click, type, extract, …) accept an optional tabId. If it’s omitted, the extension picks in this
order:
- Any BUA-owned session (the agent’s own tabs are always safe targets).
- A user-owned session only if it’s the one and only candidate.
- Otherwise, fails with session_not_found, listing the candidates so the agent retries with an explicit tabId.
Prompt guidance tells the agent to track tabId from open_tab’s return value and pass it on every follow-up.
Omitting tabId is a convenience for the single-session case, not a guessing game.
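The selection order above can be sketched as a pure function. The `buaOwned` flag, the `Candidate` and `Resolution` types, and the choice of the first BUA-owned session when several exist are assumptions of this example, not details confirmed by the extension.

```typescript
interface Candidate {
  tabId: number;
  domain: string;
  buaOwned: boolean; // hypothetical flag: true when the tab lives in the BUA window
}

type Resolution =
  | { ok: true; tabId: number }
  | { ok: false; code: "session_not_found"; candidates: number[] };

function pickSessionWhenTabIdOmitted(sessions: Candidate[]): Resolution {
  // 1. Any BUA-owned session wins. Taking the first one is an assumption of this sketch.
  const owned = sessions.find((s) => s.buaOwned);
  if (owned !== undefined) return { ok: true, tabId: owned.tabId };

  // 2. A user-owned session is used only when it is the sole candidate.
  const [only] = sessions;
  if (sessions.length === 1 && only !== undefined) {
    return { ok: true, tabId: only.tabId };
  }

  // 3. Zero or several user-owned candidates: refuse to guess and list them.
  return {
    ok: false,
    code: "session_not_found",
    candidates: sessions.map((s) => s.tabId),
  };
}
```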
How sessions are created
Two paths, both without user approval:
| Path | How | Landing tab |
|---|---|---|
| Auto: first action | Agent sends any action for a tab that has no session yet | That tab (wherever it lives) |
| Explicit: open_tab | Browser subagent calls open_tab(url) | New tab in the BUA window by default |
Auto-creation on first action
When the agent dispatches an action for tabId=T on domain=D, the action-dispatcher flow is:
- Resolve tabId + domain (from the action payload + live tab state).
- Check isDomainBlocked(domain): if hit, return domain_blocked (terminal), fire domain_blocked WS event, no session created.
- Call ensureSession(tabId, domain):
  - No session yet → startSession + SessionEvent { type: "start" } → fires session_started WS event.
  - Session exists for same domain → return existing.
  - Session exists for different domain (tab navigated mid-session) → silently update session.domain to the new value, no event, no WS churn. Audit log records every action with current URL anyway.
- Acquire per-tab lock.
- Attach chrome.debugger (if not already).
- Execute CDP action.
- recordAction(tabId): bump actionCount + lastActionAt.
The whole “consent” step from v3 is gone. Authorization is implicit; accountability is in the audit log.
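The session-related steps of this flow can be sketched with an in-memory map. The function names follow the steps above; the exact-hostname blocklist match, the `emitted` array standing in for the event bus, and the synchronous `dispatch` wrapper are assumptions made so the example runs on its own.

```typescript
interface ActiveSession {
  tabId: number;
  domain: string;
  startedAt: number;
  lastActionAt: number;
  actionCount: number;
}

const sessions = new Map<number, ActiveSession>();
const emitted: Array<{ type: "start"; tabId: number; domain: string }> = [];
const blocklist = new Set<string>(["blocked.example"]); // hypothetical entry

function isDomainBlocked(domain: string): boolean {
  return blocklist.has(domain); // exact match only; real matching rules may differ
}

function ensureSession(tabId: number, domain: string, now = Date.now()): ActiveSession {
  const existing = sessions.get(tabId);
  if (existing === undefined) {
    // No session yet: create one and announce it.
    const created: ActiveSession = {
      tabId, domain, startedAt: now, lastActionAt: now, actionCount: 0,
    };
    sessions.set(tabId, created);
    emitted.push({ type: "start", tabId, domain });
    return created;
  }
  // Tab navigated mid-session: update the domain silently, no event.
  if (existing.domain !== domain) existing.domain = domain;
  return existing;
}

function recordAction(tabId: number, now = Date.now()): void {
  const session = sessions.get(tabId);
  if (session === undefined) return;
  session.actionCount += 1;
  session.lastActionAt = now;
}

function dispatch(tabId: number, domain: string): { ok: true } | { ok: false; code: "domain_blocked" } {
  // The blocklist check runs before any session is created.
  if (isDomainBlocked(domain)) return { ok: false, code: "domain_blocked" };
  ensureSession(tabId, domain);
  // The per-tab lock, chrome.debugger attach, and CDP call would go here.
  recordAction(tabId);
  return { ok: true };
}
```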
open_tab (agent-initiated)
Agent calls browser({ action: { type: "open_tab", url: "https://…" } }). The dispatcher:
- Checks the domain blocklist (same rule as auto-create).
- Creates a new tab in the BUA window (default: minimized, unfocused, hidden from user). focus: true overrides to put the tab in the user’s focused window; use sparingly, only when the user explicitly asked to “show me this”.
- Calls startSession(domain, newTabId) → fires session_started WS event.
- Returns { tabId, windowId, domain }; the subagent keeps tabId for all follow-up actions.
The BUA window
The BUA window is a dedicated Chrome window the extension creates for agent-initiated browsing. It exists so the agent can act on logged-in pages without disturbing the user’s main browser.
Invariants:
- Hidden by default: created with chrome.windows.create({ focused: false, state: "minimized" }). Never steals focus; the user opts in to watching via the popup’s Show button.
- Lazy creation: only spun up on the first agent-initiated open_tab. If the agent never opens its own tab, no BUA window exists.
- Auto-collapse: when the last BUA-owned tab closes, the window is removed so abandoned minimized windows don’t pile up.
- Id persisted: the window id lives in chrome.storage.local (buaWindowId) to survive service-worker restarts; a chrome.windows.onRemoved listener forgets it if the user manually closes the window.
Routing rules:
- Agent open_tab → BUA window, tab opened with active: false, session auto-created.
- Agent open_tab({ focus: true }) → user’s focused window with active: true. Use only when the user explicitly asked to see the result.
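The lazy-creation and auto-collapse invariants can be sketched as a small class. The `BuaWindow` class and the `WindowsApi` interface are hypothetical; the interface only mirrors the two chrome.windows calls the sketch needs so it can run against a fake. Persisting buaWindowId to chrome.storage.local and guarding against concurrent `ensure()` calls are left out.

```typescript
// Minimal slice of the windows API used here, injected so the sketch is testable.
interface WindowsApi {
  create(opts: { focused: boolean; state: "minimized" }): Promise<{ id?: number }>;
  remove(windowId: number): Promise<void>;
}

class BuaWindow {
  private windowId: number | undefined;
  private readonly tabIds = new Set<number>();
  private readonly api: WindowsApi;

  constructor(api: WindowsApi) {
    this.api = api;
  }

  // Lazy creation: the window is only created on the first call.
  async ensure(): Promise<number> {
    if (this.windowId === undefined) {
      const win = await this.api.create({ focused: false, state: "minimized" });
      if (win.id === undefined) throw new Error("created window has no id");
      this.windowId = win.id;
    }
    return this.windowId;
  }

  trackTab(tabId: number): void {
    this.tabIds.add(tabId);
  }

  // Auto-collapse: remove the window once its last tracked tab is gone.
  async onTabClosed(tabId: number): Promise<void> {
    this.tabIds.delete(tabId);
    if (this.tabIds.size === 0 && this.windowId !== undefined) {
      const id = this.windowId;
      this.windowId = undefined;
      await this.api.remove(id);
    }
  }

  // Forget the id when the user closes the window manually.
  onWindowRemoved(windowId: number): void {
    if (windowId === this.windowId) {
      this.windowId = undefined;
      this.tabIds.clear();
    }
  }
}
```

In the extension, `onWindowRemoved` would be registered with chrome.windows.onRemoved and the real chrome.windows object would be passed as the API.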
How sessions end
Seven termination reasons, one cleanup path. All paths converge on endSessionByTab(tabId, reason) or
endAllSessions(reason), which fires a SessionEvent { type: "end" } → auto-forwarded as session_ended WS event.
| Reason | Trigger |
|---|---|
| tab_closed | chrome.tabs.onRemoved fires (the normal one-off task path) |
| user_stopped | Popup per-session “Stop now”, OR Chrome yellow-bar “Cancel” → onDetach |
| system_idle | chrome.idle reports 30 min of no input machine-wide |
| screen_locked | chrome.idle reports the screen was explicitly locked |
| domain_blocked | User added this domain to the blocklist → endAllSessionsForDomain fans out |
| global_stop | Popup red “Stop all” → endAllSessions fans out |
| extension_reload | SW was unloaded / extension updated; not usually surfaced, sessions reconcile |
All of them share the same cleanup side-effects: detach chrome.debugger (idempotent), close any BUA-owned host tab so
abandoned minimized tabs don’t pile up, collapse the BUA window if it’s now empty, reject any in-flight actions on
that tab with session_not_found.
The two user-presence reasons — system_idle and screen_locked — are split deliberately rather than collapsed
into one. They represent different security postures and deserve distinct audit entries:
- screen_locked: the user actively secured the machine. Ongoing agent control under the locked state would be running under a different trust context than when the agent was started; the extension revokes immediately when chrome.idle.onStateChanged reports "locked".
- system_idle: no input observed on the whole machine for the detection interval (30 min default). Softer signal: the user may have stepped away, or be reading something without typing. Revocation bounds “forgotten session” risk without being aggressive about short pauses.
Both fire via the same idle-session-guard module; both take the same cleanup path as any other termination; the
audit log preserves which one fired so oncall / UI can surface the specific remediation hint.
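A minimal sketch of how an idle guard might map chrome.idle states to the two reasons. The three state names are the ones chrome.idle reports; `reasonForIdleState`, `makeIdleGuard`, and the choice to end all sessions on either signal are assumptions of this example rather than the module's actual code.

```typescript
type IdleState = "active" | "idle" | "locked"; // states reported by chrome.idle
type PresenceReason = "system_idle" | "screen_locked";

// Keep the two reasons distinct so the audit log can tell them apart.
function reasonForIdleState(state: IdleState): PresenceReason | null {
  switch (state) {
    case "locked":
      return "screen_locked";
    case "idle":
      return "system_idle";
    case "active":
      return null; // user is present, nothing to revoke
  }
}

// Builds the listener; endAllSessions is injected so the guard stays testable.
function makeIdleGuard(endAllSessions: (reason: PresenceReason) => void) {
  return (state: IdleState): void => {
    const reason = reasonForIdleState(state);
    if (reason !== null) endAllSessions(reason);
  };
}

// Wiring in the service worker (not executed in this sketch):
//   chrome.idle.setDetectionInterval(30 * 60); // the interval is given in seconds
//   chrome.idle.onStateChanged.addListener(makeIdleGuard(endAllSessions));
```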
Debugger-detach bug class, now closed
chrome.debugger.onDetach can fire for reasons outside extension control (user clicks yellow-bar Cancel, Chrome
revokes the attachment, DevTools takes over). In v3, handleDebuggerDetached called endSessionByTab but did NOT
clear the extension’s in-memory attached: Set<number>. A subsequent attach() attempt would see the stale set,
early-return as “already attached”, and all CDP commands would silently hang. The v4 implementation always calls
debuggerController.detach(tabId) first on this path — detach() is idempotent, so double-calling is safe.
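The fix can be illustrated with a stripped-down controller. This is a sketch under assumptions: the class tracks only the in-memory set, `attachCalls` exists purely to make the behavior observable, and the real chrome.debugger.attach and chrome.debugger.detach calls are represented by comments.

```typescript
class DebuggerController {
  private readonly attached = new Set<number>();
  attachCalls = 0; // counts real attach attempts, for illustration only

  attach(tabId: number): void {
    // Early return when the set says the tab is already attached.
    // With a stale entry, this is where v3 silently skipped the attach.
    if (this.attached.has(tabId)) return;
    this.attachCalls += 1; // the real code would call chrome.debugger.attach here
    this.attached.add(tabId);
  }

  detach(tabId: number): void {
    // Idempotent: deleting an id that is not in the set is a no-op.
    // The real code would also call chrome.debugger.detach here.
    this.attached.delete(tabId);
  }
}

const ended: Array<{ tabId: number; reason: string }> = [];

// Stand-in for the session manager's cleanup entry point.
function endSessionByTab(tabId: number, reason: string): void {
  ended.push({ tabId, reason });
}

// onDetach handler: clear the in-memory state first, then end the session.
function handleDebuggerDetached(controller: DebuggerController, tabId: number): void {
  controller.detach(tabId);
  endSessionByTab(tabId, "user_stopped");
}
```

After `handleDebuggerDetached` runs, a later `attach()` on the same tab performs a real attach again instead of returning early on stale state.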
User-visible surfaces
Two surfaces keep the user in control at all times: Chrome’s native debugger bar (system-level, non-suppressible) and the extension popup (activity monitor + global kill). Both trigger the same cleanup path in the background.
Popup — Stop all
The popup’s red “Stop all” button ends every active session on every tab at once and fires a global_stop WS
event so the backend can short-circuit any in-flight requests even before the per-session session_ended events
arrive.
Per-session Stop buttons remain available in the popup’s session list for surgical revocation. Both paths go through
sessionManager.endSessionByTab / endAllSessions → same SessionEvent bus → same WS forwarding.
No confirm step — the user already decided.
Chrome’s debugger bar
Whenever the extension attaches chrome.debugger to a tab, Chrome injects a yellow banner that reads “Zapvol
Browser Bridge is debugging this browser” with a Cancel button. This is not suppressible — Chrome
intentionally makes it loud to warn users that a debugger is reading / writing inputs.
Clicking Cancel triggers chrome.debugger.onDetach, which the extension handles by first clearing the in-memory
attached state and then ending the session with reason: "user_stopped" — same cleanup path as popup’s Stop.
Every BUA tool in the industry with local-browser access (Anthropic’s computer-use demo, Browser Use’s local mode,
any extension-based automation) deals with the same banner. It is the cost of doing real-input automation on the
user’s own profile. For internal deployments, IT can hide it via Chrome’s --silent-debugger-extension-api launch
flag at the OS level; this is a deployment concern, not an extension one.
Error-handling contract (agent side)
The agent treats the following as terminal for the current action — no silent retries:
| Error code | What it means | What the agent should do |
|---|---|---|
| domain_blocked | Target domain is on the blocklist | Stop this plan; do not retry on a different tab or route through a different action |
| session_not_found | No active session on resolved target, or multi-session ambiguity | Pass tabId explicitly; otherwise the extension refuses to guess |
| tab_not_found | Tab was closed | Stop or try another tab (get_tabs first) |
| element_not_found | selector or live uid matched nothing on the current DOM | Re-extract the page; pick a different target; don’t retry the same one |
| element_stale | Supplied uid no longer cached (page navigated, cache cleared) | Call extract again; retry with a fresh uid |
| timeout | Action did not complete within the action-specific window | If transient, retry once after wait_for; otherwise stop |
| debugger_attach_failed | DevTools / another debugger busy | Ask user to close DevTools on the tab; don’t retry |
| invalid_action | Schema / unknown action, OR JS error thrown inside evaluate | Inspect message, fix the request; don’t retry blindly |
| internal_error | Unexpected failure | Stop; report the error message |
This contract is encoded in the browser tool’s prompt (see packages/backend/src/agent/tools/browser.tool.ts), so
the model sees these handling rules every time the tool is loaded.
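For callers that want the same contract in code rather than in a prompt, the table could be restated as a typed lookup. The error codes are the ones listed above; the `AgentStep` labels and the `nextStep` helper are hypothetical simplifications, and this is not how browser.tool.ts encodes the rules.

```typescript
type BrowserErrorCode =
  | "domain_blocked"
  | "session_not_found"
  | "tab_not_found"
  | "element_not_found"
  | "element_stale"
  | "timeout"
  | "debugger_attach_failed"
  | "invalid_action"
  | "internal_error";

// Coarse labels for the "what the agent should do" column (hypothetical names).
type AgentStep =
  | "stop"
  | "pass_tab_id"
  | "re_extract"
  | "retry_once"
  | "ask_user"
  | "fix_request";

// Record<BrowserErrorCode, ...> makes the compiler reject a missing or misspelled code.
const NEXT_STEP: Record<BrowserErrorCode, AgentStep> = {
  domain_blocked: "stop",
  session_not_found: "pass_tab_id",
  tab_not_found: "stop", // or try another tab after get_tabs
  element_not_found: "re_extract",
  element_stale: "re_extract",
  timeout: "retry_once", // only if transient, and only after wait_for
  debugger_attach_failed: "ask_user",
  invalid_action: "fix_request",
  internal_error: "stop",
};

function nextStep(code: BrowserErrorCode): AgentStep {
  return NEXT_STEP[code];
}
```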
Audit log
Every session start and end is appended to chrome.storage.local under the sessionHistory key, capped at the
most recent 1000 entries. The Options page renders the log as a reverse-chronological table with:
- When: timestamp
- Event: START or END (color-coded)
- Domain + tabId
- Detail: for END, the reason, duration, and action count
Users can clear the log with a single button (confirm dialog included). The log lives primarily locally, but
session_started / session_ended / domain_blocked / global_stop / tab_closed WS events also flow to the
backend for operational observability.
Design note: the audit module (session-history.ts) subscribes to a session-manager event bus (onSessionEvent)
rather than being called explicitly from every endpoint. This keeps session-manager.ts free of audit concerns — if
the log format changes, only one module is touched. Every termination path flows through the same bus, so the log
captures every path automatically, and a single WS-forwarder subscription in background.ts keeps the server’s view
consistent too.
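A minimal sketch of this subscription pattern, kept in memory so it runs on its own. onSessionEvent and the 1000-entry cap come from this document; the `emit` helper, the event field names, and the `forwarded` array standing in for the WebSocket are assumptions.

```typescript
type SessionEvent =
  | { type: "start"; tabId: number; domain: string; at: number }
  | { type: "end"; tabId: number; domain: string; at: number; reason: string };

type Listener = (event: SessionEvent) => void;
const listeners = new Set<Listener>();

// Subscribe to session events; returns an unsubscribe function.
function onSessionEvent(listener: Listener): () => void {
  listeners.add(listener);
  return () => {
    listeners.delete(listener);
  };
}

// Called by the session manager on every start and end (hypothetical helper).
function emit(event: SessionEvent): void {
  listeners.forEach((listener) => listener(event));
}

// Audit subscriber: append, then trim to the most recent MAX_HISTORY entries.
const MAX_HISTORY = 1000;
const sessionHistory: SessionEvent[] = [];
onSessionEvent((event) => {
  sessionHistory.push(event);
  if (sessionHistory.length > MAX_HISTORY) {
    sessionHistory.splice(0, sessionHistory.length - MAX_HISTORY);
  }
});

// WS-forwarder subscriber: maps bus events to the WS event names.
const forwarded: string[] = [];
onSessionEvent((event) => {
  forwarded.push(event.type === "start" ? "session_started" : "session_ended");
});
```

Because both subscribers hang off the same bus, a new termination path only has to emit its event to be both logged and forwarded.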