Session Model

UX-first session model for an internal tool — bookkeeping, not consent. Auto-creation on first action, tabId as primary key, multi-session semantics, and the lifecycle events that terminate a session.

Why a session model

BUA is an internal tool. Sessions are bookkeeping for debugger-attach lifecycle + audit, not consent tokens. There is no TTL, no approval step, no “Trust this site” flow. A session exists on a tab from the moment the agent first acts there, and ends when the lifecycle says so (tab closed, user idle, user-stopped, blocklisted).

The extension still enforces one hard boundary: a domain blocklist, checked before every action. Everything else is recorded (audit log) but not gated.

Session shape

interface ActiveSession {
  tabId: number;        // primary key
  domain: string;       // current domain (updated silently on navigation)
  startedAt: number;    // epoch ms of creation
  lastActionAt: number; // epoch ms of most recent action dispatch
  actionCount: number;
}

Persisted in chrome.storage.local under activeSessions. Writes are serialized via an in-memory promise chain (withSessionWriteLock): chrome.storage.local offers no compare-and-swap (CAS) primitive, so concurrent read-modify-write mutations would clobber each other without the lock.
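A minimal sketch of promise-chain serialization follows. Only the name withSessionWriteLock comes from this page; its signature, the SessionStore abstraction, and the in-memory store are assumptions made so the example runs outside the extension.

```typescript
// Sketch only: the real withSessionWriteLock signature is not shown on this page.
interface ActiveSession {
  tabId: number;
  domain: string;
  startedAt: number;
  lastActionAt: number;
  actionCount: number;
}

type SessionMap = Record<number, ActiveSession>;

// Hypothetical storage abstraction. In the extension this would wrap
// chrome.storage.local get/set for the activeSessions key.
interface SessionStore {
  read(): Promise<SessionMap>;
  write(sessions: SessionMap): Promise<void>;
}

// Tail of the in-memory promise chain. Each writer waits for the previous one.
let writeChain: Promise<unknown> = Promise.resolve();

function withSessionWriteLock<T>(fn: () => Promise<T>): Promise<T> {
  const run = writeChain.then(() => fn());
  // Swallow failures on the chain itself so one rejected write does not block
  // later writers; the caller still observes the rejection through `run`.
  writeChain = run.catch(() => undefined);
  return run;
}

// Read-modify-write under the lock: bump actionCount and lastActionAt.
async function recordAction(store: SessionStore, tabId: number, now: number): Promise<void> {
  await withSessionWriteLock(async () => {
    const sessions = await store.read();
    const session = sessions[tabId];
    if (!session) return;
    sessions[tabId] = { ...session, actionCount: session.actionCount + 1, lastActionAt: now };
    await store.write(sessions);
  });
}

// In-memory store for demonstration; read() returns a copy, like a storage get.
function createMemoryStore(initial: SessionMap): SessionStore {
  let data: SessionMap = { ...initial };
  return {
    read: async () => ({ ...data }),
    write: async (sessions) => {
      data = { ...sessions };
    },
  };
}
```

Without the lock, several concurrent recordAction calls would each read the same snapshot and the last write would win, losing the other increments.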

Session identity

Sessions are keyed by tabId, not by domain. A single domain can host multiple concurrent sessions when the site is open on more than one tab. This is intentional: it lets the agent work on its own isolated tab (in the BUA window) while the user continues browsing the same domain on their own tab.

CDP commands target a specific tabId, so actions on one session never bleed into the other. The popup lists every active session with its tabId so the user can stop one without affecting siblings on the same domain.

Disambiguation when the agent omits tabId

Most actions (click, type, extract, …) accept an optional tabId. If it’s omitted, the extension picks in this order:

  1. Any BUA-owned session (the agent’s own tabs are always safe targets).
  2. A user-owned session only if it’s the one and only candidate.
  3. Otherwise, fails with session_not_found listing the candidates so the agent retries with an explicit tabId.

Prompt guidance tells the agent to track tabId from open_tab’s return value and pass it on every follow-up. Omitting tabId is a convenience for the single-session case, not a guessing game.
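The selection order for an omitted tabId could be sketched as below. The owner field and the function name are hypothetical: this page does not show how the extension tells BUA-owned tabs from user-owned ones, and where several BUA-owned sessions exist the sketch simply takes the first one listed.

```typescript
// Sketch only: `owner` is a hypothetical field, not part of the documented
// ActiveSession shape.
interface SessionCandidate {
  tabId: number;
  domain: string;
  owner: "bua" | "user";
}

type Resolution =
  | { ok: true; tabId: number }
  | { ok: false; code: "session_not_found"; candidates: number[] };

function resolveOmittedTabId(sessions: readonly SessionCandidate[]): Resolution {
  // 1. Any BUA-owned session is a safe target (assumption: first one wins).
  const firstBua = sessions.find((s) => s.owner === "bua");
  if (firstBua !== undefined) {
    return { ok: true, tabId: firstBua.tabId };
  }

  // 2. A user-owned session only if it is the one and only candidate.
  const only = sessions.length === 1 ? sessions[0] : undefined;
  if (only !== undefined) {
    return { ok: true, tabId: only.tabId };
  }

  // 3. Otherwise refuse to guess and list the candidates for an explicit retry.
  return {
    ok: false,
    code: "session_not_found",
    candidates: sessions.map((s) => s.tabId),
  };
}
```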

How sessions are created

Two paths, both without user approval:

| Path | How | Landing tab |
|---|---|---|
| Auto (first action) | Agent sends any action for a tab that has no session yet | That tab (wherever it lives) |
| Explicit (open_tab) | Browser subagent calls open_tab(url) | New tab in the BUA window by default |

Auto-creation on first action

When the agent dispatches an action for tabId=T on domain=D, the action-dispatcher flow is:

  1. Resolve tabId + domain (from the action payload + live tab state).
  2. Check isDomainBlocked(domain) — if hit, return domain_blocked (terminal), fire domain_blocked WS event, no session created.
  3. Call ensureSession(tabId, domain):
    • No session yet → startSession + SessionEvent { type: "start" } → fires session_started WS event.
    • Session exists for same domain → return existing.
    • Session exists for different domain (tab navigated mid-session) → silently update session.domain to the new value, no event, no WS churn. Audit log records every action with current URL anyway.
  4. Acquire per-tab lock.
  5. Attach chrome.debugger (if not already).
  6. Execute CDP action.
  7. recordAction(tabId) — bump actionCount + lastActionAt.

The whole “consent” step from v3 is gone. Authorization is implicit; accountability is in the audit log.
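Steps 2 and 3 of the flow above can be sketched as a small in-memory model. This is an illustration under assumptions: the SessionTable class, the preflight function, the exact-match blocklist lookup, and the initial lastActionAt value are all hypothetical, and steps 4 to 7 (locking, debugger attach, CDP execution, recordAction) are left out.

```typescript
interface ActiveSession {
  tabId: number;
  domain: string;
  startedAt: number;
  lastActionAt: number;
  actionCount: number;
}

type StartEvent = { type: "start"; tabId: number; domain: string };

type Preflight =
  | { ok: true; session: ActiveSession }
  | { ok: false; code: "domain_blocked" };

class SessionTable {
  private readonly sessions = new Map<number, ActiveSession>();
  // Stands in for the SessionEvent bus that feeds the session_started WS event.
  readonly events: StartEvent[] = [];

  ensureSession(tabId: number, domain: string, now: number): ActiveSession {
    const existing = this.sessions.get(tabId);
    if (existing === undefined) {
      // No session yet: create one and emit a start event.
      // Assumption: lastActionAt starts equal to startedAt.
      const created: ActiveSession = { tabId, domain, startedAt: now, lastActionAt: now, actionCount: 0 };
      this.sessions.set(tabId, created);
      this.events.push({ type: "start", tabId, domain });
      return created;
    }
    if (existing.domain !== domain) {
      // Tab navigated mid-session: update silently, no event.
      existing.domain = domain;
    }
    return existing;
  }

  has(tabId: number): boolean {
    return this.sessions.has(tabId);
  }
}

function preflight(
  table: SessionTable,
  blocklist: ReadonlySet<string>,
  tabId: number,
  domain: string,
  now: number,
): Preflight {
  // Assumption: exact-match lookup; the real isDomainBlocked rule is not described here.
  if (blocklist.has(domain)) {
    return { ok: false, code: "domain_blocked" }; // terminal, no session created
  }
  return { ok: true, session: table.ensureSession(tabId, domain, now) };
}
```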

open_tab (agent-initiated)

Agent calls browser({ action: { type: "open_tab", url: "https://…" } }). The dispatcher:

  1. Checks the domain blocklist (same rule as auto-create).
  2. Creates a new tab in the BUA window (default: minimized, unfocused, hidden from user). focus: true overrides to put the tab in the user’s focused window — use sparingly, only when the user explicitly asked to “show me this”.
  3. Calls startSession(domain, newTabId) → fires session_started WS event.
  4. Returns { tabId, windowId, domain } — subagent keeps tabId for all follow-up actions.

The BUA window

The BUA window is a dedicated Chrome window the extension creates for agent-initiated browsing. It exists so the agent can act on logged-in pages without disturbing the user’s main browser.

Invariants:

  • Hidden by default — created with chrome.windows.create({ focused: false, state: "minimized" }). Never steals focus; the user opts in to watching via the popup’s Show button.
  • Lazy creation — only spun up on the first agent-initiated open_tab. If the agent never opens its own tab, no BUA window exists.
  • Auto-collapse — when the last BUA-owned tab closes, the window is removed so abandoned minimized windows don’t pile up.
  • Id persisted — the window id lives in chrome.storage.local (buaWindowId) to survive service-worker restarts; a chrome.windows.onRemoved listener forgets it if the user manually closes the window.

Routing rules:

  • Agent open_tab → BUA window, tab opened with active: false, session auto-created.
  • Agent open_tab({ focus: true }) → user’s focused window with active: true. Use only when the user explicitly asked to see the result.

How sessions end

Seven termination reasons, one cleanup path. All paths converge on endSessionByTab(tabId, reason) or endAllSessions(reason), which fires a SessionEvent { type: "end" } that is auto-forwarded as a session_ended WS event.

| Reason | Trigger |
|---|---|
| tab_closed | chrome.tabs.onRemoved fires (the normal one-off task path) |
| user_stopped | Popup per-session “Stop now”, OR Chrome yellow-bar “Cancel” → onDetach |
| system_idle | chrome.idle reports 30 min of no input machine-wide |
| screen_locked | chrome.idle reports the screen was explicitly locked |
| domain_blocked | User added this domain to the blocklist → endAllSessionsForDomain fans out |
| global_stop | Popup red “Stop all” → endAllSessions fans out |
| extension_reload | SW was unloaded / extension updated; not usually surfaced, sessions reconcile |

All of them share the same cleanup side-effects: detach chrome.debugger (idempotent), close any BUA-owned host tab so abandoned minimized tabs don’t pile up, collapse the BUA window if it’s now empty, and reject any in-flight actions on that tab with session_not_found.

The two user-presence reasons — system_idle and screen_locked — are split deliberately rather than collapsed into one. They represent different security postures and deserve distinct audit entries:

  • screen_locked: the user actively secured the machine. Agent control that continued while the screen is locked would run under a different trust context than the one in which the agent was started, so the extension revokes immediately when chrome.idle.onStateChanged reports "locked".
  • system_idle: no input was observed on the whole machine for the detection interval (30 min by default). This is a softer signal: the user may have stepped away, or may be reading without typing. Revocation bounds the “forgotten session” risk without being aggressive about short pauses.

Both fire via the same idle-session-guard module; both take the same cleanup path as any other termination; the audit log preserves which one fired so oncall / UI can surface the specific remediation hint.
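The split between the two presence reasons can be sketched as a pure mapping from the idle state to a termination reason. The function names here are hypothetical, and the sketch assumes a presence change ends every session through endAllSessions; the idle-session-guard module itself is not shown on this page.

```typescript
// The three states chrome.idle reports.
type ChromeIdleState = "active" | "idle" | "locked";
type PresenceReason = "system_idle" | "screen_locked";

// Map an idle state to the termination reason recorded in the audit log.
function reasonForIdleState(state: ChromeIdleState): PresenceReason | null {
  switch (state) {
    case "locked":
      return "screen_locked"; // user explicitly secured the machine
    case "idle":
      return "system_idle"; // no input for the detection interval
    default:
      return null; // "active": nothing to revoke
  }
}

// Build the listener. Assumption: both reasons end all sessions at once.
function createIdleSessionGuard(
  endAllSessions: (reason: PresenceReason) => void,
): (state: ChromeIdleState) => void {
  return (state) => {
    const reason = reasonForIdleState(state);
    if (reason !== null) endAllSessions(reason);
  };
}

// Wiring inside the extension would look roughly like this:
//   chrome.idle.setDetectionInterval(30 * 60); // seconds
//   chrome.idle.onStateChanged.addListener(createIdleSessionGuard(endAllSessions));
```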

Debugger-detach bug class, now closed

chrome.debugger.onDetach can fire for reasons outside extension control (user clicks yellow-bar Cancel, Chrome revokes the attachment, DevTools takes over). In v3, handleDebuggerDetached called endSessionByTab but did NOT clear the extension’s in-memory attached: Set<number>. A subsequent attach() attempt would see the stale set, early-return as “already attached”, and all CDP commands would silently hang. The v4 implementation always calls debuggerController.detach(tabId) first on this path — detach() is idempotent, so double-calling is safe.
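The bug class and its fix can be illustrated with a small model of the attached set. AttachTracker is a hypothetical stand-in for the debugger controller, with a counter in place of the real chrome.debugger.attach call. The sketch always passes user_stopped, the reason this page gives for the yellow-bar Cancel case; which reason other detach causes map to is not stated here.

```typescript
type EndSession = (tabId: number, reason: "user_stopped") => void;

class AttachTracker {
  private readonly attached = new Set<number>();
  // Counts how often a real attach would have been issued.
  attachCalls = 0;

  attach(tabId: number): void {
    // Early return when the tab is believed to be attached. With a stale
    // entry, this is the point where v3 skipped the re-attach.
    if (this.attached.has(tabId)) return;
    this.attachCalls += 1;
    this.attached.add(tabId);
  }

  detach(tabId: number): void {
    // Idempotent: deleting an id that is not in the set is a no-op.
    this.attached.delete(tabId);
  }
}

// v4 ordering: clear the in-memory attachment first, then end the session.
function handleDebuggerDetached(tracker: AttachTracker, tabId: number, endSessionByTab: EndSession): void {
  tracker.detach(tabId);
  endSessionByTab(tabId, "user_stopped");
}
```

Because the handler clears the set before ending the session, a later attach on the same tab issues a real attach again instead of returning early.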

User-visible surfaces

Two surfaces keep the user in control at all times: Chrome’s native debugger bar (system-level, non-suppressible) and the extension popup (activity monitor + global kill). Both trigger the same cleanup path in the background.

The popup’s red “Stop all” button ends every active session on every tab at once and fires a global_stop WS event so the backend can short-circuit any in-flight requests even before the per-session session_ended events arrive.

Per-session Stop buttons remain available in the popup’s session list for surgical revocation. Both paths go through sessionManager.endSessionByTab / endAllSessions → same SessionEvent bus → same WS forwarding.

No confirm step — the user already decided.

Chrome’s debugger bar

Whenever the extension attaches chrome.debugger to a tab, Chrome injects a yellow banner that reads “Zapvol Browser Bridge is debugging this browser” with a Cancel button. This is not suppressible — Chrome intentionally makes it loud to warn users that a debugger is reading / writing inputs.

Clicking Cancel triggers chrome.debugger.onDetach, which the extension handles by first clearing the in-memory attached state and then ending the session with reason: "user_stopped" — same cleanup path as popup’s Stop.

Every BUA tool in the industry with local-browser access (Anthropic’s computer-use demo, Browser Use’s local mode, any extension-based automation) deals with the same banner. It is the cost of doing real-input automation on the user’s own profile. For internal deployments, IT can hide it via Chrome’s --silent-debugger-extension-api launch flag at the OS level; this is a deployment concern, not an extension one.

Error-handling contract (agent side)

The agent treats the following as terminal for the current action — no silent retries:

| Error code | What it means | What the agent should do |
|---|---|---|
| domain_blocked | Target domain is on the blocklist | Stop this plan; do not retry on a different tab or route through a different action |
| session_not_found | No active session on resolved target, or multi-session ambiguity | Pass tabId explicitly; otherwise the extension refuses to guess |
| tab_not_found | Tab was closed | Stop or try another tab (get_tabs first) |
| element_not_found | Selector or live uid matched nothing on the current DOM | Re-extract the page; pick a different target; don’t retry the same one |
| element_stale | Supplied uid no longer cached (page navigated, cache cleared) | Call extract again; retry with a fresh uid |
| timeout | Action did not complete within the action-specific window | If transient, retry once after wait_for; otherwise stop |
| debugger_attach_failed | DevTools / another debugger busy | Ask user to close DevTools on the tab; don’t retry |
| invalid_action | Schema / unknown action, OR JS error thrown inside evaluate | Inspect message, fix the request; don’t retry blindly |
| internal_error | Unexpected failure | Stop; report the error message |

This contract is encoded in the browser tool’s prompt (see packages/backend/src/agent/tools/browser.tool.ts), so the model sees these handling rules every time the tool is loaded.
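On the agent side, the same contract could be held as a lookup table so that handling stays exhaustive over the error codes. This is a sketch, not the content of browser.tool.ts: the NextStep categories, the hint wording, and the function name are assumptions layered on top of the table above.

```typescript
type ErrorCode =
  | "domain_blocked"
  | "session_not_found"
  | "tab_not_found"
  | "element_not_found"
  | "element_stale"
  | "timeout"
  | "debugger_attach_failed"
  | "invalid_action"
  | "internal_error";

// Hypothetical categories summarizing the "what the agent should do" column.
type NextStep = "stop" | "change_request" | "retry_once" | "ask_user";

interface Handling {
  next: NextStep;
  hint: string;
}

// Record<ErrorCode, ...> makes the compiler reject a missing or unknown code.
const ERROR_HANDLING: Record<ErrorCode, Handling> = {
  domain_blocked: { next: "stop", hint: "Stop this plan; do not route around the blocklist" },
  session_not_found: { next: "change_request", hint: "Pass tabId explicitly" },
  tab_not_found: { next: "change_request", hint: "Stop, or call get_tabs and try another tab" },
  element_not_found: { next: "change_request", hint: "Re-extract the page and pick a different target" },
  element_stale: { next: "change_request", hint: "Call extract again and use a fresh uid" },
  timeout: { next: "retry_once", hint: "If transient, retry once after wait_for; otherwise stop" },
  debugger_attach_failed: { next: "ask_user", hint: "Ask the user to close DevTools on the tab" },
  invalid_action: { next: "change_request", hint: "Inspect the message and fix the request" },
  internal_error: { next: "stop", hint: "Report the error message" },
};

// True only when the identical request may be sent again.
function mayResendUnchanged(code: ErrorCode): boolean {
  return ERROR_HANDLING[code].next === "retry_once";
}
```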

Audit log

Every session start and end is appended to chrome.storage.local under the sessionHistory key, capped at the most recent 1000 entries. The Options page renders the log as a reverse-chronological table with:

  • When: timestamp
  • Event: START or END (color-coded)
  • Domain + tabId
  • Detail: for END, the reason, duration, and action count

Users can clear the log with a single button (confirm dialog included). The log itself is kept locally, but the session_started, session_ended, domain_blocked, global_stop, and tab_closed WS events also flow to the backend for operational observability.

Design note: the audit module (session-history.ts) subscribes to a session-manager event bus (onSessionEvent) rather than being called explicitly from every endpoint. This keeps session-manager.ts free of audit concerns — if the log format changes, only one module is touched. Every termination path flows through the same bus, so the log captures every path automatically, and a single WS-forwarder subscription in background.ts keeps the server’s view consistent too.
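The subscription pattern in the design note, together with the 1000-entry cap, might look like the sketch below. Only onSessionEvent and the cap value come from this page; createSessionEventBus, emit, appendCapped, and the event fields are hypothetical, and persistence to chrome.storage.local is left out.

```typescript
type SessionEvent =
  | { type: "start"; tabId: number; domain: string; at: number }
  | { type: "end"; tabId: number; domain: string; at: number; reason: string };

type SessionListener = (event: SessionEvent) => void;

// Minimal event bus: the session manager emits, other modules subscribe.
function createSessionEventBus() {
  const listeners = new Set<SessionListener>();
  return {
    onSessionEvent(listener: SessionListener): () => void {
      listeners.add(listener);
      // Return an unsubscribe function.
      return () => {
        listeners.delete(listener);
      };
    },
    emit(event: SessionEvent): void {
      listeners.forEach((listener) => listener(event));
    },
  };
}

const MAX_HISTORY_ENTRIES = 1000;

// Append an entry and keep only the most recent `cap` entries.
function appendCapped<T>(entries: readonly T[], entry: T, cap: number = MAX_HISTORY_ENTRIES): T[] {
  const next = [...entries, entry];
  return next.length > cap ? next.slice(next.length - cap) : next;
}
```

The audit module and the WS forwarder would each register one listener, so a new termination path only has to emit on the bus to reach both.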
