feat(telemetry): instrument workspace lifecycle #963

Open
EhabY wants to merge 1 commit into main from feat/issue-906-workspace-telemetry

Conversation

EhabY (Collaborator) commented May 18, 2026

Summary

Records local telemetry for the workspace state machine and watcher so build durations and agent transitions show up alongside the rest of the local telemetry stream.

Workspace events (src/instrumentation/workspace.ts, split into three single-purpose classes):

  • workspace.state_transitioned: deduped on (status, transition, reason); emits observedDurationMs between transitions and observedBuildDurationMs when a provisioner run resolves.
  • workspace.agent.state_transitioned: deduped on (status, lifecycle_state); emits observedDurationMs.
  • workspace.start.triggered / workspace.update.triggered: traced spans around user-initiated workspace operations.
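
As a rough illustration of the dedup-and-duration behaviour listed above, here is a minimal sketch. It is not the code in src/instrumentation/workspace.ts: `StateTransitionTracker`, `WorkspaceSnapshot`, and `Emit` are hypothetical names, and it assumes that `from`/`to` carry the workspace status and that the first observation only seeds the baseline without emitting.

```typescript
// Illustrative sketch only. StateTransitionTracker, WorkspaceSnapshot and Emit
// are hypothetical names, not taken from the PR diff.
type Emit = (
  event: string,
  properties: Record<string, string>,
  measurements: Record<string, number>,
) => void;

interface WorkspaceSnapshot {
  status: string;
  transition: string;
  reason: string;
}

class StateTransitionTracker {
  private readonly workspaceName: string;
  private readonly emit: Emit;
  private readonly now: () => number;
  private lastKey: string | undefined;
  private lastStatus: string | undefined;
  private lastChangeAt = 0;

  constructor(
    workspaceName: string,
    emit: Emit,
    now: () => number = () => Date.now(),
  ) {
    this.workspaceName = workspaceName;
    this.emit = emit;
    this.now = now;
  }

  observe(snapshot: WorkspaceSnapshot): void {
    // Dedup key is the (status, transition, reason) tuple.
    const key = JSON.stringify([
      snapshot.status,
      snapshot.transition,
      snapshot.reason,
    ]);
    if (key === this.lastKey) {
      return; // Same tuple as last time: nothing to report.
    }

    const at = this.now();
    if (this.lastStatus !== undefined) {
      // Assumption: from/to hold the workspace status.
      this.emit(
        "workspace.state_transitioned",
        {
          workspaceName: this.workspaceName,
          from: this.lastStatus,
          to: snapshot.status,
        },
        { observedDurationMs: at - this.lastChangeAt },
      );
    }

    this.lastKey = key;
    this.lastStatus = snapshot.status;
    this.lastChangeAt = at;
  }
}
```

Injecting `now` keeps the duration measurement testable with a fake clock; the real implementation may source timestamps differently.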

All workspace events carry workspaceName. The workspace event uses bare from/to (single state dimension); the agent event uses qualified fromStatus/toStatus plus fromLifecycleState/toLifecycleState because two dimensions can change in the same emission. Cross-event consistency via dimension-as-prefix property keys is tracked in #954.

WorkspaceMonitor and WorkspaceStateMachine now take the ServiceContainer directly so they can pull the telemetry service.
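
To make the property naming described above concrete, the two payload shapes might look like the following. The property keys come from this description; the interface names, the `AgentState` type, and the `agentTransitionProps` helper are hypothetical and only illustrate why the agent event needs qualified keys.

```typescript
// Illustrative sketch only. Property keys follow the PR description; the
// interface and function names are hypothetical.
interface WorkspaceTransitionProps {
  workspaceName: string;
  from: string; // single state dimension, so bare keys are unambiguous
  to: string;
}

interface AgentTransitionProps {
  workspaceName: string;
  fromStatus: string;
  toStatus: string;
  fromLifecycleState: string;
  toLifecycleState: string;
}

// Hypothetical view of the two agent dimensions that can change together.
interface AgentState {
  status: string;
  lifecycleState: string;
}

function agentTransitionProps(
  workspaceName: string,
  previous: AgentState,
  next: AgentState,
): AgentTransitionProps {
  // Both dimensions are always reported, so an emission where status and
  // lifecycle state change at once stays unambiguous.
  return {
    workspaceName,
    fromStatus: previous.status,
    toStatus: next.status,
    fromLifecycleState: previous.lifecycleState,
    toLifecycleState: next.lifecycleState,
  };
}
```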

Split

Stacked on #962. This is the second of two PRs replacing #948, which together close #906. Diffed against the auth branch, it shows only the workspace-specific changes; it will be rebased onto main once #962 merges.

EhabY self-assigned this May 18, 2026
EhabY requested a review from ethanndickson May 18, 2026 11:34
EhabY added a commit that referenced this pull request May 18, 2026
Record local telemetry for the OAuth and 401-recovery paths so latency
and failure modes show up alongside the rest of the local telemetry
stream.

Auth events (src/instrumentation/auth.ts):

- `auth.token_refreshed`: OAuth refresh attempt with `trigger`
  (`background` / `reactive`).
- `auth.token_refresh.deduped`: emitted when a refresh call joins an
  in-flight refresh so the dropped trigger stays countable.
- `auth.unauthorized_intercepted`: recovery path for a 401, with
  `recovery` (`refresh_success` / `login_required` / `none`) and
  `refreshAttempted`.
- `auth.login_prompted`: modal login prompt outcome, with `trigger`
  and a `reason` on abort/failure.

Shared telemetry plumbing picks up a `measurements` argument on
`TelemetryService.trace`, span outcome helpers (`markFailure` /
`markAborted`), and typed properties on `Span`. The SSH and WebSocket
instrumentation modules migrate to the typed property API. Test
helpers add `enableLocalTelemetry`, `TestSink.expectOne`, and
`createMockServiceContainer`.

First of two stacked PRs replacing #948. Closes part of #906; the
workspace half lands in #963.
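
The `auth.token_refresh.deduped` behaviour in the commit message above (a refresh call joining an in-flight refresh while the dropped trigger stays countable) could be sketched roughly as follows. `RefreshCoordinator`, its constructor arguments, and the `emit` callback are hypothetical and are not taken from src/instrumentation/auth.ts.

```typescript
// Illustrative sketch only. RefreshCoordinator and its collaborators are
// hypothetical names, not the PR's implementation.
type RefreshTrigger = "background" | "reactive";
type EmitEvent = (event: string, properties: Record<string, string>) => void;

class RefreshCoordinator {
  private readonly doRefresh: () => Promise<string>;
  private readonly emit: EmitEvent;
  private inFlight: Promise<string> | undefined;

  constructor(doRefresh: () => Promise<string>, emit: EmitEvent) {
    this.doRefresh = doRefresh;
    this.emit = emit;
  }

  refresh(trigger: RefreshTrigger): Promise<string> {
    if (this.inFlight !== undefined) {
      // Join the existing refresh, but record the trigger that was dropped.
      this.emit("auth.token_refresh.deduped", { trigger });
      return this.inFlight;
    }

    const pending = this.doRefresh().finally(() => {
      // Clear the slot once settled so a later call starts a fresh refresh.
      if (this.inFlight === pending) {
        this.inFlight = undefined;
      }
    });
    this.inFlight = pending;
    return pending;
  }
}
```

Callers that join receive the same promise as the original caller, so a single network refresh serves all of them.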
Base automatically changed from feat/issue-906-auth-telemetry to main May 18, 2026 16:21
Records local telemetry for the workspace state machine and watcher so
build durations and agent transitions are visible alongside the rest of
the local telemetry stream.

- `workspace.state_transitioned`: deduped on
  `(status, transition, reason)`; emits `observedDurationMs` between
  transitions and `observedBuildDurationMs` when a provisioner run
  resolves.
- `workspace.agent.state_transitioned`: deduped on
  `(status, lifecycle_state)`; emits `observedDurationMs`.
- `workspace.start.triggered` / `workspace.update.triggered`: traced
  spans around user-initiated workspace operations.

All workspace events carry `workspaceName`. The workspace event uses
bare `from`/`to` (single state dimension); the agent event uses
qualified `fromStatus`/`toStatus` plus `fromLifecycleState`/
`toLifecycleState` because two dimensions can change in the same
emission. Cross-event consistency via dimension-as-prefix property keys
is tracked in #954.

`WorkspaceMonitor` and `WorkspaceStateMachine` take the
`ServiceContainer` directly so they can pull the telemetry service.

Closes part of #906.
EhabY force-pushed the feat/issue-906-workspace-telemetry branch from c61ff87 to 917a7de May 18, 2026 16:23