feat: restore checkpoint after any AI response turn#2356

Draft
Basit-Balogun10 wants to merge 46 commits into
PostHog:mainfrom
Basit-Balogun10:claude/gracious-hawking-38aade

@Basit-Balogun10 Basit-Balogun10 commented May 25, 2026

Note: This PR is stacked on #2321. The diff shows +2k lines because GitHub uses main as the base; cross-fork branches can't be set as PR bases (I tried tracking the stack with Graphite too and hit the same limitation). The actual changes in this PR are a few commits on top of #2321. Please review with that context, or wait for #2321 to be merged first.

Problem

After an AI turn completes, there's no way to roll back to the git state captured at the end of that turn. If the agent goes down a wrong path you have to manually reset the branch or throw away work — there's no checkpoint-based undo.

Tracked in #724 / #2328.

Changes

Core restore flow

  • New checkpoint tRPC router (routers/checkpoint.ts) with a restore procedure that:
    1. Runs RevertCheckpointSaga to reset the worktree to the saved git state
    2. Truncates the session .jsonl to the restore point so the session replays correctly
    3. Collects any orphaned checkpoint refs from the discarded lines and deletes them — no accumulation of stale refs from abandoned future turns
  • getSessionInfo(taskRunId) added to AgentService to expose { sessionId, repoPath } without leaking internal types
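
Steps 2 and 3 of the restore procedure can be sketched roughly as follows. This is a minimal illustration, not the code in routers/checkpoint.ts: the line shape (`method`, `params.checkpointId`) and the function names are assumptions, since the session log format is not shown in this description.

```typescript
// Hypothetical line shape; the real session .jsonl schema may differ.
interface LogLine {
  method?: string;
  params?: { checkpointId?: string };
}

const CHECKPOINT_METHOD = "_posthog/git_checkpoint";

// Returns the checkpoint id carried by a raw line, or null if there is none.
function checkpointIdOf(rawLine: string): string | null {
  try {
    const parsed = JSON.parse(rawLine) as LogLine;
    if (parsed.method !== CHECKPOINT_METHOD) return null;
    return parsed.params?.checkpointId ?? null;
  } catch {
    return null; // Malformed or non-object lines carry no checkpoint.
  }
}

// Keeps every line up to and including the target checkpoint and reports the
// checkpoint ids found on the discarded lines so their git refs can be deleted.
function truncateToCheckpoint(
  content: string,
  targetId: string,
): { kept: string; orphanedIds: string[] } | null {
  const lines = content.split("\n").filter((line) => line.trim() !== "");
  const cut = lines.findIndex((line) => checkpointIdOf(line) === targetId);
  if (cut === -1) return null; // Unknown checkpoint: leave the file untouched.

  const orphanedIds = lines
    .slice(cut + 1)
    .map(checkpointIdOf)
    .filter((id): id is string => id !== null);

  return { kept: lines.slice(0, cut + 1).join("\n") + "\n", orphanedIds };
}
```

Returning the orphaned ids from the truncation step, rather than scanning the file a second time, keeps ref cleanup tied to exactly the lines that were dropped.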

Per-turn restore button

  • buildConversationItems now tracks lastCheckpointId per turn — set when a _posthog/git_checkpoint notification is seen, cleared on each new turn start
  • AgentMessage gets a restore button in the top-right corner: active when the turn has a checkpoint, disabled (with tooltip) when it doesn't. Only shown on completed turns.
  • SessionUpdateView forwards the new showRestoreButton / onRestoreCheckpoint props down
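
The per-turn tracking described above might look like the sketch below. The event shape and the names `buildTurns` and `restoreButtonState` are hypothetical stand-ins; the real buildConversationItems works on ACP messages and produces richer items.

```typescript
// Hypothetical, simplified event shape for illustration only.
type SessionEvent =
  | { type: "turn_start"; prompt: string }
  | { type: "notification"; method: string; checkpointId?: string }
  | { type: "turn_complete" };

interface TurnItem {
  prompt: string;
  completed: boolean;
  lastCheckpointId: string | null;
}

function buildTurns(events: SessionEvent[]): TurnItem[] {
  const turns: TurnItem[] = [];
  let current: TurnItem | null = null;

  for (const event of events) {
    if (event.type === "turn_start") {
      // A new turn starts with no checkpoint, which clears the previous value.
      current = { prompt: event.prompt, completed: false, lastCheckpointId: null };
      turns.push(current);
    } else if (current && event.type === "notification") {
      if (event.method === "_posthog/git_checkpoint" && event.checkpointId) {
        current.lastCheckpointId = event.checkpointId;
      }
    } else if (current && event.type === "turn_complete") {
      current.completed = true;
    }
  }
  return turns;
}

// Hidden on in-progress turns, disabled (tooltip) when no checkpoint exists.
function restoreButtonState(turn: TurnItem): "hidden" | "enabled" | "disabled" {
  if (!turn.completed) return "hidden";
  return turn.lastCheckpointId ? "enabled" : "disabled";
}
```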

Checkpoint timeline modal (mod+shift+h)

  • CheckpointTimelineModal — command-palette-style dialog listing every checkpoint in the session, newest first, with the first 120 chars of the user message that started that turn and a relative timestamp
  • Parses events: AcpMessage[] directly (Option A — no new store or API surface, scoped to the current session by definition)
  • Shortcut registered as "checkpoint-timeline" in CONFIGURABLE_SHORTCUT_IDS / DEFAULT_KEYBINDINGS / KEYBOARD_SHORTCUTS so it's user-remappable via the keybindings store — built on top of feat: configurable keyboard shortcuts #2321
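
As a rough illustration of how the timeline entries could be derived from the event list, the sketch below pairs each checkpoint with the user message that started its turn and returns the entries newest first. The `TimelineEvent` shape and `buildTimeline` name are assumptions; the modal itself parses `AcpMessage[]`.

```typescript
// Hypothetical flattened event shape; the modal reads AcpMessage[] instead.
interface TimelineEvent {
  kind: "user_message" | "checkpoint";
  text?: string;
  checkpointId?: string;
  timestamp: number;
}

interface TimelineEntry {
  checkpointId: string;
  snippet: string;
  timestamp: number;
}

const SNIPPET_LENGTH = 120;

function buildTimeline(events: TimelineEvent[]): TimelineEntry[] {
  const entries: TimelineEntry[] = [];
  let currentPrompt = "";

  for (const event of events) {
    if (event.kind === "user_message") {
      currentPrompt = event.text ?? "";
    } else if (event.checkpointId) {
      entries.push({
        checkpointId: event.checkpointId,
        snippet: currentPrompt.slice(0, SNIPPET_LENGTH),
        timestamp: event.timestamp,
      });
    }
  }
  // Events arrive oldest first, so reversing yields newest first.
  return entries.reverse();
}
```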

Restore confirmation

  • RestoreCheckpointDialog — confirmation dialog with an amber warning before the restore runs, so accidental clicks don't immediately drop work
  • useRestoreCheckpoint hook wires the full flow: opens the confirmation dialog → calls the tRPC mutation → truncates the in-memory events via sessionStoreSetters.truncateEventsToCheckpoint → shows a success/error toast

How did you test this?

  • Ran pnpm --filter code typecheck against tsconfig.node.json and tsconfig.web.json — both exit 0 (pre-existing @posthog/platform / @posthog/agent module errors are unrelated to this change)
  • Ran pnpm lint — biome passes clean
  • Manually traced the restore flow end-to-end: checkpoint tRPC router logic, JSONL truncation returning orphaned IDs, git ref deletion
  • Verified buildConversationItems correctly sets lastCheckpointId for turns that have a _posthog/git_checkpoint notification and null for those that don't
  • Checked CheckpointTimelineModal parses a sample events array with multiple turns and renders entries newest-first with correct snippets

Users can now remap any of the 17 configurable shortcuts via Settings >
Shortcuts (or the ⌘/ sheet). Custom bindings fully replace all defaults
(including alternates) and multiple custom combos per action are supported.
Bindings persist across sessions via electronStorage.

- Add `configurable` flag + `DEFAULT_KEYBINDINGS` map to keyboard-shortcuts.ts
- New `keybindingsStore` (persist + electronStorage) with array-based custom combos,
  conflict detection helper, and individual/bulk reset
- New `useShortcut(id)` hook — reactive Zustand selector, feeds useHotkeys
- New `Keycap` component extracted to avoid circular imports
- New `ShortcutRecorder` component: click + to enter recording mode, captures
  keydown, shows conflict toast, per-binding × remove, per-shortcut ↩ reset
- Update all useHotkeys call sites (GlobalEventHandlers, SpaceSwitcher,
  usePanelKeyboardShortcuts, ExternalAppsOpener) to use useShortcut()
- KeyboardShortcutsSheet: configurable rows render ShortcutRecorder instead of
  static keycaps; "Reset all shortcuts" button shown when customisations exist

Generated-By: PostHog Code
Task-Id: 80405bf7-239f-4b60-a1cf-5a4777fb7218
Bare letter keys (e.g. just "k") would fire every time that character is
typed anywhere in the app. Require at least mod/ctrl/alt to be held.
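
A minimal sketch of such a check, assuming combos are stored as plus-separated strings like "mod+k" (the helper name `hasRequiredModifier` is hypothetical):

```typescript
// Modifiers that make a combo safe to bind; shift alone does not qualify,
// because shift+letter is ordinary typing.
const REQUIRED_MODIFIERS = new Set(["mod", "ctrl", "alt"]);

function hasRequiredModifier(combo: string): boolean {
  const parts = combo.toLowerCase().split("+").map((part) => part.trim());
  // Everything before the final segment is treated as a modifier.
  return parts.slice(0, -1).some((part) => REQUIRED_MODIFIERS.has(part));
}
```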

Generated-By: PostHog Code
Task-Id: 80405bf7-239f-4b60-a1cf-5a4777fb7218
24 tests covering resolveKey, addKeybinding, removeKeybinding,
resetShortcut, resetAll, getKey, and findConflict — including
conflict detection against comma-separated default alternates.

Generated-By: PostHog Code
Task-Id: 80405bf7-239f-4b60-a1cf-5a4777fb7218
- KeyboardShortcutsSheet header now reads the "shortcuts" key via
  useShortcut() so the trigger keycap updates when remapped
- ExternalAppsOpener dropdown labels for open-in-editor and copy-path
  now derive from useShortcut() + formatHotkeyParts() instead of
  hardcoded Mac-only symbols

test(e2e): add Playwright shortcut sheet tests

Covers sheet open/close, category sections, hover controls, recording
mode entry/cancellation, bare-key rejection, saving bindings, conflict
detection, removing bindings, per-shortcut reset, and reset-all.

Generated-By: PostHog Code
Task-Id: 80405bf7-239f-4b60-a1cf-5a4777fb7218
Hardcoded Cmd glyphs were leaking onto Windows in the send-messages
dropdown and the tiptap paste hint, and two handlers were gated on
metaKey only so the corresponding shortcut never fired on Windows
(mod+1..9 task switching, Cmd/Ctrl-click multi-select in the inbox).

Generated-By: PostHog Code
Task-Id: 80405bf7-239f-4b60-a1cf-5a4777fb7218
- Add prompt-history-prev/next to CONFIGURABLE_SHORTCUT_IDS and
  DEFAULT_KEYBINDINGS so they appear in the shortcuts sheet and
  can be rebound like any other shortcut
- Add tiptapEventToCombo() — accepts shift-only combos (no Ctrl/Meta
  required) so shift+up/down can be matched against live bindings
- Fix eventToCombo() to normalise Arrow-prefixed key names (ArrowUp to up)
- Wire useTiptapEditor to resolve prompt-history keys from the store
  instead of hardcoding event.shiftKey
- Fix paste hint toast to show the live paste-as-file binding instead
  of the hardcoded mod+shift+v string
- Fix noStaticElementInteractions lint on recording modal backdrop
- Rewrite E2E shortcut tests to match the current recording modal UI
  (chips + right-click context menu) rather than the old hover-button
  and inline-input design
- Deduplicate in updateKeybinding: conflict detection excludes the
  shortcut being edited, so editing one binding to match another on the
  same shortcut could produce ["ctrl+q","ctrl+q"], causing duplicate
  React keys and broken chip reconciliation
- Remove ArrowUp/Down gate around prompt-history navigation so custom
  non-arrow bindings (e.g. Ctrl+K) actually fire when pressed, not just
  when the physical key is an arrow
- Remove obvious section-divider comments and redundant JSX labels
  (Header, Scrollable list, Sticky footer); keep non-obvious rationale
  comments (window-level capture, backdrop dismiss, canAddMore budget,
  dedup note, ArrowKey gate explanation)
- Add checkpoint tRPC router with restore procedure that reverts git state,
  truncates session JSONL to the restore point, and deletes orphaned
  checkpoint refs for abandoned future turns
- Track lastCheckpointId per turn in buildConversationItems so each
  completed agent turn knows its git ref
- Show per-turn restore button in AgentMessage (disabled with tooltip when
  no checkpoint exists for that turn)
- Add CheckpointTimelineModal (mod+shift+h) — command-palette-style list of
  all checkpoints in the session, newest first, with user message snippet
  and relative timestamp; shortcut is user-remappable via keybindings store
- Add RestoreCheckpointDialog with confirmation warning before reverting
- Add useRestoreCheckpoint hook to wire restore flow end-to-end
- Register checkpoint-timeline as a configurable shortcut

Closes PostHog#2328
@Basit-Balogun10 Basit-Balogun10 force-pushed the claude/gracious-hawking-38aade branch from 7dc1907 to 021fe43 Compare May 26, 2026 15:35
pauldambra and others added 18 commits May 27, 2026 11:28
Co-authored-by: Charles Vien <charles.v@posthog.com>
## Problem

Signals inbox items previously surfaced only in-app. Users wanting a heads-up when a new item lands and they're a suggested reviewer had no way to opt in.

Even after the initial Slack notifications work landed, picking a channel still required a detour through PostHog Web — users had to leave PostHog Code to install the Slack integration, then come back. That's too much friction for what should be a one-time setup.

## Changes

![Screenshot 2026-05-19 at 08.52.23@2x.png](https://app.graphite.com/user-attachments/assets/51720fb7-3f89-4f77-a2d4-e8a89cad9a12.png)

Adding Slack notifications under **Inbox** source settings.

What it includes:

- Slack workspace picker (shown only when more than one Slack `Integration` is connected; otherwise the single workspace is implied).
- Notification channel picker, populated from `/api/environments/{teamId}/integrations/{id}/channels/` - same endpoint as Insight Alerts. Includes an explicit "Off" option.
- Minimum priority filter (also includes an "All priorities" option).

This PR also adds an in-app Slack connect flow, so users no longer have to leave PostHog Code for PostHog Web:

- `SlackIntegrationService` in the main process. Mirrors the existing `GitHubIntegrationService` but registers a separate `slack-integration` deep-link key so each provider's handler stays isolated.
- `slackIntegration` tRPC router.
- The empty state in `SignalSlackNotificationsSettings` shows a "Connect Slack workspace" button that runs the connection flow without redirecting to the Integrations page in Cloud settings.

Pairs with the backend changes in PostHog/posthog#58774.

## How did you test this?

Ran this locally.

## Publish to changelog?

no

---

_Created with_ [_PostHog Code_](https://posthog.com/code?ref=pr)
Co-authored-by: Tom Owers <owerstom@gmail.com>
Co-authored-by: Annika <14750837+annikaschmid@users.noreply.github.com>
Co-authored-by: Oliver Browne <oliver@posthog.com>
Co-authored-by: Oliver Browne <oliverbrowne627@gmail.com>
Co-authored-by: Ioannis J <yiannis@posthog.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ostHog#2296)

## Problem

I removed my personal integration from my PostHog account, and afterwards the new task mode selector had no "Cloud" button at all: only "Local" and "Worktree", not even a "Connect GitHub for Cloud agents" button. That left me no way to see the error message or reconnect my account.

## Changes

The "Cloud" option is no longer hidden when Cloud is not available. The guard seems to have been introduced in PostHog#2045, but I don't quite understand what the intention of _this specific_ guard was.

## How did you test this?

Ran this and indeed was able to connect GH.

---
*Created with [PostHog Code](https://posthog.com/code?ref=pr)*
Co-authored-by: Adam Bowker <adam.b@posthog.com>
…storm (PostHog#2255)

## Summary

Layer 1 of a multi-PR fix for the launch hang and "can't click cloud task" symptoms reported by cloud-task users.

Extracts local NDJSON cache handling (`readLocalLogs` / `writeLocalLogs`) into a singleton `LocalLogsService` and **single-flights writes per `taskRunId` with latest-wins coalescing**. If a write is already in flight when another arrives for the same run, the new content replaces any queued content rather than spawning a parallel `fs.promises.writeFile`.
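
A minimal sketch of per-run single-flight writes with latest-wins coalescing, under stated assumptions: the class name `CoalescingLogWriter` and the injected `Writer` callback are illustrative and are not the actual `LocalLogsService` API.

```typescript
// Injected so the sketch stays independent of fs; in the real service this
// would presumably wrap fs.promises.writeFile.
type Writer = (taskRunId: string, content: string) => Promise<void>;

interface RunState {
  inFlight: Promise<void>;
  queued: string | null;
}

class CoalescingLogWriter {
  private readonly runs = new Map<string, RunState>();
  private readonly writer: Writer;

  constructor(writer: Writer) {
    this.writer = writer;
  }

  write(taskRunId: string, content: string): Promise<void> {
    const state = this.runs.get(taskRunId);
    if (state) {
      // A write is already in flight for this run: replace any queued content
      // (latest wins) and share the in-flight promise with the caller.
      state.queued = content;
      return state.inFlight;
    }
    const fresh: RunState = { inFlight: Promise.resolve(), queued: null };
    this.runs.set(taskRunId, fresh);
    fresh.inFlight = this.drain(taskRunId, fresh, content);
    return fresh.inFlight;
  }

  private async drain(taskRunId: string, state: RunState, first: string): Promise<void> {
    let next: string | null = first;
    try {
      while (next !== null) {
        await this.writer(taskRunId, next);
        next = state.queued;
        state.queued = null;
      }
    } finally {
      // Runs on success and on rejection, so the next write starts fresh.
      this.runs.delete(taskRunId);
    }
  }
}
```

With three back-to-back writes for the same run, only the first and the last content reach the writer; the middle one is overwritten while queued. In this sketch a rejected write drops whatever was queued behind it, which may differ from the real service's recovery behavior.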

### Why this is needed

When the renderer's gap-reconcile loop fires on every SSE snapshot — which happens whenever `parseLogContent` silently drops corrupted lines and `processedLineCount` never catches up to the server's `expectedCount` — the old fire-and-forget `writeLocalLogs` piles `fs.promises.writeFile` continuations onto the main process, producing the `FileHandle::CloseReq::Resolve` saturation signature we saw at app launch.

This is one of two distinct main-thread hangs reported on the same crash signature:
- [PostHog#2242](PostHog#2242) (merged) addressed the startup `unzipSync` path that affected all users.
- This PR addresses the cloud-task corruption-feedback loop that only manifested for users with cloud tasks.

### What this does NOT fix

Stops the storm, but doesn't address the underlying corruption-amplification loop or unbounded reconnects — those are layer 2 and 3 below.

## Follow-ups (separate PRs to be stacked on this one)

**Layer 2 — break the corruption-amplification feedback loop.**
`parseLogContent` (renderer `service.ts:3539`) silently drops malformed lines, so `processedLineCount < expectedCount` forever and every SSE snapshot triggers another gap-reconcile + S3 fetch + overwrite. Need to either:
- Track dropped-line counts and feed them into the reconciliation math so a known-corrupted file stops triggering gap-reconcile after one observation, or
- Hash-compare local vs S3 content and short-circuit re-write when they match.

Fixes the "can't click on cloud task" symptom for users whose local NDJSON is already poisoned.

**Layer 3 — bound the reconnect budget.**
`MAX_SSE_RECONNECT_ATTEMPTS = 5` (`cloud-task/service.ts:21`) is defeated by two paths:
- `handleStreamCompletion` reconnects with `countAttempt: false` for non-terminal clean EOF (`cloud-task/service.ts:1057`).
- `retry` / `retryUnhealthyCloudSessions` resets the counter on every focus.

Need a per-run cumulative cap and an explicit unrecoverable terminal state so the UI can surface "this run is broken" instead of looping silently.

**Separate ticket — S3 source corruption.**
The agent's local writer (`packages/agent/src/session-log-writer.ts:391`) correctly appends `\n`, but user logs show records concatenated without separators across days. Missing newlines are being introduced somewhere in the agent-server upload/aggregation path. This PR limits the *blast radius* of corruption but doesn't stop it from being produced.

## Test plan

- [x] Unit tests cover: single-flight coalescing, multi-run independence, propagation of in-flight resolution to all coalesced callers, recovery after write rejection, queue draining after completion.
- [x] `pnpm --filter code typecheck` clean.
- [x] `pnpm --filter code test` — new tests pass; remaining failures are pre-existing archive integration tests unrelated to this change.
- [ ] Manual verification on a dev build with cloud tasks (post-merge).
k11kirky and others added 17 commits May 27, 2026 11:29
…iciency (PostHog#2284)

## Problem

Cloud log reconciliation could get stuck in an infinite loop when log files contained corrupted or unparseable lines. Because `processedLineCount` was never advanced past the gap, every new snapshot delta would re-trigger a reconcile that could never succeed — either because lines were permanently malformed (proven corruption) or because S3 simply wasn't catching up.

## Changes

Introduced two early-exit conditions in the reconcile loop that commit a best-effort state and advance `processedLineCount` past the gap:

1. **Parse failures detected on first observation** — if any lines in the fetched log fail `JSON.parse`, the corruption is permanent and S3 will never fix it. The reconcile breaks immediately rather than waiting.
2. **Same deficiency observed twice in a row** — if a second reconcile produces the same `(expectedCount, observedLineCount)` pair as the previous one, S3 is not catching up. The reconcile breaks on the second observation.

In all other cases the deficiency is treated as transient lag and the reconcile waits for the next snapshot update as before.

A `parseFailureCount` field was added to `ParsedSessionLogs` so the reconcile handler can distinguish between "fewer lines than expected" and "lines exist but are corrupt." A `cloudLogReconcileDeficiency` map tracks the last observed deficiency per `taskRunId` and is cleaned up on session removal, watcher teardown, and full reset.
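
The decision logic can be sketched as a small pure function. The names `decideReconcile` and `ReconcileDecision` are hypothetical, and the map parameter stands in for `cloudLogReconcileDeficiency`; the actual handler lives inside the renderer service and does more than this.

```typescript
interface Deficiency {
  expectedCount: number;
  observedLineCount: number;
}

type ReconcileDecision = "in-sync" | "break" | "wait";

function decideReconcile(
  taskRunId: string,
  current: Deficiency,
  parseFailureCount: number,
  lastSeen: Map<string, Deficiency>,
): ReconcileDecision {
  if (current.observedLineCount >= current.expectedCount) {
    lastSeen.delete(taskRunId);
    return "in-sync"; // No gap, nothing to reconcile (assumed precondition).
  }
  if (parseFailureCount > 0) {
    // Corrupt lines are permanent, so break on the first observation.
    lastSeen.delete(taskRunId);
    return "break";
  }
  const previous = lastSeen.get(taskRunId);
  if (
    previous &&
    previous.expectedCount === current.expectedCount &&
    previous.observedLineCount === current.observedLineCount
  ) {
    // Same deficiency twice in a row: S3 is not catching up.
    lastSeen.delete(taskRunId);
    return "break";
  }
  // Otherwise treat the gap as transient lag and remember it for next time.
  lastSeen.set(taskRunId, current);
  return "wait";
}
```

On "break" the caller would commit the best-effort state and advance `processedLineCount` to `expectedCount`.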

## How did you test this?

Two new unit tests were added:

- **"breaks the reconcile loop on first observation when parse failures are present"** — feeds a log with mixed valid and malformed JSON lines and asserts that `processedLineCount` is advanced to `expectedCount` on the first reconcile attempt.
- **"breaks the reconcile loop after a repeated stable deficiency"** — feeds a log with fewer parseable lines than the server-reported count, fires two identical snapshot updates, and asserts that `processedLineCount` is only advanced after the second (repeated) observation.

## Publish to changelog?

No
…PostHog#2285)

## Problem

When a cloud run stream repeatedly closes with a clean EOF (no error), the per-attempt reconnect counter was being reset to zero on each clean disconnect. This allowed the watcher to loop indefinitely without ever hitting the `MAX_SSE_RECONNECT_ATTEMPTS` limit, leaving users stuck watching a run that can never be reached.

## Changes

A new `cumulativeReconnectAttempts` counter was added to the watcher state that increments on every reconnect attempt regardless of whether the disconnect was clean or errored. Unlike `reconnectAttempts`, it is never reset by a clean EOF — only by a successful SSE event or confirmed in-progress poll response. If the cumulative count exceeds `MAX_CUMULATIVE_RECONNECT_ATTEMPTS` (30), the watcher is failed with a retryable "Cloud run unreachable" error.

The previous behavior of resetting `reconnectAttempts` to zero on clean EOF was removed, since the cumulative counter now handles the runaway loop case independently.
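
A minimal sketch of the two counters working together. The constant values come from the descriptions above; the function names, the `WatcherState` shape, and the choice to count only errored disconnects against `reconnectAttempts` are assumptions for illustration.

```typescript
const MAX_SSE_RECONNECT_ATTEMPTS = 5;
const MAX_CUMULATIVE_RECONNECT_ATTEMPTS = 30;

interface WatcherState {
  reconnectAttempts: number;
  cumulativeReconnectAttempts: number;
}

type DisconnectKind = "clean-eof" | "error";

function onDisconnect(state: WatcherState, kind: DisconnectKind): "reconnect" | "fail" {
  // The cumulative counter grows on every reconnect, clean or errored.
  state.cumulativeReconnectAttempts += 1;
  // Assumption: only errored disconnects count against the per-attempt budget.
  if (kind === "error") state.reconnectAttempts += 1;

  const overBudget =
    state.reconnectAttempts > MAX_SSE_RECONNECT_ATTEMPTS ||
    state.cumulativeReconnectAttempts > MAX_CUMULATIVE_RECONNECT_ATTEMPTS;
  return overBudget ? "fail" : "reconnect";
}

// Called on a successful SSE event or a confirmed in-progress poll response.
function onProgress(state: WatcherState): void {
  state.reconnectAttempts = 0;
  state.cumulativeReconnectAttempts = 0;
}
```

A stream that always ends with a clean EOF is allowed 30 reconnects under this sketch and fails on the 31st, which corresponds to the "Cloud run unreachable" case.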

## How did you test this?

A new automated test was added that mocks the stream to always return a clean empty SSE response and verifies that after exhausting the cumulative reconnect budget, the watcher emits an error update with `kind: "error"` and the expected `"Cloud run unreachable"` message.

## Publish to changelog?

No
…ostHog#2310)

## Summary

Submitting a new task used to leave the user on `TaskInput` (with a spinning submit button) until the saga finished creating the task, folder, and workspace. Only then did `onTaskReady` fire and navigate to `TaskDetail`, where `SessionView` showed a full-screen spinner until the agent session connected and `applyOptimisticPrompt` wrote the user message into the optimistic store. The flow felt sluggish because each step blocked the UI from showing the prompt.

This change navigates to a thread-style view synchronously on submit:

- New `task-pending` view in `navigationStore` (transient — excluded from persistence; replaced in history on transition so back doesn't land on a stale placeholder).
- `pendingTaskPromptStore` holds the prompt text keyed first by a client-generated UUID, then re-keyed to the real task id once the saga returns.
- `PendingChatView` renders the user-message bubble + "Connecting to agent..." footer with the same layout as `SessionView`'s connected state. `TaskPendingView` wraps it for the view-router; `SessionView`'s initializing branch also renders it when a pending entry exists, bridging the gap until `applyOptimisticPrompt` fires.
- `useTaskCreation.handleSubmit` stashes the prompt, navigates to the pending view, then runs the saga. On failure it clears the pending entry and navigates back to `task-input` with `initialPrompt` preserved.
- `MainLayout`, `useSidebarData`, and `TaskListView` treat `task-pending` like `task-input` for sidebar/SpaceSwitcher state. `CommandCenterPanel`'s `onTaskCreated` override skips the pending view so its existing flow is untouched.
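
The re-keying step of `pendingTaskPromptStore` could be sketched with a plain `Map`. The class and method names below are hypothetical, and the real store's implementation and API are not shown in this description.

```typescript
class PendingPromptStore {
  private readonly prompts = new Map<string, string>();

  // Stash the prompt under a client-generated id before the saga runs.
  set(key: string, prompt: string): void {
    this.prompts.set(key, prompt);
  }

  get(key: string): string | undefined {
    return this.prompts.get(key);
  }

  // Move the entry to the real task id once the saga returns it.
  rekey(clientId: string, taskId: string): boolean {
    const prompt = this.prompts.get(clientId);
    if (prompt === undefined) return false;
    this.prompts.delete(clientId);
    this.prompts.set(taskId, prompt);
    return true;
  }

  // Used on saga failure or once the optimistic prompt has been applied.
  clear(key: string): void {
    this.prompts.delete(key);
  }
}
```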

https://github.com/user-attachments/assets/8153c233-d77d-445c-b683-eeba49f1d59e

## Test plan

- [x] `pnpm --filter code typecheck` clean
- [x] 2 new navigation-store tests cover `navigateToPendingTask` and the history-replace behavior
- [x] All 914 renderer tests pass
- [ ] Manual: submit a local task — confirm thread + prompt appear immediately, then spinner, then agent response streams in
- [ ] Manual: submit a worktree task — pending view stays during provisioning, transitions seamlessly to `TaskDetail`
- [ ] Manual: submit a cloud task — pending view shows during saga's cloud setup, then `CloudInitializingView` takes over
- [ ] Manual: simulate a saga failure — pending view goes back to `task-input` with the prompt restored
- [ ] Manual: submit from `CommandCenterPanel` — no pending view; existing flow unchanged
- [ ] Manual: press back from `TaskDetail` after submit — lands on `task-input`, not the empty pending placeholder
## Problem

Image extension sets and MIME mappings were duplicated across 9+ files, causing inconsistent behavior across paste, drag/drop, attachments, thumbnails and Claude API conversion.

## Changes

1. Add `@posthog/shared/image` as canonical source: `IMAGE_MIME_TYPES`, `ALLOWED_IMAGE_MIME_TYPES`, `CLAUDE_IMAGE_EXTENSIONS`, `ClaudeImageMimeType`, plus `isImageFile` / `isRasterImageFile` / `isClaudeImageFile` / `isGifFile` / `getImageMimeType` / `parseImageDataUrl`
2. Add `@posthog/shared/binary` with `BINARY_EXTENSIONS` and `isBinaryFile`, image portion derived from the canonical image set
3. Delete `apps/code/src/shared/constants/image.ts` and `apps/code/src/shared/utils/imageDataUrl.ts`; migrate 7 importers to `@posthog/shared`
4. Wire `@posthog/shared` into mobile (package.json dep + Metro alias to TS source)
5. Replace local `COMMON_IMAGE_EXTENSIONS` and `ImageMimeType` union in the Claude ACP adapter with shared exports
6. Switch `CodeEditorPanel` to `isRasterImageFile` so SVGs keep opening in CodeMirror (broad `isImageFile` now includes svg/heic)
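
To illustrate the distinction in item 6, here is a rough sketch of a broad image check versus a raster-only check. The extension sets below are illustrative guesses, not the contents of `@posthog/shared/image`, and the function bodies are assumptions; only the function names come from this PR.

```typescript
// Illustrative sets only; the real lists in @posthog/shared/image may differ.
const RASTER_EXTENSIONS = new Set(["png", "jpg", "jpeg", "gif", "webp", "heic"]);
const VECTOR_EXTENSIONS = new Set(["svg"]);

function extensionOf(path: string): string {
  const dot = path.lastIndexOf(".");
  return dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
}

// Raster-only: suitable for deciding whether to show an image preview.
function isRasterImageFile(path: string): boolean {
  return RASTER_EXTENSIONS.has(extensionOf(path));
}

// Broad check: includes vector formats such as SVG.
function isImageFile(path: string): boolean {
  const extension = extensionOf(path);
  return RASTER_EXTENSIONS.has(extension) || VECTOR_EXTENSIONS.has(extension);
}
```

An editor panel that gates its image preview on `isRasterImageFile` would still open `logo.svg` as text, even though `isImageFile("logo.svg")` is true.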

## How did you test this?

Manually

## Publish to changelog?

no

The cloud path (agent-server.ts) guards checkpoint capture on posthogAPI
being configured, so local tasks never emit _posthog/git_checkpoint.

Hook into extNotification in the local AgentService: on TURN_COMPLETE, run
CaptureCheckpointSaga, then emit a synthetic _posthog/git_checkpoint ACP
message to the renderer and append it to the session JSONL so it survives
reload. The renderer's buildConversationItems already handles the
notification correctly — it just wasn't arriving.

Add console logs in buildConversationItems and structured logs in service.ts
for visibility during debugging.