feat: add RCA commands and surface root cause analysis in checks get [AI-190] #1271

Merged
thebiglabasky merged 19 commits into main from herve/add-rca-check-results
Apr 2, 2026
Contributor

@thebiglabasky thebiglabasky commented Mar 27, 2026

Summary

RCA in checks get

  • Add RCA column (Yes/-) to the error groups table in checks get <id> so users can see at a glance which error groups have root cause analysis available
  • Add Error Group ID column to the error groups table for easy reference when triggering RCA
  • Render full RCA detail (classification, root cause, user impact, code fix, evidence, references) when drilling into checks get <id> --error-group <egId>
  • Transform JSON output to expose latestRootCauseAnalysis (singular, most recent) + rootCauseAnalysisCount instead of the raw array, to avoid surfacing potentially contradictory historical analyses to agents
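A minimal sketch of the JSON transform and the Yes/- table cell described above. The raw field name `rootCauseAnalyses`, the `createdAt` timestamp, and both function names are assumptions for illustration; only `latestRootCauseAnalysis` and `rootCauseAnalysisCount` come from this PR.

```typescript
// Hypothetical shapes: the raw API field names here are assumptions.
interface RootCauseAnalysis {
  id: string
  createdAt: string // assumed ISO 8601 timestamp
}

interface RawErrorGroup {
  id: string
  rootCauseAnalyses?: RootCauseAnalysis[]
}

interface ErrorGroupJson {
  id: string
  latestRootCauseAnalysis: RootCauseAnalysis | null
  rootCauseAnalysisCount: number
}

// Expose only the most recent analysis plus a count, so consumers
// never see older analyses that may contradict the latest one.
function toErrorGroupJson (group: RawErrorGroup): ErrorGroupJson {
  const analyses = group.rootCauseAnalyses ?? []
  // Compare timestamps instead of trusting the array order.
  const latest = analyses.reduce<RootCauseAnalysis | null>(
    (best, current) =>
      best === null || Date.parse(current.createdAt) > Date.parse(best.createdAt)
        ? current
        : best,
    null,
  )
  return {
    id: group.id,
    latestRootCauseAnalysis: latest,
    rootCauseAnalysisCount: analyses.length,
  }
}

// Cell value for the RCA column in the error groups table.
function rcaCell (group: RawErrorGroup): string {
  return (group.rootCauseAnalyses?.length ?? 0) > 0 ? 'Yes' : '-'
}
```

Picking the latest entry by timestamp rather than by position keeps the output stable even if the API changes the array order.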

RCA trigger and get commands

  • Add checkly rca run --error-group <id> to trigger a root cause analysis, with --watch to poll until complete
  • Add checkly rca get <rcaId> to retrieve an existing analysis, with --watch to wait if still generating
  • Both commands support --output detail|json|md formats
  • Polling uses HTTP 202 to detect in-progress analyses (404 is now a real "not found")
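The --watch behaviour could be sketched roughly as below: keep polling while the endpoint answers 202, stop on 200, and treat 404 as a genuine miss. The function name, the option names, and the default interval and attempt limit are all assumptions, not the values used in this PR.

```typescript
interface PollResponse<T> {
  status: number
  body?: T
}

interface PollOptions {
  intervalMs?: number // assumed default, not taken from the PR
  maxAttempts?: number // assumed default, not taken from the PR
}

// Poll until the analysis is complete. `fetchOnce` is injected so the
// loop stays independent of any particular HTTP client.
async function pollUntilComplete<T> (
  fetchOnce: () => Promise<PollResponse<T>>,
  options: PollOptions = {},
): Promise<T> {
  const intervalMs = options.intervalMs ?? 2000
  const maxAttempts = options.maxAttempts ?? 150
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchOnce()
    if (response.status === 200 && response.body !== undefined) {
      return response.body
    }
    // 404 means the resource really does not exist; do not retry.
    if (response.status === 404) {
      throw new Error('Root cause analysis not found')
    }
    // Anything other than 202 (in progress) is unexpected.
    if (response.status !== 202) {
      throw new Error(`Unexpected status ${response.status}`)
    }
    if (attempt < maxAttempts) {
      await new Promise<void>(resolve => setTimeout(resolve, intervalMs))
    }
  }
  throw new Error(`Analysis still in progress after ${maxAttempts} attempts`)
}
```

Separating "in progress" (202) from "not found" (404) lets the loop fail fast on a mistyped ID instead of polling until the timeout.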

RCA API client

  • Add rca.trigger(errorGroupId) — POST to /v1/root-cause-analyses/error-groups/{errorGroupId}
  • Add rca.get(id) — GET from /v1/root-cause-analyses/{id}
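For reference, the two client methods map to the endpoints above roughly as follows. This is a sketch with an injected transport; the `RcaClient` and `Transport` names are hypothetical and the real client in the CLI may be structured differently.

```typescript
// Hypothetical transport signature: sends a request and resolves with the parsed response.
type Transport = (method: 'GET' | 'POST', path: string) => Promise<unknown>

class RcaClient {
  private readonly send: Transport

  constructor (send: Transport) {
    this.send = send
  }

  // POST /v1/root-cause-analyses/error-groups/{errorGroupId}
  trigger (errorGroupId: string): Promise<unknown> {
    return this.send(
      'POST',
      `/v1/root-cause-analyses/error-groups/${encodeURIComponent(errorGroupId)}`,
    )
  }

  // GET /v1/root-cause-analyses/{id}
  get (id: string): Promise<unknown> {
    return this.send('GET', `/v1/root-cause-analyses/${encodeURIComponent(id)}`)
  }
}
```

Encoding the path parameters is a defensive choice in this sketch, so an unexpected character in an ID cannot alter the request path.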

Bug fix

  • Fix Windows CI failure in init command test — path assertion was hardcoded with Unix separators
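To illustrate the kind of failure being fixed here: a hardcoded forward-slash path only matches what `path.join` produces on POSIX systems. The path segments below are hypothetical and are not the ones used in the init command test.

```typescript
import * as path from 'node:path'

// Hypothetical path segments, for illustration only.
const segments = ['my-project', '__checks__', 'api.check.ts']

// A hardcoded expectation like this only holds on POSIX systems.
const hardcoded = 'my-project/__checks__/api.check.ts'

// path.posix and path.win32 expose both behaviours on any platform.
console.log(path.posix.join(...segments) === hardcoded) // true
console.log(path.win32.join(...segments) === hardcoded) // false

// Building the expected value with path.join keeps the assertion
// aligned with whatever separator the current platform uses.
const expected = path.join(...segments)
console.log(expected.split(path.sep).length) // 3
```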

Screenshot

image

Test plan

  • Run checkly rca run --error-group <egId> --watch — verify it triggers and polls until complete
  • Run checkly rca get <rcaId> — verify it retrieves a completed analysis
  • Run checkly rca get <rcaId> --watch — verify it polls on 202 and displays result on completion
  • Run checkly checks get <checkId> — verify RCA and Error Group ID columns appear in error groups table
  • Run checkly checks get <checkId> --error-group <egId> with an error group that has RCA — verify full RCA detail renders
  • Run checkly checks get <checkId> --output json — verify latestRootCauseAnalysis and rootCauseAnalysisCount in error groups
  • npx vitest --run passes all tests
  • Windows CI passes

🤖 Generated with Claude Code

thebiglabasky and others added 4 commits March 27, 2026 22:31
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@MichaelHogers MichaelHogers left a comment

Three observations from testing with CC:

  1. --result is a dead end. No link back to the error group or its RCA. Adding Error group: checkly checks get --error-group would help, especially for agents navigating programmatically. (need to double check this manually)

  2. RCA: - is ambiguous. Could mean not entitled, not triggered, or not available. For non-entitled accounts the API just returns an empty array — user has no idea RCA exists. A tip pointing to checkly account plan when the entitlement is missing would be a natural upsell moment.

  3. CLI can't trigger RCA (future). The public POST endpoint exists but CLI is read-only. A --analyze flag would close the loop.

2 is worth looking into imo? Might be easy to pull in given the earlier plan/entitlement work.

@thebiglabasky
Contributor Author

I'll check 1.

2 is a good point: the agent generally verifies this itself, but it's worth distinguishing if possible. The problem I anticipate is that it requires an extra call to fetch your entitlements for that context, in parallel with the call that gets the error groups, so I don't think there's a simple solution. I'd lean towards leaving it as is: attempting to run the RCA itself returns a 402 Payment Required, which already pushes towards an upgrade, since agents seeing this should naturally turn to the account plan command?
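As a rough illustration of that idea, the status codes mentioned in this thread could map to hints like this. The function name, the outcome shape, and the hint wording are hypothetical; only the 202, 402 and 404 semantics and the account plan command come from the discussion.

```typescript
type RcaOutcome =
  | { kind: 'ready' }
  | { kind: 'pending', hint: string }
  | { kind: 'error', hint: string }

// Map an HTTP status from the RCA endpoints to a user-facing hint.
function interpretRcaStatus (status: number): RcaOutcome {
  if (status === 202) {
    return { kind: 'pending', hint: 'Analysis is still being generated. Use --watch to wait for it.' }
  }
  if (status === 402) {
    // Payment Required: point the user (or agent) at the plan command.
    return { kind: 'error', hint: 'Root cause analysis is not available on this plan. Run `checkly account plan` for details.' }
  }
  if (status === 404) {
    return { kind: 'error', hint: 'No root cause analysis found for this ID.' }
  }
  if (status >= 200 && status < 300) {
    return { kind: 'ready' }
  }
  return { kind: 'error', hint: `Unexpected response status ${status}.` }
}
```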

3 is done in a separate branch sitting on top of this one already 🤓

@thebiglabasky
Contributor Author

@MichaelHogers re: 1

The fields exist in the database but are not exposed in the public API — the Joi schema with stripUnknown strips them out.

So for linking --result back to an error group, there are two options:

A) Add errorGroupId/errorGroupIds to the public check results schema: a small API change that adds the field to the response. Then the CLI can show "Error group: checkly checks get --error-group " in the result detail navigation hints. This is a backend change.

I created an issue for it. So I'll need to make a BE change before we can fix this. I think it's worth doing so the command makes sense.

thebiglabasky and others added 10 commits April 2, 2026 14:00
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@thebiglabasky thebiglabasky enabled auto-merge (squash) April 2, 2026 12:23
thebiglabasky and others added 3 commits April 2, 2026 14:36
The public RCA GET endpoint now returns 202 while analysis is in
progress and 404 only for genuinely missing resources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	packages/cli/src/formatters/__tests__/__snapshots__/checks.spec.ts.snap
#	packages/cli/src/formatters/__tests__/rca.spec.ts
#	packages/cli/src/formatters/checks.ts
#	packages/cli/src/formatters/rca.ts
The assertion hardcoded a Unix-style path, failing on Windows CI
where path.join produces backslashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@thebiglabasky thebiglabasky changed the title feat: surface root cause analysis in checks get command [AI-190] feat: add RCA commands and surface root cause analysis in checks get [AI-190] Apr 2, 2026
@thebiglabasky thebiglabasky force-pushed the herve/add-rca-check-results branch from b85401b to 0735a92 Compare April 2, 2026 13:04
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@thebiglabasky thebiglabasky force-pushed the herve/add-rca-check-results branch from 0735a92 to ec7172e Compare April 2, 2026 13:06
Show a suggestion to run RCA for error groups that don't have one yet
in the checks get terminal output. Move the shared polling logic into
the Rca API client class to deduplicate across rca run and rca get.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@thebiglabasky thebiglabasky merged commit 9bdcdb2 into main Apr 2, 2026
4 checks passed
@thebiglabasky thebiglabasky deleted the herve/add-rca-check-results branch April 2, 2026 13:58