feat: add RCA commands and surface root cause analysis in checks get [AI-190] #1271
thebiglabasky merged 19 commits into main
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MichaelHogers left a comment:
Three observations from testing with CC:
1. --result is a dead end. There is no link back to the error group or its RCA. Adding Error group: checkly checks get --error-group would help, especially for agents navigating programmatically. (Need to double-check this manually.)
2. RCA: - is ambiguous. It could mean not entitled, not triggered, or not available. For non-entitled accounts the API just returns an empty array, so the user has no idea RCA exists. A tip pointing to checkly account plan when the entitlement is missing would be a natural upsell moment.
3. CLI can't trigger RCA (future). The public POST endpoint exists but the CLI is read-only. A --analyze flag would close the loop.
2 is worth looking into, imo. It might be easy to pull in given the earlier plan/entitlement work.
I'll check 1. 2 is a good point: it is verified by the agent in general, but worth distinguishing if possible. The problem I anticipate is that this requires an extra call to your entitlements to have that context, and that's kind of parallel to the call getting the error groups, so I don't think there's a simple solution. I'd lean towards leaving it as is: the attempt at calling the RCA itself returns a 402 Payment Required, which is the push to upgrade, as agents seeing this should naturally turn to the account plan command. 3 is done in a separate branch sitting on top of this one already 🤓
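The fallback described in this reply, where a 402 from the RCA call nudges the caller toward the plan command, could look roughly like the sketch below. The function name `hintForRcaStatus` and the hint wording are hypothetical; only the 402 status and the `checkly account plan` command come from the thread.

```typescript
// Hypothetical helper: maps the HTTP status of an RCA request to a hint line.
// Only the 402 status and the `checkly account plan` command come from the
// discussion; the function name and message wording are illustrative.
function hintForRcaStatus(status: number): string | undefined {
  if (status === 402) {
    return "RCA is not included in your plan. Run `checkly account plan` to see options.";
  }
  // No hint for other statuses in this sketch.
  return undefined;
}
```

This keeps the entitlement check out of the error-group listing path, at the cost of only surfacing the hint once an RCA call is actually attempted.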
@MichaelHogers re: 1: the fields exist in the database but are not exposed in the public API; the Joi schema with stripUnknown strips them out. So for linking --result back to an error group, there are two options:
A) Add errorGroupId/errorGroupIds to the public check results schema: a small API change that adds the field to the response. Then the CLI can show "Error group: checkly checks …"
I created an issue for it. So I'll need to make a BE change before we can fix this. I think it's worth doing so the command makes sense.
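To illustrate why the fields never reach the CLI, here is a dependency-free sketch of what stripping unknown keys does to a response. It does not use Joi; the `stripUnknown` function, the `row` object, and the `publicFields` list are all made up for illustration, and only the `errorGroupIds` field name comes from the comment above.

```typescript
// Emulates "strip unknown keys" with an allow-list. This is NOT Joi itself,
// only an illustration of the effect described in the comment.
function stripUnknown(
  record: Record<string, unknown>,
  allowed: readonly string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of allowed) {
    if (key in record) out[key] = record[key];
  }
  return out;
}

// Hypothetical database row and public field list.
const row = { id: "result-1", hasFailures: true, errorGroupIds: ["eg-1"] };
const publicFields = ["id", "hasFailures"];

// `errorGroupIds` is absent from `response` until it is added to the allow-list.
const response = stripUnknown(row, publicFields);
```

Under this model, option A amounts to adding the field to the allow-list, which is why it is described as a small API change.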
The public RCA GET endpoint now returns 202 while analysis is in progress and 404 only for genuinely missing resources.
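A client reading that endpoint might branch on the status roughly as follows. This is a sketch under assumptions: the 202 and 404 meanings follow the commit message, while treating 200 as "complete" and everything else as "error" is a guess, and the names `RcaFetchState` and `classifyRcaStatus` are hypothetical.

```typescript
type RcaFetchState = "complete" | "pending" | "missing" | "error";

// 202 and 404 follow the commit message. Mapping 200 to "complete" and all
// other statuses to "error" is an assumption of this sketch.
function classifyRcaStatus(status: number): RcaFetchState {
  switch (status) {
    case 200:
      return "complete";
    case 202:
      return "pending"; // analysis still in progress, safe to poll again
    case 404:
      return "missing"; // resource genuinely does not exist
    default:
      return "error";
  }
}
```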
# Conflicts:
#	packages/cli/src/formatters/__tests__/__snapshots__/checks.spec.ts.snap
#	packages/cli/src/formatters/__tests__/rca.spec.ts
#	packages/cli/src/formatters/checks.ts
#	packages/cli/src/formatters/rca.ts
The assertion hardcoded a Unix-style path, failing on Windows CI where path.join produces backslashes.
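The separator difference behind this failure can be reproduced on any platform with Node's `path.posix` and `path.win32` variants. In this sketch the path segments are hypothetical and are not the ones used in the actual test.

```typescript
import * as path from "node:path";

// Build the expected value with the same helper the code under test uses,
// so the separator matches the platform running the test.
const expected = path.join("project", "__checks__", "api.check.ts");

// The two platform-specific variants show why a hardcoded string breaks.
const posixPath = path.posix.join("project", "__checks__", "api.check.ts");
const win32Path = path.win32.join("project", "__checks__", "api.check.ts");
```

Asserting against `expected` rather than a literal string keeps the test independent of the CI operating system.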
b85401b to 0735a92
0735a92 to ec7172e
Show a suggestion to run RCA for error groups that don't have one yet in the checks get terminal output. Move the shared polling logic into the Rca API client class to deduplicate across rca run and rca get.
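Shared polling of this kind could be shaped roughly like the sketch below. It is not the code from this PR: `pollUntilComplete`, `fetchOnce`, `intervalMs`, and `maxAttempts` are hypothetical names, and the assumption that 200 carries the finished analysis while 202 means "keep waiting" is inferred from the commit messages.

```typescript
// Result of a single GET in this sketch: the HTTP status plus an optional body.
interface RcaResponse<T> {
  status: number;
  body?: T;
}

// Polls `fetchOnce` until it returns 200 with a body, or gives up after
// `maxAttempts` polls. Any status other than 202 (including a 200 without a
// body) is treated as unexpected and throws.
async function pollUntilComplete<T>(
  fetchOnce: () => Promise<RcaResponse<T>>,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetchOnce();
    if (res.status === 200 && res.body !== undefined) return res.body;
    if (res.status !== 202) throw new Error(`Unexpected status ${res.status}`);
    // Wait before the next poll, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error("Timed out waiting for analysis to complete");
}
```

Taking the fetch function as a parameter is what lets two commands share one loop: each passes its own request and reuses the same wait and timeout behavior.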
Summary
RCA in checks get
- RCA column (Yes/-) added to the error groups table in checks get <id>, so users can see at a glance which error groups have root cause analysis available
- checks get <id> --error-group <egId> renders the full RCA detail when the error group has one
- JSON output exposes latestRootCauseAnalysis (singular, most recent) + rootCauseAnalysisCount instead of the raw array, to avoid surfacing potentially contradictory historical analyses to agents

RCA trigger and get commands
- checkly rca run --error-group <id> to trigger a root cause analysis, with --watch to poll until complete
- checkly rca get <rcaId> to retrieve an existing analysis, with --watch to wait if still generating
- --output detail|json|md formats

RCA API client
- rca.trigger(errorGroupId): POST to /v1/root-cause-analyses/error-groups/{errorGroupId}
- rca.get(id): GET from /v1/root-cause-analyses/{id}

Bug fix
- init command test: path assertion was hardcoded with Unix separators

Screenshot
Test plan
- checkly rca run --error-group <egId> --watch: verify it triggers and polls until complete
- checkly rca get <rcaId>: verify it retrieves a completed analysis
- checkly rca get <rcaId> --watch: verify it polls on 202 and displays result on completion
- checkly checks get <checkId>: verify RCA and Error Group ID columns appear in error groups table
- checkly checks get <checkId> --error-group <egId> with an error group that has RCA: verify full RCA detail renders
- checkly checks get <checkId> --output json: verify latestRootCauseAnalysis and rootCauseAnalysisCount in error groups
- npx vitest --run passes all tests

🤖 Generated with Claude Code
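The JSON shape checked in the test plan, one most recent analysis plus a count instead of the raw array, can be derived along these lines. This is a minimal sketch: the `RootCauseAnalysis` fields (`id`, `createdAt`) and the `summarizeRcas` function are hypothetical, and only the `latestRootCauseAnalysis` and `rootCauseAnalysisCount` names come from the PR description.

```typescript
// Hypothetical minimal shape of one analysis.
interface RootCauseAnalysis {
  id: string;
  createdAt: string; // ISO 8601 timestamp (assumed)
}

// The two field names below are taken from the PR description.
interface RcaSummary {
  latestRootCauseAnalysis: RootCauseAnalysis | null;
  rootCauseAnalysisCount: number;
}

// Collapses the raw array into the most recent analysis plus a total count.
function summarizeRcas(analyses: readonly RootCauseAnalysis[]): RcaSummary {
  let latest: RootCauseAnalysis | null = null;
  for (const rca of analyses) {
    if (latest === null || Date.parse(rca.createdAt) > Date.parse(latest.createdAt)) {
      latest = rca;
    }
  }
  return { latestRootCauseAnalysis: latest, rootCauseAnalysisCount: analyses.length };
}
```

Exposing the count alongside the single latest entry still tells a reader that older analyses exist, without placing their possibly conflicting conclusions in the output.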