Refactor CLI tests: subprocess → integration, fast unit tests by amikofalvy · Pull Request #2944 · inkeep/agents

amikofalvy · 2026-04-01T06:14:54Z

Summary

Extract createProgram() from index.ts into program.ts so CLI command registration is importable without side effects (env loading, instrumentation)
Move subprocess-based cli.test.ts (real execSync spawning) to integration/cli-subprocess.test.ts
Replace with fast unit tests using Commander's exitOverride() and programmatic API — 14ms vs 33s (2,355x faster)
Integration tests chained in test script so they always run in CI after unit tests pass
Also available standalone via pnpm test:integration

Timing impact

Metric	Before	After
cli.test.ts execution	32,986ms (10 subprocess tests)	14ms (9 unit tests)
Integration tests	included in default run	chained after unit tests
CI behavior	unchanged — both unit + integration run

Files changed

File	Change
`src/program.ts`	New — extracted `createProgram()` from index.ts
`src/index.ts`	Simplified to env/instrumentation + `createProgram().parse()`
`src/__tests__/cli.test.ts`	Rewritten with mocked Commander tests
`src/__tests__/integration/cli-subprocess.test.ts`	New — moved original subprocess tests
`vitest.integration.config.ts`	New — config for integration tests
`vitest.config.ts`	Exclude `/integration/`
`vitest.config.ci.ts`	Exclude `/integration/`
`package.json`	Chain integration tests in `test` script, add `test:integration`

Test plan

Unit tests pass: pnpm test from agents-cli
Integration tests pass: pnpm test:integration from agents-cli
Full suite from root: pnpm test
CI runs both unit and integration tests

🤖 Generated with Claude Code

…it tests Extract `createProgram()` from index.ts into program.ts so CLI command registration can be imported without side effects (env loading, instrumentation). This enables fast unit testing via Commander's programmatic API. Move the subprocess-based cli.test.ts to integration/cli-subprocess.test.ts (real `execSync` spawning of the compiled binary). Replace with new unit tests that use Commander's `exitOverride()` — runs in 14ms vs 33s. Integration tests are chained in the `test` script so they run in CI after unit tests pass. Also available standalone via `pnpm test:integration`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vercel · 2026-04-01T06:15:00Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Apr 1, 2026 6:39am
agents-docs	Ready	Preview, Comment	Apr 1, 2026 6:39am
agents-manage-ui	Ready	Preview, Comment	Apr 1, 2026 6:39am

changeset-bot · 2026-04-01T06:15:00Z

⚠️ No Changeset found

Latest commit: 4d7b576

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

socket-security · 2026-04-01T06:16:08Z

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Package	Supply Chain Security	Quality	Maintenance
next@16.2.1
@nangohq/frontend@0.69.46 ⏵ 0.69.41
@nangohq/node@0.69.46 ⏵ 0.69.41
@inkeep/agents-ui-cloud@0.15.25
@nangohq/types@0.69.46 ⏵ 0.69.41
@inkeep/docskit@0.0.8
@inkeep/agents-ui@0.15.20 ⏵ 0.15.23	⁺¹	⁺¹	⁺¹
@inkeep/agents-ui@0.15.20 ⏵ 0.15.25	⁺¹	⁺¹	⁺¹
@ai-sdk/azure@3.0.49 ⏵ 3.0.7	⁺²⁶	^-3
@inkeep/agents-manage-ui@0.59.4 ⏵ 0.41.1	⁺¹	⁺¹
@babel/parser@7.29.2 ⏵ 7.29.0
@hono/swagger-ui@0.5.3 ⏵ 0.5.2	⁺¹	⁺¹	^-8
drizzle-orm@0.44.7
@hono/zod-openapi@1.2.3 ⏵ 1.2.0	⁺¹		^-3
@opentelemetry/exporter-jaeger@2.6.0 ⏵ 2.3.0	⁺²⁷		^-3
zustand@5.0.9
@hono/vite-dev-server@0.23.0 ⏵ 0.20.1			⁺⁴
@electric-sql/pglite@0.3.16 ⏵ 0.3.14	⁺⁵		^-3
@eslint/js@9.39.2
@changesets/cli@2.29.7
@anthropic-ai/mcpb@1.2.0
@aws-sdk/client-s3@3.1016.0 ⏵ 3.995.0	⁺¹
@openrouter/ai-sdk-provider@2.3.3 ⏵ 2.2.3	⁺²⁶		⁺¹
@modelcontextprotocol/sdk@1.28.0 ⏵ 1.26.0	⁺¹
@biomejs/biome@2.3.11

View full report

claude

PR Review Summary

(1) Total Issues | Risk: Low

🟡 Minor (1) 🟡

Inline Comments:

🟡 Minor: vitest.integration.config.ts:25 Missing retry: 2 for subprocess test flakiness

💭 Consider (2) 💭

💭 1) cli.test.ts Add handler invocation tests

Issue: The new unit tests verify command registration and option definitions via static inspection, but do not verify that command handlers are actually invoked when commands are parsed. All command modules are mocked with vi.fn() but never asserted.

Why: The original subprocess tests implicitly verified handler invocation by observing real output/errors. If a command registration accidentally omits .action(handler) or binds the wrong handler, these tests would not catch it. Low risk since integration tests still run.

Fix: Add tests that parse a command and verify the mocked handler was called:

it('should invoke pushCommand handler with parsed options', async () => {
  const { pushCommand } = await import('../commands/push');
  parseArgs(createProgram(), ['push', '--project', 'my-proj', '--force']);
  expect(pushCommand).toHaveBeenCalledWith(expect.objectContaining({ project: 'my-proj', force: true }));
});

Inline Comments:

💭 Consider: cli.test.ts:103 Add test for push --help output rendering

💡 APPROVE WITH SUGGESTIONS

Summary: Clean refactoring that delivers a 2,355x speedup for CLI tests. The extraction of createProgram() is well-structured with proper separation of side effects. Two small suggestions: add retry: 2 to integration config for flakiness resilience, and consider adding handler invocation assertions to unit tests for completeness. Nice work! 🚀

Discarded (2)

Location	Issue	Reason Discarded
`package.json:23`	Integration tests chained differently than agents-api pattern	Design choice documented in PR description — intentional deviation with valid justification
`cli.test.ts:145`	Subcommand registration test does not verify config/profile subcommands	Info-level completeness suggestion, not a bug or correctness issue

Reviewers (3)

Reviewer	Returned	Consider	Inline Comments	Discarded
`pr-review-standards`	0	0	0	0
`pr-review-devops`	2	0	1	1
`pr-review-tests`	3	1	1	1
Total	5	1	2	2

claude · 2026-04-01T06:18:14Z

agents-cli/vitest.integration.config.ts

+      },
+    },
+    maxConcurrency: 1,
+  },


🟡 Minor: Missing retry option for flaky subprocess tests

Issue: The integration test config does not include the retry: 2 option present in vitest.config.ci.ts.

Why: Subprocess-based tests are inherently more prone to flakiness (process spawn timing, I/O, timeouts). The CI config retries tests twice to handle transient failures. Without this, integration tests may produce false-negative failures in CI.

Fix:

Suggested change

},

maxConcurrency: 1,

retry: 2,

},

Refs:

vitest.config.ci.ts:35 — existing retry config

claude · 2026-04-01T06:18:19Z

agents-cli/src/__tests__/cli.test.ts

+      expect(pushCmd).toBeDefined();
+      expect(pushCmd!.description()).toContain('Push a project configuration');
    });
  });


💭 Consider: Add test for push --help output

Issue: The original integration tests verified that push --help displays correct content including --agents-api-url. The new unit tests inspect options via Commander's internal API but don't test the rendered help output.

Why: If help text generation breaks for subcommands, this would go undetected until integration tests run. A fast unit test provides earlier feedback.

Fix: Add after line 103:

it('should display help for push command', () => { const { stdout } = parseArgs(createProgram(), ['push', '--help']); expect(stdout).toContain('Push a project configuration'); expect(stdout).toContain('--agents-api-url'); });

Refs:

cli-subprocess.test.ts:154-160 — original test

pullfrog · 2026-04-01T06:18:52Z

TL;DR — Splits the agents-cli test suite into fast in-process unit tests and subprocess-based integration tests by extracting createProgram() into its own module. CI behavior is unchanged — both tiers run — but the default dev loop is dramatically faster since unit tests no longer spawn child processes.

Key changes

Extract createProgram() into program.ts — Moves all Commander command registration out of index.ts into a side-effect-free factory function, making the CLI command tree importable without triggering env loading or instrumentation
Rewrite cli.test.ts as fast unit tests — Replaces execSync subprocess spawning with Commander's exitOverride() and mocked command modules, exercising the CLI entirely in-process
Move original subprocess tests to integration/cli-subprocess.test.ts — Preserves all end-to-end tests that exercise the compiled binary, relocated to a dedicated integration directory
Add vitest.integration.config.ts and chain integration tests in test and test:ci — New config runs integration tests sequentially after unit tests with retry: 2 for flaky subprocess resilience; also available standalone via pnpm test:integration
Exclude **/integration/** from default vitest configs — Both vitest.config.ts and vitest.config.ci.ts skip integration tests so the fast unit suite runs in isolation

_{Summary ｜ 8 files ｜ 4 commits ｜ base: main ← refactor/cli-test-subprocess-to-integration}

Side-effect-free CLI program factory

Before: index.ts was a monolithic entrypoint — env loading, instrumentation, logger silencing, all Commander command registration, and program.parse() in a single file (~245 lines).
After: index.ts is a thin 15-line shell (side effects + createProgram().parse()), while program.ts exports the pure command tree factory.

The createProgram() function registers all 13 commands (push, pull, list-agent, config, dev, login, logout, status, profile, add, init, update, whoami) and returns the configured Command instance. No runtime behavior changes — the extraction is purely structural.

program.ts · index.ts

In-process unit tests via Commander's programmatic API

Before: Every test assertion spawned a child process (execSync node ./dist/index.js), requiring a built binary on disk, 30s timeouts, and global.gc() cleanup between runs.
After: Tests import createProgram() directly, mock all command modules with vi.mock(), and exercise the CLI through Commander's parse() with captured stdout/stderr.

The rewritten suite validates version output, help text, command registration (all 13 commands), option definitions, unknown-command error handling, and CLI naming — all without touching the filesystem or spawning processes. Uses optional chaining (?.) throughout for safe property access on Commander objects.

cli.test.ts

Two-tier test execution strategy

Before: pnpm test ran all tests (including slow subprocess tests) in a single vitest invocation.
After: pnpm test chains unit tests first, then integration tests. Integration tests are also available standalone via pnpm test:integration.

Script	Runs
`pnpm test`	Unit tests → Integration tests (both must pass)
`pnpm test:ci`	CI unit tests → Integration tests (both must pass)
`pnpm test:integration`	Integration tests only

The integration config enforces sequential execution (singleThread: true, maxConcurrency: 1) with a 120s test timeout and retry: 2 to accommodate subprocess overhead and flakiness.

package.json · vitest.integration.config.ts · vitest.config.ts · vitest.config.ci.ts

^{｜ View workflow run ｜ Triggered by Pullfrog ｜ Using Claude Opus ｜ 𝕏}

pullfrog

Clean refactor — the createProgram() extraction is correct, the unit tests are dramatically faster, and there's no behavior change to the CLI itself. Two things need attention before merge.

^{｜ Fix all ➔ ｜ Fix 👍s ➔ ｜ View workflow run ｜ Using Claude Opus ｜ 𝕏}

pullfrog · 2026-04-01T06:19:39Z

agents-cli/package.json

    "test:debug": "vitest --run --reporter=verbose --no-coverage",
    "test:watch": "vitest",
    "test:coverage": "node -e \"process.exit(process.env.CI ? 0 : 1)\" && vitest --run --coverage --config vitest.config.ci.ts || vitest --run --coverage",
    "test:ci": "vitest --run --config vitest.config.ci.ts",


test:ci only runs the CI unit config and never chains integration tests. If any CI workflow calls test:ci directly (instead of test), integration tests silently disappear.

Either chain integration here too:

"test:ci": "vitest --run --config vitest.config.ci.ts && vitest --run --config vitest.integration.config.ts",

…or rename it to test:unit:ci so the omission is explicit.

pullfrog · 2026-04-01T06:19:39Z

agents-cli/src/__tests__/integration/cli-subprocess.test.ts

+    await new Promise((resolve) => setTimeout(resolve, 100));
+    // Force garbage collection to clean up any hanging references
+    if (global.gc) {


The 100ms sleep and global.gc() call were workarounds for subprocess-spawned processes not terminating cleanly. In the integration suite these same subprocess tests still exist so the sleep is arguably still relevant — but global.gc() is a no-op unless Node is started with --expose-gc, which none of the configs pass. Consider removing the GC block to avoid giving the impression it does something.

pullfrog · 2026-04-01T06:19:39Z

agents-cli/vitest.config.ci.ts

        vmThreads: false,
      },
    },
+    // Exclude integration tests (run separately via test:integration)


Nit: repo convention is no comments unless explicitly requested (AGENTS.md). This comment restates what the config key already says.

Suggested change

// Exclude integration tests (run separately via test:integration)

exclude: ['node_modules', 'dist', '**/integration/**'],

- Add retry: 2 to integration config for flaky subprocess tests - Chain integration tests in test:ci script - Remove no-op global.gc() from subprocess test cleanup - Remove unnecessary comment in CI config Skipping push --help unit test: Commander subcommand help doesn't propagate through parent's configureOutput(), so the programmatic option inspection is more reliable for unit tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude

PR Review Summary

(0) Total Issues | Risk: Low

All prior review feedback has been addressed in commit 66c77e8f2:

Prior Feedback	Status
🟡 Missing `retry: 2` in integration config	✅ Added at line 25
🔴 `test:ci` missing integration test chain	✅ Fixed in package.json
💭 Dead `global.gc()` code	✅ Removed from cli-subprocess.test.ts
💭 Comment violating repo convention	✅ Removed from vitest.config.ci.ts

🕐 Pending Recommendations (0)

None — all prior feedback has been addressed.

✅ APPROVE

Summary: Clean refactoring that delivers a 2,355x speedup for CLI tests (14ms vs 33s). The extraction of createProgram() is well-structured with proper separation of side effects. All prior review feedback has been addressed — retry: 2 added, test:ci now chains integration tests, dead global.gc() removed, and the stray comment eliminated. Ship it! 🚀

Reviewers (0)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
Total	0	0	0	0	0	0	0

Note: No sub-reviewers dispatched for this re-review — delta contained only fix commits addressing prior feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

claude

PR Review Summary

(0) Total Issues | Risk: Low

Delta Analysis

The only change since the last review (66c77e8f2) is commit 4d7b5764 which replaces 4 non-null assertions (!) with optional chaining (?.) in cli.test.ts:

- expect(pushCmd!.description()).toContain(...)
+ expect(pushCmd?.description()).toContain(...)

This is a pure lint fix — no behavioral changes, no logic modifications. The assertions still verify the same conditions, just with safer syntax that satisfies Biome's style rules.

Prior Feedback Status

All feedback from previous reviews has been addressed:

Prior Feedback	Status
🟡 Missing `retry: 2` in integration config	✅ Addressed in `66c77e8f2`
🔴 `test:ci` missing integration test chain	✅ Addressed in `66c77e8f2`
💭 Dead `global.gc()` code	✅ Addressed in `66c77e8f2`
💭 Comment violating repo convention	✅ Addressed in `66c77e8f2`

✅ APPROVE

Summary: The delta since last review is a trivial lint fix (non-null assertions → optional chaining). All prior review feedback remains addressed. This PR is ready to merge. 🚀

Reviewers (0)

Reviewer	Returned	Main Findings	Consider	While You're Here	Inline Comments	Pending Recs	Discarded
Total	0	0	0	0	0	0	0

Note: No sub-reviewers dispatched — delta was a trivial lint fix with no behavioral changes.

vercel bot deployed to Preview – agents-api April 1, 2026 06:17 View deployment

vercel bot deployed to Preview – agents-manage-ui April 1, 2026 06:17 View deployment

vercel bot deployed to Preview – agents-docs April 1, 2026 06:17 View deployment

claude bot reviewed Apr 1, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Apr 1, 2026

pullfrog bot reviewed Apr 1, 2026

View reviewed changes

Merge branch 'main' into refactor/cli-test-subprocess-to-integration

34b6ef6

vercel bot deployed to Preview – agents-manage-ui April 1, 2026 06:30 View deployment

github-actions bot deleted a comment from claude bot Apr 1, 2026

vercel bot deployed to Preview – agents-manage-ui April 1, 2026 06:33 View deployment

vercel bot deployed to Preview – agents-docs April 1, 2026 06:33 View deployment

vercel bot deployed to Preview – agents-api April 1, 2026 06:33 View deployment

claude bot reviewed Apr 1, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Apr 1, 2026

Fix lint: replace non-null assertions with optional chaining

4d7b576

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vercel bot deployed to Preview – agents-manage-ui April 1, 2026 06:38 View deployment

claude bot reviewed Apr 1, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Apr 1, 2026

vercel bot deployed to Preview – agents-api April 1, 2026 06:39 View deployment

vercel bot deployed to Preview – agents-docs April 1, 2026 06:39 View deployment

amikofalvy enabled auto-merge April 1, 2026 06:43

amikofalvy added this pull request to the merge queue Apr 1, 2026

Merged via the queue into main with commit 2752b13 Apr 1, 2026
26 checks passed

amikofalvy deleted the refactor/cli-test-subprocess-to-integration branch April 1, 2026 06:55

	// Exclude integration tests (run separately via test:integration)
	exclude: ['node_modules', 'dist', '/integration/'],

Conversation

amikofalvy commented Apr 1, 2026

Summary

Timing impact

Files changed

Test plan

Uh oh!

vercel bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

socket-security bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🟡 Minor (1) 🟡

💭 Consider (2) 💭

💡 APPROVE WITH SUGGESTIONS

Uh oh!

claude bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Key changes

Side-effect-free CLI program factory

In-process unit tests via Commander's programmatic API

Two-tier test execution strategy

Uh oh!

pullfrog bot left a comment

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

pullfrog bot Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

🕐 Pending Recommendations (0)

✅ APPROVE

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

PR Review Summary

Delta Analysis

Prior Feedback Status

✅ APPROVE

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel bot commented Apr 1, 2026 •

edited

Loading

changeset-bot bot commented Apr 1, 2026 •

edited

Loading

socket-security bot commented Apr 1, 2026 •

edited

Loading

pullfrog bot commented Apr 1, 2026 •

edited

Loading