Skip to content

test_runner: add flaky option to retry on failure#63661

Open
Han5991 wants to merge 3 commits into
nodejs:mainfrom
Han5991:feat/test-runner-flaky
Open

test_runner: add flaky option to retry on failure#63661
Han5991 wants to merge 3 commits into
nodejs:mainfrom
Han5991:feat/test-runner-flaky

Conversation

@Han5991
Copy link
Copy Markdown
Contributor

@Han5991 Han5991 commented May 30, 2026

This feature is based on the flaky proposal in the nodejs/test-runner repo, which I used as the reference for the API surface and semantics. A test can be marked flaky so the runner re-runs it until it passes — useful for tests with unavoidable nondeterminism.

Supersedes #61746 (original work by @vespa7, credited as co-author); this is a fresh implementation on current main.

What it adds

  • flaky: true → retry up to 20 times; flaky: <positive integer> → explicit budget; flaky: false → off. Invalid values throw at registration.
  • Works on tests and suites, plus the it.flaky / test.flaky / describe.flaky / suite.flaky shorthands. Nearest value wins (a test-case overrides its suite).
  • Only the final attempt is observable: intermediate failures emit no test:fail, no per-test diagnostics, and nothing on the node.test error channel.
  • New retryCount field on test:pass / test:fail / test:complete; reporters print a # FLAKY … directive; the run summary gains a flaky counter.
  • beforeEach/afterEach re-run per attempt (state reset between retries); before/after run once. External abort and expectFailure are not retried.

Points I'd appreciate eyes on

I've left inline review comments on the specific lines that would most benefit from review — the flaky-scoped assertion-plan read, the subtest crash fix and its known limitation, and the run() process-isolation parent re-counting.

Testing

New test/parallel/test-runner-flaky.js with deterministic, counter-based fixtures, plus a reporter snapshot test. Full test_runner suite passes; lint clean.

@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

Review requested:

  • @nodejs/test_runner

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels May 30, 2026
Copy link
Copy Markdown
Contributor Author

@Han5991 Han5991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few self-review notes on the trickier parts, anchored to the exact lines.

// (at first `.assert` access), preserving the historical counting behavior
// exactly - opting out of `flaky` changes nothing.
const capturedPlan = test.plan;
const currentPlan = () => (test.flakyRetries > 0 ? test.plan : capturedPlan);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the one change that touches the non-flaky path: the assertion plan is now read at use-time, but only for flaky tests (test.flakyRetries > 0); non-flaky tests keep the registration-time capture (capturedPlan). Please confirm non-flaky plan counting is unchanged — the regression guard is the assert-before-plan case in test/parallel/test-runner-flaky.js.

// final attempt creates no subtest - leaving outputSubtestCount > 0 with
// an empty `subtests`, which crashed report() (see the guard there).
this.subtests = [];
this.outputSubtestCount = 0;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtest crash fix (1/2): on retry we clear subtests and reset outputSubtestCount. Without the reset the count stayed > 0 while subtests was emptied, which crashed the reporter (see the guard in 2/2).

// `subtests` can be empty here when a flaky retry cleared it even though
// outputSubtestCount was non-zero; fall back to this test's own nesting
// rather than dereferencing subtests[0] (which would crash the run).
const subtestNesting = this.subtests.length ?
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subtest crash fix (2/2) + known limitation: this guards the empty-subtests dereference that aborted the run. Known limitation: intermediate-attempt subtests are emitted to the stream before the retry decision, so they cannot be recalled — the output can still contain leaked subtest lines and a slightly inflated # tests. Full fidelity would need buffering each attempt's events and replaying only the surviving one. Is the crash-fix-only version OK to land, with fidelity as a follow-up?

cancelled: kCanceledTests.has(item.data.details?.error?.failureType),
// A retried timeout that exhausted (retryCount > 0) is a failure, not a
// cancellation only an un-retried timeout/abort stays cancelled.
cancelled: kCanceledTests.has(item.data.details?.error?.failureType) &&
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run() process-isolation parent re-counts the child's serialized events, so two things are handled here: an exhausted flaky timeout (retryCount > 0) is promoted out of cancelled into a failure, and any flaky-marked test (retryCount present, even 0) counts toward flaky. This cross-process re-derivation is easy to get subtly wrong — please sanity-check.

publishError(err);
}
if (isTestFailureError(err)) {
if (err.failureType === kTestTimeoutFailure && this.flakyRetries === 0) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout handling: a non-flaky timeout stays cancelled (#cancel), but a flaky one goes through fail() so it can retry and, when the budget is exhausted, counts as a failure rather than a cancellation. The flakyRetries === 0 guard is what preserves the existing non-flaky behavior.

Comment thread lib/internal/test_runner/test.js Outdated
this.expectedAssertions = plan;
this.cancelled = false;
if (flaky === undefined || flaky === null) {
this.flakyRetries = this.parent?.flakyRetries ?? 0;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantics per the proposal: nearest-wins inheritance (this parent fallback), flaky: true → 20 retries (kDefaultFlakyRetries, L106; set just below), an explicit positive integer is used as-is, and 0/negative/non-integer values throw via validateInteger. Confirming these defaults match the proposal's intent would help.

@Han5991 Han5991 force-pushed the feat/test-runner-flaky branch from 8ec070e to b9fa345 Compare May 30, 2026 15:19
@Han5991 Han5991 marked this pull request as ready for review May 30, 2026 15:19
Add a `flaky` option that re-runs a failing test until it passes,
intended for tests with unavoidable nondeterminism. Setting
`flaky: true` retries up to 20 times; `flaky: <positive integer>`
sets an explicit retry budget. The option is accepted on tests and
suites and via the it.flaky/test.flaky/describe.flaky/suite.flaky
shorthands; a test-case value overrides an inherited suite value
(nearest wins), and `flaky: false` opts a test out.

Only the final attempt is observable: intermediate failures emit no
test:fail, no per-test diagnostics, and nothing on the node.test
error channel. Each result carries a new `retryCount` field on the
test:pass, test:fail, and test:complete events (the number of retries
performed, `undefined` for non-flaky tests), reporters print a
`# FLAKY` directive, and the run summary gains a `flaky` counter.

beforeEach/afterEach re-run on every attempt while before/after run
once, so per-attempt state is reset and retries do not leak state.
An externally aborted test and an expectFailure are not retried. A
flaky test whose timeout is exhausted is reported as a failure rather
than cancelled.

Co-authored-by: vespa7 <98526766+vespa7@users.noreply.github.com>
Signed-off-by: sangwook <rewq5991@gmail.com>
@Han5991 Han5991 force-pushed the feat/test-runner-flaky branch from b9fa345 to 454ccc0 Compare May 30, 2026 15:20
@codecov
Copy link
Copy Markdown

codecov Bot commented May 30, 2026

Codecov Report

❌ Patch coverage is 92.04152% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.31%. Comparing base (30a7e28) to head (a1989d8).
⚠️ Report is 8 commits behind head on main.

Files with missing lines Patch % Lines
lib/internal/test_runner/reporter/junit.js 0.00% 10 Missing ⚠️
lib/internal/test_runner/test.js 97.08% 5 Missing and 2 partials ⚠️
lib/internal/test_runner/reporter/utils.js 16.66% 4 Missing and 1 partial ⚠️
lib/internal/test_runner/reporter/dot.js 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63661      +/-   ##
==========================================
- Coverage   91.95%   90.31%   -1.64%     
==========================================
  Files         379      732     +353     
  Lines      166454   236652   +70198     
  Branches    25427    44575   +19148     
==========================================
+ Hits       153058   213730   +60672     
- Misses      13104    14617    +1513     
- Partials      292     8305    +8013     
Files with missing lines Coverage Δ
lib/internal/test_runner/harness.js 88.15% <100.00%> (+0.02%) ⬆️
lib/internal/test_runner/reporter/tap.js 95.00% <100.00%> (+0.11%) ⬆️
lib/internal/test_runner/runner.js 94.50% <100.00%> (+0.38%) ⬆️
lib/internal/test_runner/tests_stream.js 92.64% <100.00%> (+0.14%) ⬆️
lib/internal/test_runner/utils.js 65.46% <100.00%> (+0.22%) ⬆️
lib/internal/test_runner/reporter/dot.js 97.72% <80.00%> (-2.28%) ⬇️
lib/internal/test_runner/reporter/utils.js 90.09% <16.66%> (-5.20%) ⬇️
lib/internal/test_runner/test.js 97.72% <97.08%> (-0.13%) ⬇️
lib/internal/test_runner/reporter/junit.js 89.01% <0.00%> (-5.47%) ⬇️

... and 481 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Han5991 added 2 commits May 31, 2026 01:45
Add failing regression tests for nine defects found while reviewing the
flaky feature, each asserting the corrected behavior:

- a flaky + expectFailure test that unexpectedly passes must not retry
- a flaky parent must retry when only a subtest fails
- node.test tracing start/end must stay balanced across retries
- flaky must not propagate from a test to its own subtests
- the MockTracker must be reset between retries
- a test-level after hook must run after the final attempt
- a retried attempt must abort the previous attempt's signal
- an in-flight subtest from a discarded attempt must not corrupt the run
- a flaky + expectFailure error must still reach the node.test channel

Signed-off-by: sangwook <rewq5991@gmail.com>
Fix eight defects in the flaky retry loop, each covered by a regression
test added in the previous commit:

- do not retry a flaky expectFailure test that unexpectedly passes
- retry a flaky parent when a failure surfaces only through a subtest
- publish the tracing channel end per attempt so spans stay balanced
- inherit flaky only from a suite parent, not a test to its subtests
- reset the MockTracker between retries
- run a test-level after hook only after the final attempt
- abort the previous attempt's signal so its work is torn down
- publish a flaky expectFailure throw's error to the node.test channel

The remaining intermediate-attempt subtest output inflation across
retries is the documented known limitation and is unchanged.

Signed-off-by: sangwook <rewq5991@gmail.com>
@Han5991 Han5991 force-pushed the feat/test-runner-flaky branch from 3ff2a6b to a1989d8 Compare May 30, 2026 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants