fix: lazy_load silently breaking queues without stream_populate #400

Open
andheiberg wants to merge 1 commit into main from fix-lazy-load-with-non-streaming-queue
@andheiberg

When lazy_load=true but the queue does not implement stream_populate (e.g. CI::Queue::Bisect), load_tests skipped requiring the test files, which left Minitest.loaded_tests empty. populate_queue then called queue.populate([]), so Bisect#failing_test_present? always returned nil and the bisect runner exited with "The failing test does not exist." even though the failing test was valid and runnable.

Fix: only skip eager file loading when the queue actually supports stream_populate. When it does not, fall through to the eager path so Minitest.loaded_tests is populated before populate is called.
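
The guard described above can be illustrated with a minimal sketch. It is not the code from this PR: the queue classes, `lazy_path?`, the `loader:` argument, and `FAKE_LOADER` are hypothetical stand-ins, and using `respond_to?(:stream_populate)` as the capability check is an assumption about how "supports stream_populate" might be detected.

```ruby
# Hypothetical stand-ins for illustration; the real ci-queue classes differ.
class StreamingQueue
  attr_reader :populated_with

  # A streaming queue receives file paths and loads tests on its own.
  def stream_populate(files)
    @populated_with = files.map { |f| "streamed:#{f}" }
  end
end

class BisectLikeQueue
  attr_reader :populated_with

  # Only eager population is available (no stream_populate), mirroring
  # the CI::Queue::Bisect case described in the PR text.
  def populate(tests)
    @populated_with = tests
  end
end

# lazy_load alone is not enough to skip eager loading: the queue must
# also support stream_populate. (respond_to? is an assumed check.)
def lazy_path?(lazy_load, queue)
  lazy_load && queue.respond_to?(:stream_populate)
end

# loader is a hypothetical callable that "requires" a file and returns
# the test names it defines.
def populate_queue(queue, files, lazy_load:, loader:)
  if lazy_path?(lazy_load, queue)
    queue.stream_populate(files)
  else
    # Eager path: load every file first so the test list is non-empty
    # before populate is called.
    queue.populate(files.flat_map { |f| loader.call(f) })
  end
  queue
end

# Fake loader: pretends each file defines exactly one test.
FAKE_LOADER = ->(file) { ["#{File.basename(file, '.rb')}#test_example"] }
```

With this shape, a queue without stream_populate receives a populated test list even when lazy_load is true, so a lookup such as failing_test_present? has something to search.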

@andheiberg andheiberg self-assigned this Apr 23, 2026
@andheiberg andheiberg changed the title Fix CI_QUEUE_LAZY_LOAD silently breaking queues without stream_populate fix: CI_QUEUE_LAZY_LOAD silently breaking queues without stream_populate Apr 23, 2026
@andheiberg andheiberg changed the title fix: CI_QUEUE_LAZY_LOAD silently breaking queues without stream_populate fix: lazy_load silently breaking queues without stream_populate Apr 23, 2026
robinmoneybird pushed a commit to robinmoneybird/ci-queue that referenced this pull request Jun 16, 2026
Bisect currently exits 1 in two cases that represent successful
diagnostic runs rather than failures:

  1. "The bisection was inconclusive, there might not be any leaky
     test here." - the bisect ran, narrowed candidates, and the
     failure did not reproduce on the final narrowed order. This is
     the expected outcome for genuinely flaky tests (timing, async,
     network) rather than order-dependent ones.

  2. "The failing test was the first test in the test order so there
     is nothing to bisect." - the bisect ran and reported that there
     are no preceding tests to suspect.

Both are valid findings produced by a successful run. Treating them as
build failures forces callers (e.g. CI pipelines that invoke bisect on
flaky tests) to permanently sit at a low pass rate even though bisect
is doing exactly what it is supposed to do, and makes it impossible to
distinguish these outcomes from real harness failures like the one
fixed in Shopify#400 (lazy_load leaving Minitest.loaded_tests empty so the
failing test could not be found).

After this change:

  exit 0 - bisect ran to completion (polluter found, inconclusive,
           failing test failed in isolation, or nothing to bisect)
  exit 1 - bisect could not run (FAILING_TEST not present in the
           queue, missing arguments, etc.)

Tests are updated to assert the contract for each scenario.
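
As a rough illustration of the exit code contract listed above, the mapping could be written as follows. The outcome symbols and the `bisect_exit_code` helper are hypothetical names chosen for this sketch; the commit message specifies only which situations exit 0 and which exit 1.

```ruby
# Hypothetical outcome names; only the 0/1 split comes from the commit message.
COMPLETED_OUTCOMES = %i[
  polluter_found
  inconclusive
  failed_in_isolation
  nothing_to_bisect
].freeze

COULD_NOT_RUN_OUTCOMES = %i[
  failing_test_missing
  missing_arguments
].freeze

# Returns 0 when bisect ran to completion, 1 when it could not run.
def bisect_exit_code(outcome)
  return 0 if COMPLETED_OUTCOMES.include?(outcome)
  return 1 if COULD_NOT_RUN_OUTCOMES.include?(outcome)

  # Unknown outcomes are rejected rather than guessed at in this sketch.
  raise ArgumentError, "unknown bisect outcome: #{outcome.inspect}"
end
```

A caller such as a CI pipeline can then treat a non-zero status as a harness problem rather than as a diagnostic result.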