test(kernel-node-runtime): Fix flaky remote-comms E2E tests #927
Merged
Increase ackTimeoutMs from 2s to 5s for two tests whose RemoteHandle was giving up (after 4 × ackTimeoutMs = 8s) before kernel2 could restart and reconnect through the relay.

Also restructure the "resolves promise after reconnection" test to establish the connection before the disconnect/reconnect cycle, eliminating a URL redemption race condition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Coverage Report
File coverage: No changed files found.
Description
Two E2E tests in `remote-comms.test.ts` were consistently flaky on `main`, failing more often than passing:

URL redemption timed out after 8000ms

Root cause: Both tests used `testBackoffOptions` with `ackTimeoutMs: 2_000`. The `RemoteHandle` gives up after `ackTimeoutMs × (MAX_RETRIES + 1)` = 8s, which is too short for kernel2 to restart (new DB connection + kernel init + libp2p + relay connection + vat launch).

Changes
- Increase `ackTimeoutMs` to `5_000` for both tests, giving a 20s give-up window instead of 8s
- Introduce per-test option objects (`queueTestOptions`, `reconnectOptions`) for clarity and consistency between `setupAliceAndBob` and `restartKernelAndReloadVat` calls
- Remove `delay()` calls that were masking the underlying timing issue

Testing
Both tests pass reliably after the fix. Verified by running the full `remote-comms.test.ts` E2E suite twice: all 67 E2E tests pass across all 11 test files (including both previously-flaky tests).

🤖 Generated with Claude Code
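The give-up window arithmetic from the root cause can be sketched as follows. This is an illustration only, not the repository's code: `giveUpWindowMs` is a hypothetical helper, and `MAX_RETRIES = 3` is inferred from the stated 8s window at `ackTimeoutMs: 2_000`.

```typescript
// Assumption: MAX_RETRIES = 3, inferred from ackTimeoutMs × (MAX_RETRIES + 1) = 8s at 2s.
const MAX_RETRIES = 3;

/**
 * Hypothetical helper: total time before the handle gives up,
 * counting one initial attempt plus `maxRetries` retries.
 */
function giveUpWindowMs(ackTimeoutMs: number, maxRetries: number = MAX_RETRIES): number {
  return ackTimeoutMs * (maxRetries + 1);
}

const windowBefore = giveUpWindowMs(2_000); // 8000 ms
const windowAfter = giveUpWindowMs(5_000); // 20000 ms
console.log(`give-up window: ${windowBefore}ms before, ${windowAfter}ms after`);
```

Under these assumptions the change raises the window from 8s to 20s, which is the margin the description says kernel2 needs to restart and reconnect.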
Note
Low Risk
Low risk: changes are limited to E2E test timing/configuration and message sequencing; main risk is masking real regressions by increasing timeouts or altering test ordering.
Overview
Stabilizes `remote-comms.test.ts` E2E coverage by introducing per-test option objects that override `ackTimeoutMs` (to `5_000`) and related rate-limit settings, and then using those same options consistently across `setupAliceAndBob` and `restartKernelAndReloadVat`.

Reworks the "reconnect without exhausting retries" test to first establish a connection (complete URL redemption), then stop the remote kernel and queue the message while it’s definitely offline, removing race-prone delays and making the queued-message reconnection behavior deterministic.
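A minimal sketch of the per-test options pattern described above, assuming the options are plain objects. The `BackoffOptions` shape, the `maxRetries` field, and the `recordCall` stub are hypothetical stand-ins, not the repository's types or helpers. Only the names `testBackoffOptions`, `reconnectOptions`, and `ackTimeoutMs` come from the PR.

```typescript
// Hypothetical option shape; the real type in the repo may carry more fields.
type BackoffOptions = { ackTimeoutMs: number; maxRetries: number };

// Stand-in for the shared defaults (maxRetries value is an assumption).
const testBackoffOptions: BackoffOptions = { ackTimeoutMs: 2_000, maxRetries: 3 };

// Per-test override: copy the defaults and change only the ack timeout.
const reconnectOptions: BackoffOptions = { ...testBackoffOptions, ackTimeoutMs: 5_000 };

// Hypothetical stub standing in for setupAliceAndBob / restartKernelAndReloadVat:
// it just records which options each call received.
const seenOptions: BackoffOptions[] = [];
function recordCall(options: BackoffOptions): void {
  seenOptions.push(options);
}

recordCall(reconnectOptions); // initial setup
recordCall(reconnectOptions); // restart after the disconnect

// Both phases of the test see the same timeout, and the shared defaults are untouched.
console.log(seenOptions.every((o) => o.ackTimeoutMs === 5_000), testBackoffOptions.ackTimeoutMs);
```

Passing one named object to both calls avoids the case where setup and restart silently run with different timeouts.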
Reviewed by Cursor Bugbot for commit 11d93ec. Bugbot is set up for automated code reviews on this repo.