feat(server): add keepAliveInterval for standalone GET SSE stream#1726
feat(server): add keepAliveInterval for standalone GET SSE stream#1726rcdailey wants to merge 6 commits intomodelcontextprotocol:mainfrom
Conversation
🦋 Changeset detectedLatest commit: 5fcb451 The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
|
@claude review |
941226e to
5a925e1
Compare
|
@claude review |
d29e61f to
6a2fd6f
Compare
The replayEvents() code path (client reconnects with Last-Event-ID) was missing keepalive timer setup, so reconnecting clients would lose keepalive protection and get dropped again at the next proxy idle timeout. Also fixes the cleanup test to actually prove that close() clears the timer by asserting vi.getTimerCount() drops to 0, instead of relying on the catch fallback which would self-clear anyway. Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
|
@claude review |
The replayEvents() code path (client reconnects with Last-Event-ID) was missing keepalive timer setup, so reconnecting clients would lose keepalive protection and get dropped again at the next proxy idle timeout. Also fixes the cleanup test to actually prove that close() clears the timer by asserting vi.getTimerCount() drops to 0, instead of relying on the catch fallback which would self-clear anyway. Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
d83942e to
8af2d12
Compare
|
@claude review |
The replayEvents() code path (client reconnects with Last-Event-ID) was missing keepalive timer setup, so reconnecting clients would lose keepalive protection and get dropped again at the next proxy idle timeout. Also fixes the cleanup test to actually prove that close() clears the timer by asserting vi.getTimerCount() drops to 0, instead of relying on the catch fallback which would self-clear anyway. Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
Adds defensive _clearKeepAliveTimer() before setInterval in handleGetRequest() to prevent duplicate timers when the GET stream is reinitialized. Moves _clearKeepAliveTimer() inside the if(stream) guard in closeStandaloneSSEStream() so it only clears when the standalone stream is being torn down, preventing accidental clearing of a replay stream's timer. Closes modelcontextprotocol#1726
8af2d12 to
ccbc773
Compare
|
@claude review |
|
@felixweinberger The bot is identifying a real limitation (single shared Could you take a look at the current state and let me know if this is mergeable, or if there's anything else you'd like changed? |
Yes will take a look thanks |
Adds an opt-in keepAliveInterval option to WebStandardStreamableHTTPServerTransportOptions that sends periodic SSE comments (`: keepalive`) on the standalone GET SSE stream. Reverse proxies commonly close connections that are idle for 30-60s. With no server-initiated messages, the GET SSE stream has no traffic during quiet periods, causing silent disconnections. This option lets operators send harmless SSE comments at a configurable cadence to keep the connection alive. The timer is cleared on close(), closeStandaloneSSEStream(), and on stream cancellation. Disabled by default; no behavior change for existing deployments. Addresses upstream modelcontextprotocol#28, modelcontextprotocol#876.
The replayEvents() code path (client reconnects with Last-Event-ID) was missing keepalive timer setup, so reconnecting clients would lose keepalive protection and get dropped again at the next proxy idle timeout. Also fixes the cleanup test to actually prove that close() clears the timer by asserting vi.getTimerCount() drops to 0, instead of relying on the catch fallback which would self-clear anyway. Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
Prevent timer handle leaks when concurrent reconnect requests bypass the conflict check. This edge case is possible when EventStore omits the optional getStreamIdForEventId method, allowing duplicate replay attempts to start new timers without cleaning up existing ones.
Adds defensive _clearKeepAliveTimer() before setInterval in handleGetRequest() to prevent duplicate timers when the GET stream is reinitialized. Moves _clearKeepAliveTimer() inside the if(stream) guard in closeStandaloneSSEStream() so it only clears when the standalone stream is being torn down, preventing accidental clearing of a replay stream's timer. Closes modelcontextprotocol#1726
ccbc773 to
1a750b9
Compare
|
Thanks Felix. My latest push is a rebase to resolve conflicts with main; no additional changes were made. |
|
@claude review |
|
I think the bot comment is real, pushed a potential fix with an additional regression test |
There was a problem hiding this comment.
LGTM — the per-stream keepAliveTimer ownership in StreamMapping correctly resolves the cross-stream timer interference I raised last round, and the new concurrent-stream regression test validates it.
Extended reasoning...
Overview
This PR adds an opt-in keepAliveInterval option to WebStandardStreamableHTTPServerTransport that sends periodic SSE comments to prevent reverse-proxy idle-timeout disconnections. Three files are modified: the changeset descriptor, streamableHttp.ts, and its test file.
Security risks
No security-sensitive code is touched. The feature adds outbound SSE comment writes guarded by a try/catch; there is no new input surface, no auth changes, and no data exposure risk.
Level of scrutiny
This is a new opt-in transport-layer feature that required several rounds of iteration. The logic is now self-contained in handleGetRequest and replayEvents, and the final design (per-stream ownership via StreamMapping.keepAliveTimer) is the correct approach for isolating concurrent stream lifetimes.
Other factors
After four rounds of review and fixes, the substantive bugs have all been addressed: replay-path keepalives were added, the closeStandaloneSSEStream guard was corrected, and the shared-timer cross-cancellation hazard was eliminated by moving to per-stream ownership. A dedicated regression test (should maintain independent keepalive timers per concurrent stream) directly exercises the concurrent scenario. The only open item is a JSDoc documentation nit (inline comment already filed) with no runtime impact. The feature is backwards-compatible and fully opt-in.
| /** | ||
| * Interval in milliseconds for sending SSE keepalive comments on the standalone | ||
| * GET SSE stream. When set, the transport sends periodic SSE comments | ||
| * (`: keepalive`) to prevent reverse proxies from closing idle connections. | ||
| * | ||
| * Disabled by default (no keepalive comments are sent). | ||
| */ | ||
| keepAliveInterval?: number; |
There was a problem hiding this comment.
🟡 The JSDoc for keepAliveInterval (line 147) says it sends keepalive comments on the standalone GET SSE stream, but the implementation also starts the timer for reconnecting clients via the Last-Event-ID / replayEvents() path. The description should be updated to reflect that the option applies to all GET-initiated SSE streams, not just the initial standalone one.
Extended reasoning...
What the inaccuracy is
The JSDoc for keepAliveInterval reads: "Interval in milliseconds for sending SSE keepalive comments on the standalone GET SSE stream." This is factually incomplete. The same PR also added keepalive timer support inside replayEvents() (the Last-Event-ID reconnect path), so keepAliveInterval governs keepalive behaviour on both GET stream types.
The specific code evidence
In handleGetRequest() (around line 500 of the diff), when this._keepAliveInterval !== undefined, a setInterval is stored in mapping.keepAliveTimer. In replayEvents() (around line 600), an identical block assigns a local keepAliveTimer and stores it in mapping.keepAliveTimer. Both paths read the same this._keepAliveInterval field. The changeset file .changeset/add-sse-keepalive.md repeats the same partial description: "sends periodic SSE comments on the standalone GET stream".
Why existing code does not prevent confusion
A developer reading only the JSDoc would conclude that reconnecting clients (those sending Last-Event-ID) do not benefit from the keepalive option, and might add their own keepalive logic elsewhere, leading to duplicate timers or misfiled bugs.
Impact
There is no runtime impact -- the code works correctly for both stream types and users actually get more coverage than advertised. The issue is purely documentation: a developer relying on the JSDoc would have an incomplete mental model of the feature scope.
Concrete proof
- Developer reads the JSDoc: "sends SSE keepalive comments on the standalone GET SSE stream."
- Developer sets
keepAliveInterval: 15_000intending to keep all GET-based connections alive. - A client reconnects with
Last-Event-ID-> entersreplayEvents()-> timer is started there too. - Developer is unaware replay streams also have keepalives, potentially adding redundant logic or assuming they are missing.
Addressing the refutation
One verifier argued this is not worth reporting because the behavior is more generous than advertised. While true that no user suffers broken behavior, a JSDoc that describes only half of a new feature's scope is a genuine documentation bug. The fix is a one-line word change and the cost of the inaccuracy (misleading developers) outweighs the cost of correcting it.
How to fix
Change "on the standalone GET SSE stream" to "on all GET-initiated SSE streams (both the initial standalone stream and reconnecting Last-Event-ID streams)". Update the changeset description similarly.
Summary
Adds an opt-in
keepAliveIntervaloption toWebStandardStreamableHTTPServerTransportOptions. When set, the transport sends periodic SSE comments (: keepalive\n\n) on the standalone GET SSE stream to reset proxy idle timers and prevent silent disconnections.SSE comments are ignored by spec-compliant clients but keep the connection alive through reverse proxies (nginx, Envoy, HAProxy, cloud load balancers) that enforce idle timeouts.
Usage
Omitting the option preserves existing behavior (no keepalive, fully backwards compatible).
What changed
keepAliveInterval?: numberadded to the options interface (milliseconds, disabled by default)handleGetRequeststarts asetIntervalthat enqueues: keepalive\n\nto the stream controllerclose(), andcloseStandaloneSSEStream()NodeStreamableHTTPServerTransportwrapper picks this up automatically (it type-aliases and passes through the options)Related issues
Ref #28 (server-side ping automation, P1)
Ref #876 (SSE connections drop after ~5 min idle behind proxies)
This is complementary to protocol-level ping approaches (like PR #1717). SSE comments operate at the transport layer and don't require JSON-RPC round-trips, making them a lightweight way to keep the connection alive independently of protocol-level health checks.