Skip to content

feat(server): add keepAliveInterval for standalone GET SSE stream#1726

Open
rcdailey wants to merge 6 commits intomodelcontextprotocol:mainfrom
rcdailey:feat/sse-keepalive-interval
Open

feat(server): add keepAliveInterval for standalone GET SSE stream#1726
rcdailey wants to merge 6 commits intomodelcontextprotocol:mainfrom
rcdailey:feat/sse-keepalive-interval

Conversation

@rcdailey
Copy link
Copy Markdown

Summary

Adds an opt-in keepAliveInterval option to WebStandardStreamableHTTPServerTransportOptions. When set, the transport sends periodic SSE comments (: keepalive\n\n) on the standalone GET SSE stream to reset proxy idle timers and prevent silent disconnections.

SSE comments are ignored by spec-compliant clients but keep the connection alive through reverse proxies (nginx, Envoy, HAProxy, cloud load balancers) that enforce idle timeouts.

Usage

const transport = new WebStandardStreamableHTTPServerTransport({
    sessionIdGenerator: () => crypto.randomUUID(),
    keepAliveInterval: 15_000 // send keepalive every 15 seconds
});

Omitting the option preserves existing behavior (no keepalive, fully backwards compatible).

What changed

  • keepAliveInterval?: number added to the options interface (milliseconds, disabled by default)
  • handleGetRequest starts a setInterval that enqueues : keepalive\n\n to the stream controller
  • Timer is cleared on client disconnect, close(), and closeStandaloneSSEStream()
  • Controller enqueue is guarded against closed/errored state
  • The NodeStreamableHTTPServerTransport wrapper picks this up automatically (it type-aliases and passes through the options)
  • Three new tests covering: comments sent when enabled, no comments when disabled, cleanup on close

Related issues

Ref #28 (server-side ping automation, P1)
Ref #876 (SSE connections drop after ~5 min idle behind proxies)

This is complementary to protocol-level ping approaches (like PR #1717). SSE comments operate at the transport layer and don't require JSON-RPC round-trips, making them a lightweight way to keep the connection alive independently of protocol-level health checks.

@rcdailey rcdailey requested a review from a team as a code owner March 21, 2026 18:02
@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Mar 21, 2026

🦋 Changeset detected

Latest commit: 5fcb451

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@modelcontextprotocol/server Minor
@modelcontextprotocol/express Major
@modelcontextprotocol/fastify Major
@modelcontextprotocol/hono Major
@modelcontextprotocol/node Major

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Mar 21, 2026

Open in StackBlitz

@modelcontextprotocol/client

npm i https://pkg.pr.new/@modelcontextprotocol/client@1726

@modelcontextprotocol/server

npm i https://pkg.pr.new/@modelcontextprotocol/server@1726

@modelcontextprotocol/express

npm i https://pkg.pr.new/@modelcontextprotocol/express@1726

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/@modelcontextprotocol/fastify@1726

@modelcontextprotocol/hono

npm i https://pkg.pr.new/@modelcontextprotocol/hono@1726

@modelcontextprotocol/node

npm i https://pkg.pr.new/@modelcontextprotocol/node@1726

commit: 5fcb451

@felixweinberger
Copy link
Copy Markdown
Contributor

@claude review

@felixweinberger felixweinberger force-pushed the feat/sse-keepalive-interval branch from 941226e to 5a925e1 Compare March 31, 2026 10:31
Copy link
Copy Markdown
Contributor

@felixweinberger felixweinberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up - I think this is a good direction.

Couple of inline comments.

@km-anthropic
Copy link
Copy Markdown

@claude review

@rcdailey rcdailey force-pushed the feat/sse-keepalive-interval branch from d29e61f to 6a2fd6f Compare March 31, 2026 23:39
rcdailey added a commit to rcdailey/typescript-sdk that referenced this pull request Mar 31, 2026
The replayEvents() code path (client reconnects with Last-Event-ID) was

missing keepalive timer setup, so reconnecting clients would lose

keepalive protection and get dropped again at the next proxy idle

timeout.

Also fixes the cleanup test to actually prove that close() clears the

timer by asserting vi.getTimerCount() drops to 0, instead of relying

on the catch fallback which would self-clear anyway.

Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
@felixweinberger
Copy link
Copy Markdown
Contributor

@claude review

rcdailey added a commit to rcdailey/typescript-sdk that referenced this pull request Apr 1, 2026
The replayEvents() code path (client reconnects with Last-Event-ID) was

missing keepalive timer setup, so reconnecting clients would lose

keepalive protection and get dropped again at the next proxy idle

timeout.

Also fixes the cleanup test to actually prove that close() clears the

timer by asserting vi.getTimerCount() drops to 0, instead of relying

on the catch fallback which would self-clear anyway.

Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
@rcdailey rcdailey force-pushed the feat/sse-keepalive-interval branch from d83942e to 8af2d12 Compare April 1, 2026 12:39
@felixweinberger
Copy link
Copy Markdown
Contributor

@claude review

rcdailey added a commit to rcdailey/typescript-sdk that referenced this pull request Apr 2, 2026
The replayEvents() code path (client reconnects with Last-Event-ID) was

missing keepalive timer setup, so reconnecting clients would lose

keepalive protection and get dropped again at the next proxy idle

timeout.

Also fixes the cleanup test to actually prove that close() clears the

timer by asserting vi.getTimerCount() drops to 0, instead of relying

on the catch fallback which would self-clear anyway.

Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
rcdailey added a commit to rcdailey/typescript-sdk that referenced this pull request Apr 2, 2026
Adds defensive _clearKeepAliveTimer() before setInterval in
handleGetRequest() to prevent duplicate timers when the GET stream is
reinitialized.

Moves _clearKeepAliveTimer() inside the if(stream) guard in
closeStandaloneSSEStream() so it only clears when the standalone stream
is being torn down, preventing accidental clearing of a replay stream's
timer.

Closes modelcontextprotocol#1726
@rcdailey rcdailey force-pushed the feat/sse-keepalive-interval branch from 8af2d12 to ccbc773 Compare April 2, 2026 01:09
@felixweinberger
Copy link
Copy Markdown
Contributor

@claude review

@rcdailey
Copy link
Copy Markdown
Author

rcdailey commented Apr 2, 2026

@felixweinberger The bot is identifying a real limitation (single shared _keepAliveTimer field with concurrent streams), but it's the same root cause from last round at a different angle. Properly fixing it means per-stream timer ownership, which is a bigger refactor beyond this PR's scope.

Could you take a look at the current state and let me know if this is mergeable, or if there's anything else you'd like changed?

@felixweinberger
Copy link
Copy Markdown
Contributor

@felixweinberger The bot is identifying a real limitation (single shared _keepAliveTimer field with concurrent streams), but it's the same root cause from last round at a different angle. Properly fixing it means per-stream timer ownership, which is a bigger refactor beyond this PR's scope.

Could you take a look at the current state and let me know if this is mergeable, or if there's anything else you'd like changed?

Yes will take a look thanks

rcdailey added 4 commits April 2, 2026 08:12
Adds an opt-in keepAliveInterval option to
WebStandardStreamableHTTPServerTransportOptions that sends periodic
SSE comments (`: keepalive`) on the standalone GET SSE stream.

Reverse proxies commonly close connections that are idle for 30-60s.
With no server-initiated messages, the GET SSE stream has no traffic
during quiet periods, causing silent disconnections. This option lets
operators send harmless SSE comments at a configurable cadence to
keep the connection alive.

The timer is cleared on close(), closeStandaloneSSEStream(), and on
stream cancellation. Disabled by default; no behavior change for
existing deployments.

Addresses upstream modelcontextprotocol#28, modelcontextprotocol#876.
The replayEvents() code path (client reconnects with Last-Event-ID) was

missing keepalive timer setup, so reconnecting clients would lose

keepalive protection and get dropped again at the next proxy idle

timeout.

Also fixes the cleanup test to actually prove that close() clears the

timer by asserting vi.getTimerCount() drops to 0, instead of relying

on the catch fallback which would self-clear anyway.

Addresses PR review feedback from @felixweinberger on PR modelcontextprotocol#1726.
Prevent timer handle leaks when concurrent reconnect requests bypass
the conflict check. This edge case is possible when EventStore omits the
optional getStreamIdForEventId method, allowing duplicate replay
attempts to start new timers without cleaning up existing ones.
Adds defensive _clearKeepAliveTimer() before setInterval in
handleGetRequest() to prevent duplicate timers when the GET stream is
reinitialized.

Moves _clearKeepAliveTimer() inside the if(stream) guard in
closeStandaloneSSEStream() so it only clears when the standalone stream
is being torn down, preventing accidental clearing of a replay stream's
timer.

Closes modelcontextprotocol#1726
@rcdailey rcdailey force-pushed the feat/sse-keepalive-interval branch from ccbc773 to 1a750b9 Compare April 2, 2026 13:13
@rcdailey
Copy link
Copy Markdown
Author

rcdailey commented Apr 2, 2026

Thanks Felix. My latest push is a rebase to resolve conflicts with main; no additional changes were made.

@felixweinberger
Copy link
Copy Markdown
Contributor

@claude review

@felixweinberger
Copy link
Copy Markdown
Contributor

I think the bot comment is real, pushed a potential fix with an additional regression test

Copy link
Copy Markdown

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — the per-stream keepAliveTimer ownership in StreamMapping correctly resolves the cross-stream timer interference I raised last round, and the new concurrent-stream regression test validates it.

Extended reasoning...

Overview

This PR adds an opt-in keepAliveInterval option to WebStandardStreamableHTTPServerTransport that sends periodic SSE comments to prevent reverse-proxy idle-timeout disconnections. Three files are modified: the changeset descriptor, streamableHttp.ts, and its test file.

Security risks

No security-sensitive code is touched. The feature adds outbound SSE comment writes guarded by a try/catch; there is no new input surface, no auth changes, and no data exposure risk.

Level of scrutiny

This is a new opt-in transport-layer feature that required several rounds of iteration. The logic is now self-contained in handleGetRequest and replayEvents, and the final design (per-stream ownership via StreamMapping.keepAliveTimer) is the correct approach for isolating concurrent stream lifetimes.

Other factors

After four rounds of review and fixes, the substantive bugs have all been addressed: replay-path keepalives were added, the closeStandaloneSSEStream guard was corrected, and the shared-timer cross-cancellation hazard was eliminated by moving to per-stream ownership. A dedicated regression test (should maintain independent keepalive timers per concurrent stream) directly exercises the concurrent scenario. The only open item is a JSDoc documentation nit (inline comment already filed) with no runtime impact. The feature is backwards-compatible and fully opt-in.

Comment on lines +146 to +153
/**
* Interval in milliseconds for sending SSE keepalive comments on the standalone
* GET SSE stream. When set, the transport sends periodic SSE comments
* (`: keepalive`) to prevent reverse proxies from closing idle connections.
*
* Disabled by default (no keepalive comments are sent).
*/
keepAliveInterval?: number;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The JSDoc for keepAliveInterval (line 147) says it sends keepalive comments on the standalone GET SSE stream, but the implementation also starts the timer for reconnecting clients via the Last-Event-ID / replayEvents() path. The description should be updated to reflect that the option applies to all GET-initiated SSE streams, not just the initial standalone one.

Extended reasoning...

What the inaccuracy is

The JSDoc for keepAliveInterval reads: "Interval in milliseconds for sending SSE keepalive comments on the standalone GET SSE stream." This is factually incomplete. The same PR also added keepalive timer support inside replayEvents() (the Last-Event-ID reconnect path), so keepAliveInterval governs keepalive behaviour on both GET stream types.

The specific code evidence

In handleGetRequest() (around line 500 of the diff), when this._keepAliveInterval !== undefined, a setInterval is stored in mapping.keepAliveTimer. In replayEvents() (around line 600), an identical block assigns a local keepAliveTimer and stores it in mapping.keepAliveTimer. Both paths read the same this._keepAliveInterval field. The changeset file .changeset/add-sse-keepalive.md repeats the same partial description: "sends periodic SSE comments on the standalone GET stream".

Why existing code does not prevent confusion

A developer reading only the JSDoc would conclude that reconnecting clients (those sending Last-Event-ID) do not benefit from the keepalive option, and might add their own keepalive logic elsewhere, leading to duplicate timers or misfiled bugs.

Impact

There is no runtime impact -- the code works correctly for both stream types and users actually get more coverage than advertised. The issue is purely documentation: a developer relying on the JSDoc would have an incomplete mental model of the feature scope.

Concrete proof

  1. Developer reads the JSDoc: "sends SSE keepalive comments on the standalone GET SSE stream."
  2. Developer sets keepAliveInterval: 15_000 intending to keep all GET-based connections alive.
  3. A client reconnects with Last-Event-ID -> enters replayEvents() -> timer is started there too.
  4. Developer is unaware replay streams also have keepalives, potentially adding redundant logic or assuming they are missing.

Addressing the refutation

One verifier argued this is not worth reporting because the behavior is more generous than advertised. While true that no user suffers broken behavior, a JSDoc that describes only half of a new feature's scope is a genuine documentation bug. The fix is a one-line word change and the cost of the inaccuracy (misleading developers) outweighs the cost of correcting it.

How to fix

Change "on the standalone GET SSE stream" to "on all GET-initiated SSE streams (both the initial standalone stream and reconnecting Last-Event-ID streams)". Update the changeset description similarly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants