Implement Lambda runtime init error reporting #103

Open
joe4dev wants to merge 25 commits into localstack from devx-1-implement-lambda-runtime-init-error-reporting
@joe4dev joe4dev commented Jun 3, 2026

Member

Summary

When a Lambda runtime exits unexpectedly or fails during initialization, LocalStack previously received no structured callback and waited until the environment timeout, surfacing a generic error. This PR makes the Runtime Interface Emulator (RIE) detect init failures and the init-phase timeout and report them to LocalStack the way AWS does:

  • On-demand functions fold a failed cold-start init into the first invocation (AWS "suppressed init"): the init failure is not reported separately; instead the function signals ready and the first invoke surfaces the error with the full INIT_REPORT/START/END/REPORT log envelope.
  • Provisioned concurrency / Managed Instances report init failures at provisioning time via /status/error, failing the provisioning operation.

All synthetic log lines (START, INIT_REPORT, and the REPORT Init Duration / Status / Error Type fields) are rendered from rapidcore's native lifecycle events, so they land at AWS-faithful points (e.g. after an inline suppressed init's own logs) and carry rapid's authoritative durations.

This is the RIE side; the integration tests and the extended-init timeout wiring live in localstack/localstack-pro#7293.

Design

The implementation is deliberately confined to the LocalStack fork's own package, cmd/localstack/. No upstream files under internal/ are modified (the only change there is documentation in the LocalStack-only internal/lsapi package), so it never conflicts when rebasing onto upstream aws-lambda-runtime-interface-emulator. It hooks two upstream extension points, using their exported APIs only:

  • EventsAPI: LocalStackEventsAPI embeds the upstream NoOpEventsAPI (so no platform event is retained in memory) and rides rapid's per-init and per-invoke lifecycle callbacks (SendInitStart, SendInitRuntimeDone, SendInitReport, and SendInvokeStart) to render the AWS log lines and record the init outcome.
  • rapidcore.Server: driven only through AwaitInitialized(), Reset(), and GetCurrentInvokeID(); the init-phase timeout is a bounded await in main.go.
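
The embedding pattern behind the first extension point can be sketched as follows. This is a minimal illustration rather than the fork's code: `eventsAPI`, `noOpEvents`, and `localStackEvents` are hypothetical, heavily reduced stand-ins for the upstream events interface, NoOpEventsAPI, and LocalStackEventsAPI, and the method signatures are simplified assumptions.

```go
package main

import "fmt"

// eventsAPI is a hypothetical, reduced stand-in for the upstream events
// interface; the real one has many more hooks and richer arguments.
type eventsAPI interface {
	SendInitStart()
	SendInvokeStart(requestID, version string)
	SendFault(msg string)
}

// noOpEvents implements every hook as a no-op and retains nothing in memory.
type noOpEvents struct{}

func (noOpEvents) SendInitStart()                 {}
func (noOpEvents) SendInvokeStart(string, string) {}
func (noOpEvents) SendFault(string)               {}

// localStackEvents embeds the no-op implementation and overrides only the
// hooks it cares about; all other hooks fall through to the embedded no-ops.
type localStackEvents struct {
	noOpEvents
	lines []string // rendered log lines (a real implementation would write to a log sink)
}

// SendInvokeStart renders the synthetic START line at the invoke-start event.
func (e *localStackEvents) SendInvokeStart(requestID, version string) {
	e.lines = append(e.lines, fmt.Sprintf("START RequestId: %s Version: %s", requestID, version))
}

func main() {
	ev := &localStackEvents{}
	var api eventsAPI = ev // the partial override still satisfies the full interface
	api.SendInitStart()    // inherited no-op, records nothing
	api.SendInvokeStart("req-1", "$LATEST")
	fmt.Println(ev.lines[0])
}
```

Because the embedded type supplies every method, the override set can stay small and new upstream hooks only require changes when LocalStack wants to react to them.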

Changes

Detailed changes

cmd/localstack/events.go — new (LocalStackEventsAPI)

Embeds NoOpEventsAPI (the upstream sandbox default) and rides rapid's lifecycle events:

  • SendInvokeStart emits the synthetic START RequestId: ... Version: ... line. rapid fires this after any inline (suppressed) init and before the runtime handles the invocation, so a re-run init's logs correctly land before START, matching AWS — no custom sequencing needed.
  • SendInitStart / SendInitRuntimeDone / SendInitReport track each init attempt's status and scrubbed fatal error type. SendInitReport renders the INIT_REPORT line for failed or timed-out inits (with a Status and, for failures, an Error Type), and a bare INIT_REPORT line carrying only the duration for successful provisioned-concurrency / Managed Instances inits (those environments init ahead of time, so the duration cannot fold into a first invocation). It uses rapid's authoritative duration and phase (init for the eager cold-start init, invoke for a suppressed init folded into an invocation). A successful on-demand cold-start emits no INIT_REPORT; it instead buffers its duration for the first invocation's REPORT Init Duration.
  • An init that died before the runtime was started (empty RuntimeDone status — e.g. an extension/bootstrap failure) is treated as an error and falls back to the Runtime.ExitError type.
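
The INIT_REPORT rendering matrix described above can be sketched like this. It is an illustration under assumptions: the `initReport` type and `render` function are hypothetical names, and the exact field separators (tabs here) are assumed rather than taken from the PR.

```go
package main

import "fmt"

// initReport holds the values carried by one (hypothetical) init-report event.
type initReport struct {
	DurationMS float64
	Phase      string // "init" for the eager cold-start init, "invoke" for a suppressed init
	Status     string // "" for success, otherwise "error" or "timeout"
	ErrorType  string // only rendered when Status is "error"
}

// render returns the INIT_REPORT line to emit, or "" when no line is emitted.
func render(r initReport, onDemand bool) string {
	switch {
	case r.Status != "":
		// Failed or timed-out init: duration, phase, status, and (for failures) the error type.
		line := fmt.Sprintf("INIT_REPORT Init Duration: %.2f ms\tPhase: %s\tStatus: %s",
			r.DurationMS, r.Phase, r.Status)
		if r.Status == "error" && r.ErrorType != "" {
			line += "\tError Type: " + r.ErrorType
		}
		return line
	case !onDemand:
		// Successful provisioned-concurrency / Managed Instances init: bare duration only.
		return fmt.Sprintf("INIT_REPORT Init Duration: %.2f ms", r.DurationMS)
	default:
		// Successful on-demand cold start: nothing here; the duration goes into REPORT.
		return ""
	}
}

func main() {
	fmt.Println(render(initReport{DurationMS: 10000, Phase: "init", Status: "timeout"}, true))
	fmt.Println(render(initReport{DurationMS: 8.5, Phase: "invoke", Status: "error", ErrorType: "Runtime.ExitError"}, true))
	fmt.Println(render(initReport{DurationMS: 120}, false))
}
```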

State accessors used by the invoke handler and main.go:

  • SetInitPhaseTimedOut() — marks the in-flight init as timed out so its INIT_REPORT renders Status: timeout (not the generic reset error); also discards a cold-start duration recorded by an init that completed concurrently with the timeout decision.
  • TakeColdStartInitDuration() — returns rapid's measured cold-start Init duration at most once (it belongs to the first invocation's REPORT only); false on warm starts, after failed/timed-out inits, and for non-on-demand environments.
  • InitErrorType() — the scrubbed fatal error type of the most recent failed init attempt. It is reset per attempt on SendInitStart: no cross-attempt stickiness is needed because every invocation into a failed-init environment starts a fresh suppressed Init phase (rapidcore shuts the runtime down after an init failure, so the next FastInvoke re-inits), and each failing attempt re-records its type via its own SendInitReport. A successful re-run leaves it empty, so a recovered environment (transient init failure) is not permanently tainted, and a fatal invocation after a recovered init is no longer mislabeled with the original init error.

cmd/localstack/main.go — modified

  • Creates LocalStackEventsAPI upfront and passes it into NewCustomInteropServer (which wires it into the sandbox via SetEventsAPI and constructs the LocalStackAdapter internally).
  • Classifies onDemand from AWS_LAMBDA_INITIALIZATION_TYPE (default on-demand). SnapStart environments are intentionally classified on-demand: LocalStack initializes them lazily at the first invoke (not at version publish), so the fold-into-invoke model applies to them too.
  • Bounded init await: runs delegate.AwaitInitialized() on a goroutine and selects it against a timer set to LOCALSTACK_INIT_PHASE_TIMEOUT (default 10 s, validated > 0, and unset from the function's environment via UnsetLsEnvs for AWS parity).
  • On timeout:
    • On-demand: calls SetInitPhaseTimedOut(), Resets the in-progress init so rapidcore re-runs a fresh Init phase on the first invoke (suppressed init), then waits for the awaiting goroutine to drain the aborted init's failure notification before signaling ready. This drain is load-bearing twice over: it stops rapidcore from caching a generic Sandbox.Failure placeholder (empty payload) that would mask the real error if the suppressed re-run crashes without /init/error, and it orders the goroutine's cleanup (Server.Release) before the ready signal so it cannot cancel the first invoke's fresh reservation.
    • PC / Managed Instances: there is no suppressed-init retry; it calls ReportInitFailure(Sandbox.Timedout, …) and exits, failing provisioning.
  • On init failure:
    • On-demand with ErrInitDoneFailed: signals ready and keeps the process alive so the first invoke surfaces the cached error (the events API has already rendered the INIT_REPORT(phase=init) line and recorded the type).
    • ErrInitResetReceived (external reset, e.g. hot reload): exits silently.
    • Otherwise (PC/MI): calls ReportInitFailure with the recorded error type (fallback Runtime.ExitError) and exits.
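
The bounded init await and the drain-after-reset ordering can be sketched as below. This is a simplified model, not the code in main.go: `boundedAwait` and `errInitReset` are hypothetical names, the `await` closure stands in for delegate.AwaitInitialized(), and closing the `reset` channel stands in for resetting the in-progress init.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errInitReset is a hypothetical stand-in for the error an aborted init returns.
var errInitReset = errors.New("init reset")

// boundedAwait runs await on its own goroutine and races it against a timer.
// If await finishes first, its result is returned with timedOut=false and a
// nil channel. On timeout it returns timedOut=true plus a channel that will
// deliver the goroutine's result, so the caller can drain it after a reset.
func boundedAwait(await func() error, timeout time.Duration) (error, bool, <-chan error) {
	ch := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() { ch <- await() }()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case err := <-ch:
		return err, false, nil
	case <-timer.C:
		return nil, true, ch
	}
}

func main() {
	reset := make(chan struct{})
	slowInit := func() error {
		<-reset // blocks until the in-progress init is reset
		return errInitReset
	}

	_, timedOut, pending := boundedAwait(slowInit, 20*time.Millisecond)
	if timedOut {
		close(reset)         // stand-in for resetting the in-progress init
		drained := <-pending // drain the aborted init's result before signaling ready
		fmt.Println("timed out; drained:", drained)
	}
}
```

Receiving from `pending` after the reset is what orders the awaiting goroutine's completion before the ready signal in this model.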

cmd/localstack/custom_interop.go — modified

  • New CustomInteropServer fields: eventsAPI and initErrorPayload atomic.Value (stashes the runtime's /init/error payload so it can be forwarded verbatim later).
  • NewCustomInteropServer now takes the shared *LocalStackEventsAPI and constructs the *LocalStackAdapter internally (it is reached elsewhere through interopServer.localStackAdapter).
  • LocalStackAdapter gets a single post() helper that closes the response body and fails on non-2xx (e.g. LocalStack rejects a duplicate /status/error with 400, previously treated as success). SendStatus / SendLogs / SendResult route through it — this also fixes connection leaks from previously-unclosed bodies.
  • Invoke handler (REPORT / error semantics):
    • START is no longer written eagerly here (now emitted from the events API at rapid's invoke-start point).
    • Init Duration is taken from TakeColdStartInitDuration() — only the first invocation into a successfully initialized on-demand environment carries it, using rapid's measured duration.
    • On ErrInvokeTimeout: ErrorType: "Sandbox.Timedout", message "RequestId: <id> Error: Task timed out after N.00 seconds", and REPORT Status: timeout.
    • On ErrInvokeDoneFailed with a recorded init error type (InitErrorType()): REPORT gains Status: error Error Type: <scrubbed type>. The error type is derived per-attempt, so a recovered environment is not tainted by an earlier init failure.
  • SendInitErrorResponse now stashes the runtime's payload and delegates (returning the delegate's error, which the /runtime/init/error handler uses to render an interop error to the runtime, e.g. ErrResponseSent during a suppressed init) — it no longer POSTs to LocalStack directly.
  • ReportInitFailure(errType, message) — the single /status/error report site, called only from main.go for provisioning-time failures. Forwards the stashed runtime payload when present, otherwise synthesizes "RequestId: <id> Error: <msg>".
  • adaptInitErrorPayload — decodes the runtime's /init/error payload into a map[string]any (not a typed struct) and injects requestId, preserving all other fields verbatim — in particular an empty-but-present "stackTrace": [] (e.g. Runtime.HandlerNotFound) that omitempty would drop on re-marshal. (AWS includes a blank requestId in runtime-reported init-error payloads but not in platform-synthesized ones — the two ReportInitFailure paths deliberately differ here; documented on lsapi.ErrorResponse.)
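
The reason for decoding into a map rather than a typed struct can be reproduced with the standard encoding/json package. The sketch below is illustrative: `typedError`, `viaStruct`, `viaMap`, and `hasField` are hypothetical names, and the sample error message is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typedError shows the pitfall: omitempty drops an empty-but-present stackTrace.
type typedError struct {
	ErrorMessage string   `json:"errorMessage"`
	ErrorType    string   `json:"errorType"`
	RequestID    string   `json:"requestId"`
	StackTrace   []string `json:"stackTrace,omitempty"`
}

// viaStruct round-trips the payload through the typed struct.
func viaStruct(payload []byte, requestID string) ([]byte, error) {
	var e typedError
	if err := json.Unmarshal(payload, &e); err != nil {
		return nil, err
	}
	e.RequestID = requestID
	return json.Marshal(e)
}

// viaMap decodes into a generic map, injects requestId, and keeps every
// other field exactly as the runtime sent it.
func viaMap(payload []byte, requestID string) ([]byte, error) {
	var m map[string]any
	if err := json.Unmarshal(payload, &m); err != nil {
		return nil, err
	}
	if m == nil { // payload was JSON null
		m = map[string]any{}
	}
	m["requestId"] = requestID
	return json.Marshal(m)
}

// hasField reports whether the JSON object contains the given key.
func hasField(payload []byte, key string) bool {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(payload, &m); err != nil {
		return false
	}
	_, ok := m[key]
	return ok
}

func main() {
	in := []byte(`{"errorMessage":"handler not found","errorType":"Runtime.HandlerNotFound","stackTrace":[]}`)
	s, _ := viaStruct(in, "")
	m, _ := viaMap(in, "")
	fmt.Println("struct keeps stackTrace:", hasField(s, "stackTrace"))
	fmt.Println("map keeps stackTrace:   ", hasField(m, "stackTrace"))
}
```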

cmd/localstack/awsutil.go — modified

PrintEndReports gains structured parameters — initDurationMS/hasInitDuration, status, and errorType — and renders the REPORT line (including Init Duration placed after Max Memory Used, AWS field order, and the Status / Error Type fields) in one place, instead of receiving pre-formatted string fragments. This keeps the REPORT-line format out of custom_interop.go and means a runtime-supplied error type containing a % verb cannot corrupt the line.
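
A structured REPORT renderer along these lines might look like the sketch below. The `reportFields` type and `renderReport` function are hypothetical, the tab separators are assumed, and placing Status and Error Type after Init Duration is an assumption of this sketch. The point it shows is that runtime-supplied values are only ever passed as arguments, never spliced into the format string.

```go
package main

import (
	"fmt"
	"strings"
)

// reportFields carries the structured values for one REPORT line.
type reportFields struct {
	RequestID       string
	DurationMS      float64
	BilledMS        int
	MemorySizeMB    int
	MaxMemoryUsedMB int
	InitDurationMS  float64
	HasInitDuration bool
	Status          string // "" when the invocation succeeded
	ErrorType       string // "" when there is no error type to report
}

// renderReport builds the REPORT line in one place. Every dynamic value is a
// format argument, so a "%" inside ErrorType is printed literally.
func renderReport(f reportFields) string {
	var b strings.Builder
	fmt.Fprintf(&b, "REPORT RequestId: %s\tDuration: %.2f ms\tBilled Duration: %d ms\tMemory Size: %d MB\tMax Memory Used: %d MB",
		f.RequestID, f.DurationMS, f.BilledMS, f.MemorySizeMB, f.MaxMemoryUsedMB)
	if f.HasInitDuration {
		fmt.Fprintf(&b, "\tInit Duration: %.2f ms", f.InitDurationMS)
	}
	if f.Status != "" {
		fmt.Fprintf(&b, "\tStatus: %s", f.Status)
	}
	if f.ErrorType != "" {
		fmt.Fprintf(&b, "\tError Type: %s", f.ErrorType)
	}
	return b.String()
}

func main() {
	fmt.Println(renderReport(reportFields{
		RequestID: "r1", DurationMS: 1.5, BilledMS: 2,
		MemorySizeMB: 128, MaxMemoryUsedMB: 40,
		Status: "error", ErrorType: "Runtime.%d", // hostile-looking type stays intact
	}))
}
```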

cmd/localstack/logs.go — modified

Documents why captured runtime output is emitted verbatim: AWS keeps a bare CR inside a single CloudWatch log event (records split on LF only), and LocalStack's log ingestion likewise splits on "\n", so rewriting CR→LF would wrongly split a record such as print("a\rb") into two events (see TestCloudwatchLogs::test_multi_line_prints).
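
The effect is easy to reproduce. In the sketch below, `splitEvents` is a simplified, hypothetical stand-in for LF-only record splitting, and `rewriteCR` models the reverted CR→LF conversion (simplified: it does not special-case CRLF).

```go
package main

import (
	"fmt"
	"strings"
)

// splitEvents cuts captured output on "\n" only and drops empty records.
func splitEvents(output string) []string {
	var events []string
	for _, rec := range strings.Split(output, "\n") {
		if rec != "" {
			events = append(events, rec)
		}
	}
	return events
}

// rewriteCR models the reverted behavior of turning every bare CR into LF.
func rewriteCR(output string) string {
	return strings.ReplaceAll(output, "\r", "\n")
}

func main() {
	out := "a\rb\n" // what print("a\rb") writes

	fmt.Printf("%q\n", splitEvents(out))            // ["a\rb"]: one event, CR preserved
	fmt.Printf("%q\n", splitEvents(rewriteCR(out))) // ["a" "b"]: wrongly split in two
}
```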

internal/lsapi/types.go — modified

Documents the deliberate dual shape of init-error payloads on lsapi.ErrorResponse: runtime-reported payloads carry a blank requestId (and possibly an empty stackTrace) while platform-synthesized payloads carry neither — both validated by localstack-pro's AWS snapshots — so the asymmetry between the two ReportInitFailure paths is not mistaken for a bug.

cmd/localstack/events_test.go — new

Unit tests for the events API's rendering matrix (the START/INIT_REPORT flavors: successful on-demand cold-start, successful provisioned-concurrency init, init failure, init timeout, and suppressed init).

README-LOCALSTACK.md — modified

Documents internal/lsapi as a LocalStack-only package in the fork's custom-changes list.

Tests

Behavior is validated by the AWS-recorded integration tests in localstack/localstack-pro#7293:

| Scenario | Test |
| --- | --- |
| Exception raised during module import | test_lambda_runtime_error |
| sys.exit() called during init | test_lambda_runtime_exit |
| Unbounded recursion → RecursionError segfault during init | test_lambda_runtime_exit_segfault |
| Handler function does not exist in module | test_lambda_handler_not_found |
| Missing AWS_LAMBDA_EXEC_WRAPPER script | test_lambda_runtime_wrapper_not_found |
| Init phase exceeds 10 s → transparent suppressed-init retry under the function timeout | test_lambda_timeout_init_phase |
| Init phase exceeds 10 s → retried init crashes without /init/error | test_lambda_init_timeout_then_crash |
| Successful provisioned-concurrency init/invoke logging | test_provisioned_concurrency_logging |
| Init failure of a provisioned-concurrency environment | test_provisioned_concurrency_init_failure |

Go unit tests: go test ./cmd/localstack/.

Related

Depends on #101
Closes DEVX-1

🤖 Generated with Claude Code

Base automatically changed from localstack-api-compat-test to localstack June 9, 2026 07:13
@joe4dev joe4dev force-pushed the devx-1-implement-lambda-runtime-init-error-reporting branch from 348ee93 to ff646e5 Compare June 9, 2026 07:34
@joe4dev joe4dev changed the base branch from localstack to main June 9, 2026 07:37
@joe4dev joe4dev changed the base branch from main to localstack June 9, 2026 07:37
@joe4dev joe4dev force-pushed the devx-1-implement-lambda-runtime-init-error-reporting branch 3 times, most recently from 8137f8b to c7f0e6f Compare June 9, 2026 12:19
joe4dev and others added 13 commits June 11, 2026 09:17
…s API

Ports the supervisor and events API from PR #41 to enable proper error
reporting when a Lambda runtime process exits unexpectedly (e.g. sys.exit()
or missing wrapper script), instead of LocalStack timing out with a generic
error.

- Add LocalStackSupervisor: wraps ProcessSupervisor, detects unexpected
  runtime-* process exits and emits SendFault(RuntimeExit) events
- Add LocalStackEventsAPI: wraps StandaloneEventsAPI, overrides SendFault
  to forward errors to LocalStack via SendStatus(error, ...)
- Wire both into SandboxBuilder via SetEventsAPI / SetSupervisor
- Refactor NewCustomInteropServer to accept a pre-created *LocalStackAdapter
  shared with the events API
- Improve SendInitErrorResponse: properly deserialises the payload, includes
  RequestId, and sends asynchronously (non-blocking)

Enables test_lambda_runtime_exit and test_lambda_runtime_wrapper_not_found.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Use *string for the RequestId field in ErrorResponse so that an empty
string is serialized (not omitted by omitempty), while nil — used for
fault events — stays omitted. Fixes test_lambda_runtime_error snapshot
mismatch where requestId: "" was expected but absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Move Init Duration after Max Memory Used in REPORT line (matches AWS)
- Add Status: timeout to REPORT line on invoke timeout
- Fix timeout error message format to "RequestId: <id> Error: Task timed out after N.00 seconds"
- Add ErrorType: "Sandbox.Timedout" to timeout error response
- Track init start time and emit Init Duration on first non-retry invocation
- Add is-init-retry field to InvokeRequest to suppress Init Duration on retry invokes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Close response bodies in SendStatus/SendLogs/SendResult so idle
  connections are released instead of leaked.
- Use errors.New instead of fmt.Errorf with no format arguments.
- Document the single-invoke assumption behind the unsynchronized
  initStart/warmStart fields.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Resolve the fault request ID in the events API: prefer an explicit ID,
then the current invoke ID so a mid-invocation runtime crash reports the
actual request, and only synthesize a UUID as a fallback for init-phase
faults where no invocation has been dispatched yet. Previously the
supervisor always passed a random UUID, masking the real invoke ID.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…pidcore

Move the Lambda init-phase timeout retry into the RIE and replace the custom
supervisor with rapidcore's existing init-failure machinery.

- rapidcore: add AwaitInitializedWithDetails (structured init outcome) and
  AwaitInitializedWithTimeout (timer-aware; does NOT consume the init-failures
  channel on timeout, so the invoke path's Reserve() can still drive suppressed
  init). Refactor awaitInitialized into interpretInitFailure.
- main.go: on init-phase timeout (LOCALSTACK_INIT_PHASE_TIMEOUT, default 10s),
  emit INIT_REPORT, signal ready, and reset the in-progress init so the first
  invoke re-runs it (suppressed init) under the function timeout. Genuine init
  failures are reported via SendInitError.
- custom_interop.go: SendInitError crash-path fallback with an initErrorForwarded
  dedup guard, formatted as AWS's "RequestId: <id> Error: <msg>"; ReportInitTimeout
  + initTimedOut-driven Init Duration suppression.
- Remove the custom LocalStackSupervisor/LocalStackEventsAPI; drop unused
  IsInitRetry from lsapi.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The synthetic START line was written eagerly when LocalStack dispatched /invoke,
i.e. before the suppressed init re-runs the function's static code. AWS emits
START upon the Invoke event reaching the runtime, which rapidcore sequences after
any inline (suppressed) init (doInvoke -> sendInvokeStartLogEvent).

Emit START from a minimal LocalStackEventsAPI.SendInvokeStart override (riding
rapidcore's correctly-placed invoke-start event) and drop the eager write in the
/invoke handler. Correct for warm, cold, and suppressed-init invocations; fixes
test_lambda_timeout_init_phase against the unmodified AWS snapshot.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…t it

Follow the established LocalStack-env strategy: capture the value in InitLsOpts
(before UnsetLsEnvs runs) and add it to the UnsetLsEnvs list. Previously it was
read inline with os.Getenv after UnsetLsEnvs, so the variable was never unset and
leaked into the function's environment (forwarded via os.Environ in InitHandler),
breaking AWS parity.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
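
The capture-then-unset ordering from the commit above can be sketched as follows. The names `lsOpts`, `initLsOpts`, `unsetLsEnvs`, and `demo` are hypothetical simplifications of the fork's helpers; only the environment variable name comes from the PR.

```go
package main

import (
	"fmt"
	"os"
)

const initPhaseTimeoutEnv = "LOCALSTACK_INIT_PHASE_TIMEOUT"

// lsEnvs lists LocalStack-only variables that must not leak into the function.
var lsEnvs = []string{initPhaseTimeoutEnv}

// lsOpts holds values captured from the environment at startup.
type lsOpts struct {
	InitPhaseTimeout string
}

// initLsOpts captures LocalStack-only settings. It must run BEFORE unsetLsEnvs.
func initLsOpts() lsOpts {
	return lsOpts{InitPhaseTimeout: os.Getenv(initPhaseTimeoutEnv)}
}

// unsetLsEnvs removes LocalStack-only variables so that a later os.Environ()
// snapshot handed to the function no longer contains them.
func unsetLsEnvs() {
	for _, name := range lsEnvs {
		os.Unsetenv(name)
	}
}

// demo sets the variable, applies capture-then-unset, and reports the captured
// value and whether the variable is still visible in the process environment.
func demo(value string) (captured string, leaked bool) {
	os.Setenv(initPhaseTimeoutEnv, value)
	opts := initLsOpts()
	unsetLsEnvs()
	_, leaked = os.LookupEnv(initPhaseTimeoutEnv)
	return opts.InitPhaseTimeout, leaked
}

func main() {
	captured, leaked := demo("7")
	fmt.Println("captured:", captured, "leaked:", leaked)
}
```

Reading the variable inline after the unset step would invert the order and either return nothing or, as in the bug described above, force the variable to be kept around.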
…ckTrace)

SendInitErrorResponse round-tripped the runtime's /init/error payload through a
typed struct whose stackTrace used omitempty, dropping an empty-but-present
"stackTrace": [] (as AWS emits for Runtime.HandlerNotFound). Decode into a map and
only inject requestId, forwarding the runtime's fields verbatim.

Fixes test_lambda_handler_not_found.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

AWS folds a failed cold-start init into the first invocation (suppressed init),
reporting it as a failed invoke with the full INIT_REPORT(phase=invoke)/START/END/
REPORT envelope rather than a separate init error. Match this for on-demand:

- main.go: on init failure for on-demand, signal ready and keep the process alive
  instead of SendInitError+exit, so the first invoke surfaces the cached init error.
- custom_interop: skip /status/error forwarding for on-demand (cache only, so the
  invoke carries the error); emit INIT_REPORT Phase:invoke Status:error Error Type
  before START and add Status/Error Type to the REPORT, using rapidcore's scrubbed
  fatal error type. PC/SnapStart/Managed Instances keep the provisioning-time model.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The runtime emits multi-line records (e.g. an unhandled-init traceback) as a single
log frame with internal newlines replaced by bare carriage returns. AWS renders
these back as line feeds, so convert bare CR to LF in the assembled log output while
preserving genuine CRLF endings (which AWS keeps, e.g. the LAMBDA_WARNING line).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…elper to its own file

The init-phase timeout / suppressed-init support previously edited the vendored
upstream internal/lambda/rapidcore/server.go (exported InitCompletionResponse,
extracted interpretInitFailure, added AwaitInitializedWithDetails and
AwaitInitializedWithTimeout). Modifying vendored upstream files causes rebase
conflicts when syncing with aws-lambda-runtime-interface-emulator.

Revert server.go to byte-identical upstream and move the only load-bearing
addition (AwaitInitializedWithTimeout, plus a local InitCompletionResponse and a
duplicated interpretInitFailure) into a new same-package file
server_localstack.go. It must stay in package rapidcore because it needs the
unexported getInitFailuresChan()/Release()/setRuntimeState() helpers, but as a
standalone file it never conflicts on rebase. AwaitInitializedWithDetails was
unused by cmd/localstack and is dropped.

No behavior change: RIE go test ./... and TestLambdaErrors both green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Commit 9b54e66 rewrote every bare carriage return in the captured runtime output to
a line feed, intending to render multi-line init tracebacks across lines. But AWS
keeps a bare CR inside a single CloudWatch log event (records split on LF only), and
LocalStack's log ingestion likewise splits on "\n". The conversion therefore wrongly
split any record containing a bare CR into multiple events, breaking the AWS-validated
TestCloudwatchLogs::test_multi_line_prints (a user `print("a\rb")` was emitted as two
events "a" and "b" instead of one event "a\rb").

Emit the runtime output verbatim. Verified: test_multi_line_prints and the full
TestLambdaErrors suite (incl. the runtime-exit/segfault error-reporting tests) are
both green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@joe4dev joe4dev force-pushed the devx-1-implement-lambda-runtime-init-error-reporting branch from 6bcb7e2 to 44fe9ff Compare June 11, 2026 07:17
@joe4dev joe4dev added the trigger:rc-release Trigger on-demand RC release for testing label Jun 11, 2026
@github-actions

🧪 RC pre-release ready: v0.0.0-rc.pr103-44fe9ff

Test this PR against localstack-pro CI by setting:

LAMBDA_INIT_RELEASE_VERSION=v0.0.0-rc.pr103-44fe9ff

Assets:

Built from 44fe9ffe85f15c3950d7a52ee12ab600e32a8587. integ-tests skipped in RC builds. This pre-release is auto-deleted when the PR is closed.

joe4dev and others added 3 commits June 11, 2026 15:12
…n extended-init timeout

Two fixes to the init-timeout case in the init-await switch:

1. The reset of a timed-out init now runs synchronously BEFORE signaling
   ready. Reset's cleanup (Clear/Release in rapidcore.Server.Reset) releases
   the current reservation, so running it concurrently with the first
   invoke's Reserve() raced that invoke's reservation and could cancel it
   mid-flight, returning an empty result while the suppressed init was
   still running. The reset cannot deadlock on the unconsumed init failure:
   awaitInitCompletion acks rapid before the (still pending) initFailures
   channel send, which the invoke path consumes later.

2. The suppressed-init retry model only applies to on-demand functions.
   Provisioned concurrency / Managed Instances environments that exceed
   their extended init window now fail provisioning via /status/error
   (AWS fails the provisioning operation) instead of signaling ready and
   inevitably re-running the long init into the shorter invoke timeout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… rapid's init duration

Reporting fixes for the on-demand folded-into-invoke (suppressed init) path:

- The invoke handler no longer unconditionally forces error status from the
  recorded init failure: the REPORT Status/Error Type lines and the /error
  routing now apply only when the invocation actually failed, and the
  recorded failure is cleared once an invocation succeeds. Previously one
  init failure permanently tainted a recovered environment - every later
  (successful) invocation was posted to /error with an error REPORT line.
  Persistent failures keep re-emitting the full failure envelope per invoke.

- RecordInitError captures the failure type detected by rapidcore for
  runtimes that crashed WITHOUT calling /init/error (sys.exit, segfault,
  invalid entrypoint), so those failures render the same
  INIT_REPORT(phase=invoke) and REPORT Status/Error Type lines as the
  /init/error-reported flavor instead of a healthy-looking envelope.

- REPORT's Init Duration now uses rapid's authoritative Init-phase
  measurement captured from the INIT_REPORT(phase=init) lifecycle event,
  instead of wall-clock time.Since(initStart) at invoke arrival, which
  wrongly included the idle gap between init completion and the first
  invoke (minutes for provisioned concurrency). It is also omitted for
  non-on-demand invokes, matching AWS, which reports provisioned
  concurrency init in separate provisioning-time log streams.

- Document that SnapStart environments intentionally use the on-demand
  error model in LocalStack (they initialize lazily at the first invoke,
  not at version publish).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… the suppressed init's real error

After the timeout path resets the in-progress init, awaitInitCompletion is
parked sending a ResetReceived failure on the unbuffered initFailures
channel. The first invoke's Reserve()/awaitInitialized() consumed it and
cached a generic placeholder error (Sandbox.Failure with an EMPTY payload,
see the ErrInitResetReceived handling in Invoke), which took precedence over
the real failure when the suppressed init re-run crashed without calling
/init/error: the invocation returned an empty error payload instead of e.g.
Runtime.ExitError "Runtime exited with error: exit status 1" (AWS-validated
by test_lambda_init_timeout_then_crash in localstack-pro).

Drain the notification right after the synchronous reset, so the invoke's
awaitInitialized() observes the closed channel, treats the init outcome as
pending, and the suppressed init's own result stays authoritative. This also
unparks the awaitInitCompletion goroutine for environments that time out
their init but never receive an invoke.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@joe4dev joe4dev added trigger:rc-release Trigger on-demand RC release for testing and removed trigger:rc-release Trigger on-demand RC release for testing labels Jun 11, 2026
@github-actions

🧪 RC pre-release ready: v0.0.0-rc.pr103-e7516c4

Test this PR against localstack-pro CI by setting:

LAMBDA_INIT_RELEASE_VERSION=v0.0.0-rc.pr103-e7516c4

Assets:

Built from e7516c41917d3fc19b61c93a031732fc0582e514. integ-tests skipped in RC builds. This pre-release is auto-deleted when the PR is closed.

…cumentation

- REPORT line: pass the pre-formatted Init Duration/Status fragments as %s
  arguments instead of concatenating them into the Fprintf format string,
  where a runtime-supplied error type containing a formatting verb would
  corrupt the line (the fatal-error scrub regex is unanchored).
- SendInitErrorResponse returns the delegate's error again (pre-branch
  behavior): the /runtime/init/error handler renders an interop error to
  the runtime based on it (e.g. ErrResponseSent during a suppressed init).
- LocalStack adapter POSTs now fail on non-2xx responses (e.g. LocalStack
  rejects a duplicate /status/error with 400, previously treated as
  success) and share one post() helper.
- LOCALSTACK_INIT_PHASE_TIMEOUT is validated (> 0; a 0 previously forced
  every init down the timeout path) and parsed once instead of
  double-defaulting.
- SendInitError fault messages use a blank requestId, matching the
  /init/error path which forwards AWS's blank init-phase requestId; the
  unused uuid dependency is dropped and lsapi.ErrorResponse.RequestId
  reverts to a plain string (never set anywhere), restoring
  internal/lsapi/types.go to its pre-branch state.
- Dedupe the INIT_REPORT line format and the ns->ms conversion.
- Document fork changes per README-LOCALSTACK.md convention: LOCALSTACK
  CHANGES prefix in server_localstack.go and new entries for it and
  internal/lsapi in the custom-changes list.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
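
The shared POST helper from the commit above might look roughly like this. It is a sketch, not the adapter's code: `checkStatus` and this `post` signature are hypothetical, and the JSON content type is an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// checkStatus treats any status code outside 200-299 as an error.
func checkStatus(code int) error {
	if code < 200 || code > 299 {
		return fmt.Errorf("unexpected status code %d", code)
	}
	return nil
}

// post sends body to url, always closes the response body so the connection
// is released, and fails on non-2xx responses.
func post(client *http.Client, url string, body []byte) error {
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// Drain the body so the underlying connection can be reused.
	_, _ = io.Copy(io.Discard, resp.Body)
	return checkStatus(resp.StatusCode)
}

func main() {
	// A 400 (e.g. a rejected duplicate report) now surfaces as an error.
	fmt.Println(checkStatus(http.StatusOK), checkStatus(http.StatusBadRequest))
}
```

Routing every sender through one helper keeps the close-and-check logic in a single place instead of repeating it per endpoint.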
@joe4dev joe4dev added trigger:rc-release Trigger on-demand RC release for testing and removed trigger:rc-release Trigger on-demand RC release for testing labels Jun 11, 2026
@github-actions

🧪 RC pre-release ready: v0.0.0-rc.pr103-867829a

Test this PR against localstack-pro CI by setting:

LAMBDA_INIT_RELEASE_VERSION=v0.0.0-rc.pr103-867829a

Assets:

Built from 867829ae788c1d76c3a68c95d9b893898e232ab9. integ-tests skipped in RC builds. This pre-release is auto-deleted when the PR is closed.

joe4dev and others added 2 commits June 11, 2026 19:33
…rt init

AWS performs a suppressed double init when an on-demand function's
cold-start init fails: it emits INIT_REPORT(phase=init, status=error)
for the failed cold-start init and INIT_REPORT(phase=invoke, status=error)
for the retried init folded into the first invocation. The RIE only
emitted the phase=invoke line.

Add ReportInitPhaseError() to emit the missing phase=init line and call
it on the on-demand init-failure path (onDemand && ErrInitDoneFailed),
mirroring the existing ReportInitTimeout() / phase=invoke emission. This
makes test_lambda_runtime_exit_segfault pass against LocalStack.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…drop rapidcore additions

Simplify the init error-reporting and init-timeout implementation with no
behavior change (TestLambdaErrors 12/12, timeout/PC/log-envelope tests, and
full test_lambda.py green against the rebuilt RIE):

* Delete internal/lambda/rapidcore/server_localstack.go: the timeout-bounded
  await is now a goroutine running the exported AwaitInitialized() plus a
  select in main.go. On timeout, receiving the goroutine's result after the
  reset doubles as the init-failure drain and orders its cleanup before the
  ready signal, so it can never cancel the first invoke's fresh reservation.
  No LocalStack-owned code remains under internal/.

* Render all INIT_REPORT flavors (cold-start failure, timeout, suppressed
  init) in LocalStackEventsAPI from rapid's native InitStart/InitRuntimeDone/
  InitReport events instead of three hand-rolled call sites. The suppressed
  init's Phase: invoke line now reports that init's measured duration after
  its own logs, matching AWS more closely.

* Record the scrubbed fatal error type natively from SendInitRuntimeDone
  (which fires even for crashes that never call /init/error), replacing the
  initErrorType mirroring; the cold-start Init Duration becomes take-once
  state, replacing warmStart/initTimedOut; wall-clock fallbacks are dropped
  because rapid's InitReport event always fires.

* Collapse /status/error reporting into a single post site
  (ReportInitFailure: forward the stashed /init/error payload, synthesize
  otherwise), removing the initErrorForwarded duplicate-send guard.

* Add unit tests for the events API rendering matrix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@joe4dev joe4dev added trigger:rc-release Trigger on-demand RC release for testing and removed trigger:rc-release Trigger on-demand RC release for testing labels Jun 11, 2026
@github-actions

🧪 RC pre-release ready: v0.0.0-rc.pr103-fde9640

Test this PR against localstack-pro CI by setting:

LAMBDA_INIT_RELEASE_VERSION=v0.0.0-rc.pr103-fde9640

Assets:

Built from fde9640454a6e5c6f0ac9aa8528eff3bb4200f0b. integ-tests skipped in RC builds. This pre-release is auto-deleted when the PR is closed.

joe4dev and others added 6 commits June 12, 2026 09:08
…bed)

LocalStackEventsAPI embedded StandaloneEventsAPI, which appends every
platform event (START, runtimeDone, end, report, ...) to an in-memory
event log whose only drain, FetchTailLogs, is never called in this
deployment — i.e. unbounded memory growth in warm environments,
measured at ~2.2 KB RSS per warm invoke (+41 MB over 20k invokes).

Embed NoOpEventsAPI instead (the upstream sandbox default before this
branch): same interop.EventsAPI surface, nothing retained. The
overridden lifecycle hooks keep their LocalStack-specific behavior and
simply no longer forward to the standalone event log.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…tError

The recorded init error type was sticky across init attempts and cleared
cross-component by the invoke handler after a successful invocation.
Stickiness is unnecessary: every invocation into a failed-init
environment starts a fresh suppressed Init phase (rapidcore shuts the
runtime down after an init failure so the next FastInvoke re-inits), so
each failing attempt re-records its error type via its own
SendInitReport. Resetting the field in SendInitStart therefore keeps the
same behavior with one mechanism instead of two.

This also fixes a mislabeling bug: previously, when a suppressed init
re-run recovered but the invocation itself then died fatally
(ErrInvokeDoneFailed), the stale sticky error type tainted that
invocation's REPORT line with the original init failure.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…erver

The adapter was hoisted into main.go "so it can be shared with the
interop server", but nothing in main.go uses it after the constructor
call — main reaches the adapter through interopServer.localStackAdapter.
Revert to in-constructor creation and drop the extra parameter. Also fix
a stale comment in events.go referencing a CustomInteropServer.onDemand
field that does not exist.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Runtime-reported error payloads include a blank "requestId" (and
possibly an empty "stackTrace") on AWS, while platform-synthesized
payloads (Sandbox.Timedout, Runtime.ExitError) carry neither — both
validated by localstack-pro's AWS snapshots. Spell this out on
lsapi.ErrorResponse and adaptInitErrorPayload so the asymmetry between
the two ReportInitFailure paths is not mistaken for a bug and unified.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

PrintEndReports took pre-formatted "Init Duration: ..." and
"Status: ...\tError Type: ..." string fragments, splitting knowledge of
the REPORT line format between custom_interop.go and awsutil.go. Pass
the structured values (duration, status, error type) instead and render
the line in PrintEndReports only. Output is byte-identical.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

On-demand cold-start inits fold their duration into the first invocation's
REPORT line, but provisioned-concurrency / Managed Instances environments
initialize ahead of time and their invokes omit "Init Duration". AWS instead
emits a standalone INIT_REPORT line in the invoke-serving log stream carrying
only the duration -- no Phase, Status, or Error Type (those appear only on
failed or timed-out init lines).

SendInitReport previously emitted nothing for these successful non-on-demand
inits, so LocalStack's log stream diverged from AWS. Add the bare INIT_REPORT
line for that case and update the unit test accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@joe4dev joe4dev added trigger:rc-release Trigger on-demand RC release for testing and removed trigger:rc-release Trigger on-demand RC release for testing labels Jun 12, 2026
@github-actions

🧪 RC pre-release ready: v0.0.0-rc.pr103-5626043

Test this PR against localstack-pro CI by setting:

LAMBDA_INIT_RELEASE_VERSION=v0.0.0-rc.pr103-5626043

Assets:

Built from 5626043bc606073e8508307296323b12c8d23fb2. integ-tests skipped in RC builds. This pre-release is auto-deleted when the PR is closed.

@joe4dev joe4dev marked this pull request as ready for review June 12, 2026 10:27
Labels

trigger:rc-release Trigger on-demand RC release for testing

1 participant