Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 24 additions & 6 deletions src/lib/api/sourcemaps.ts
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,8 @@
import { tmpdir } from "node:os";
import { join } from "node:path";
import { promisify } from "node:util";
import { gzip as gzipCb } from "node:zlib";
// biome-ignore lint/performance/noNamespaceImport: needed for feature-detected zstd access
import * as zlib from "node:zlib";
import pLimit from "p-limit";
import { z } from "zod";
import { ApiError } from "../errors.js";
Expand All @@ -34,7 +35,22 @@
import { type ZipCompression, ZipWriter } from "../sourcemap/zip.js";
import { apiRequestToRegion } from "./infrastructure.js";

const gzipAsync = promisify(gzipCb);
const gzipAsync = promisify(zlib.gzip);
// zstdCompress is available in Node 22.15+. Feature-detect to avoid crashing
// the npm bundle on older Node versions (e.g., CI runners with Node 20).
// zstdCompress is available in Node 22.15+. Feature-detect to avoid crashing
// the npm bundle on older Node versions (e.g., CI runners with Node 20).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated feature-detection comments in two files

Low Severity

The two-line comment "zstdCompress is available in Node 22.15+. Feature-detect to avoid crashing…" is repeated verbatim twice in a row — a copy-paste error. The same duplication exists in zstd-transport.ts lines 82–85.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 054955a. Configure here.

// biome-ignore lint/suspicious/noExplicitAny: zstd types unavailable on older @types/node
const zstdCompressFn = (zlib as any).zstdCompress as
| ((...args: unknown[]) => unknown)
| undefined;
const zstdCompressAsync =
typeof zstdCompressFn === "function"
? (promisify(zstdCompressFn) as (
buf: Buffer,
opts?: unknown
) => Promise<Buffer>)
: undefined;
const log = logger.withTag("api.sourcemaps");

// ── Schemas ─────────────────────────────────────────────────────────
Expand Down Expand Up @@ -195,20 +211,22 @@
/**
* Compress a chunk buffer with the chosen codec. Exported for testing.
*
* Both codecs run off-thread (Bun's zstd worker and libuv's zlib thread
* pool), so a chunk being compressed doesn't block the event loop --
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zstd encoding selected without runtime support check

High Severity

pickUploadEncoding returns "zstd" whenever the server advertises it, without checking zstdCompressAsync. On Node < 22.15, encodeChunk silently returns raw bytes (falls through both if branches), but uploadChunk still sets Content-Encoding: zstd. The server then tries to zstd-decompress uncompressed data and fails. Additionally, zipCompression is set to "stored" (no deflate) because a wire codec was selected—compounding the issue.

Additional Locations (2)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f652595. Configure here.

* Both codecs run off-thread via libuv's thread pool, so a chunk
* being compressed doesn't block the event loop --
* with `concurrency=8`, eight uploads truly compress in parallel.
*/
export async function encodeChunk(
buf: Buffer,
encoding: UploadEncoding | undefined
): Promise<Uint8Array> {
if (encoding === "zstd") {
if (encoding === "zstd" && zstdCompressAsync) {

Check failure on line 222 in src/lib/api/sourcemaps.ts

View check run for this annotation

@sentry/warden / warden: find-bugs

Uncompressed data sent with `Content-Encoding: zstd` when zstdCompressAsync is unavailable

When `zstdCompressAsync` is `undefined` (Node &lt; 22.15), `encodeChunk` silently returns the raw buffer, but the caller at line 288 still sets `Content-Encoding: zstd` — causing the server to attempt zstd-decompressing uncompressed data and fail the upload.
// L3 is libzstd's default; passed explicitly for self-documenting
// code. L9+ trades ~14% size for 4x compress time and forces the
// server's decoder to allocate 15-30 MiB of window state -- not
// worth it once decode cost is counted.
return await Bun.zstdCompress(buf, { level: 3 });
return await zstdCompressAsync(buf, {
params: { [zlib.constants.ZSTD_c_compressionLevel]: 3 },
});
Comment on lines +222 to +229
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: On Node.js < 22.15, uploadChunk sends uncompressed data with a Content-Encoding: zstd header because encodeChunk silently fails to compress but the header is set regardless.
Severity: HIGH

Suggested Fix

encodeChunk should return an object indicating both the resulting buffer and the encoding that was actually applied (e.g., { data: Buffer, encoding: 'zstd' | 'identity' }). The uploadChunk function should then use this returned encoding to correctly set the Content-Encoding header only when the data is actually compressed. This prevents a mismatch between the payload and its encoding header.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/lib/api/sourcemaps.ts#L222-L229

Potential issue: On Node.js versions older than 22.15, the `zstdCompressAsync` function
is unavailable. When a server advertises `zstd` support, the `encodeChunk` function is
called with `encoding: "zstd"` but falls through and returns an uncompressed buffer
because `zstdCompressAsync` is falsy. However, the `uploadChunk` function proceeds to
set the `Content-Encoding: zstd` header based on the initial `encoding` parameter,
without verifying if compression actually occurred. This results in sending uncompressed
data with a `zstd` header, causing the server to fail during decompression and reject
the sourcemap upload.

}
if (encoding === "gzip") {
// zlib default (L6). Counter-intuitively, lower levels (L1/L5)
Expand Down
118 changes: 64 additions & 54 deletions src/lib/bspatch.ts
Original file line number Diff line number Diff line change
Expand Up @@ -5,20 +5,10 @@
* TRDIFF10 format (produced by zig-bsdiff with `--use-zstd`). Designed for
* minimal memory usage during CLI self-upgrades:
*
* - Old binary: copy-then-mmap for 0 JS heap (CoW on btrfs/xfs/APFS),
* falling back to `arrayBuffer()` if copy/mmap fails
* - Diff/extra blocks: streamed via `DecompressionStream('zstd')`
* - Output: written incrementally to disk via `Bun.file().writer()`
* - Integrity: SHA-256 computed inline via `Bun.CryptoHasher`
*
* `Bun.mmap()` cannot target the running binary directly because it opens
* with PROT_WRITE/O_RDWR:
* - macOS: AMFI sends uncatchable SIGKILL (writable mapping on signed Mach-O)
* - Linux: ETXTBSY from `open()` (kernel blocks write-open on running ELF)
*
* The copy-then-mmap strategy sidesteps both: the copy is a regular file
* with no running process, so mmap succeeds. On CoW-capable filesystems
* (btrfs, xfs, APFS) the copy is near-instant with zero extra disk I/O.
* - Old binary: copy to temp file, then read via `readFile()` (~100 MB heap)
* - Diff/extra blocks: streamed via zstd `Transform` from `node:zlib`
* - Output: written incrementally to disk via `createWriteStream()`
* - Integrity: SHA-256 computed inline via `node:crypto`
*
* TRDIFF10 format (from zig-bsdiff):
* ```
Expand All @@ -30,10 +20,13 @@
* ```
*/

import { constants, copyFileSync } from "node:fs";
import { createHash } from "node:crypto";
import { constants, copyFileSync, createWriteStream } from "node:fs";
import { readFile, unlink } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { Readable } from "node:stream";
import { createZstdDecompress, zstdDecompressSync } from "node:zlib";

Check failure on line 29 in src/lib/bspatch.ts

View check run for this annotation

@sentry/warden / warden: find-bugs

[4G5-AL6] Uncompressed data sent with `Content-Encoding: zstd` when zstdCompressAsync is unavailable (additional location)

When `zstdCompressAsync` is `undefined` (Node &lt; 22.15), `encodeChunk` silently returns the raw buffer, but the caller at line 288 still sets `Content-Encoding: zstd` — causing the server to attempt zstd-decompressing uncompressed data and fail the upload.

/** TRDIFF10 header magic bytes */
const TRDIFF10_MAGIC = "TRDIFF10";
Expand Down Expand Up @@ -123,7 +116,7 @@
/**
* Buffered reader over a `ReadableStream` that serves exact byte counts.
*
* Wraps a `DecompressionStream` output reader to provide `read(n)` semantics:
* Wraps a decompression stream output reader to provide `read(n)` semantics:
* pulls chunks from the underlying stream as needed, buffers leftover bytes,
* and returns exactly `n` bytes per call.
*/
Expand Down Expand Up @@ -195,34 +188,41 @@
/**
* Create a streaming zstd decompressor from a compressed buffer.
*
* Wraps the compressed data in a ReadableStream, pipes through
* DecompressionStream('zstd'), and returns a BufferedStreamReader
* for on-demand byte consumption.
* Pipes the compressed data through `node:zlib`'s zstd decompressor and
* returns a BufferedStreamReader for on-demand byte consumption.
*
* @param compressed - Zstd-compressed data
* @returns BufferedStreamReader for incremental decompression
*/
function createZstdStreamReader(compressed: Uint8Array): BufferedStreamReader {
const input = new ReadableStream<Uint8Array>({
// Convert the node:zlib Transform stream into a Web ReadableStream
// so BufferedStreamReader can consume it with the same interface.
const nodeStream = Readable.from(Buffer.from(compressed)).pipe(
createZstdDecompress()
);

const webStream = new ReadableStream<Uint8Array>({
start(controller) {
controller.enqueue(compressed);
controller.close();
nodeStream.on("data", (chunk: Buffer) => {
controller.enqueue(new Uint8Array(chunk));
});
nodeStream.on("end", () => {
controller.close();
});
nodeStream.on("error", (err) => {
controller.error(err);
});
},
});

// Bun supports 'zstd' but the standard CompressionFormat type doesn't include it
const decompressed = input.pipeThrough(
new DecompressionStream("zstd" as "deflate")
);

return new BufferedStreamReader(
decompressed.getReader() as ReadableStreamDefaultReader<Uint8Array>
webStream.getReader() as ReadableStreamDefaultReader<Uint8Array>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zstd bridge decompresses eagerly instead of on-demand

Medium Severity

createZstdStreamReader pipes the full compressed block through a flowing Node stream immediately upon construction, decompressing eagerly. Both diff and extra readers are created (lines 304–307) before loadOldBinary awaits, so both blocks can fully decompress and buffer in memory concurrently—contradicting the module's "minimal memory usage" design. The old DecompressionStream approach was pull-based and only decompressed on demand.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit f652595. Configure here.

);
Comment thread
sentry-warden[bot] marked this conversation as resolved.
}

/** Result of loading the old binary for patching */
type OldFileHandle = {
/** Memory-mapped or in-memory view of the old binary */
/** In-memory view of the old binary */
data: Uint8Array;
/** Cleanup function to call after patching (removes temp copy, if any) */
cleanup: () => void | Promise<void>;
Expand All @@ -231,14 +231,10 @@
/**
* Load the old binary for read access during patching.
*
* Strategy: copy to temp file, then try mmap on the copy. The copy is a
* regular file (no running process), so `Bun.mmap()` succeeds on both
* Linux and macOS — ETXTBSY (Linux) and AMFI SIGKILL (macOS) only affect
* the running binary's inode, not a copy. On CoW filesystems (btrfs, xfs,
* APFS) the copy is a metadata-only reflink (near-instant).
*
* Falls back to `Bun.file().arrayBuffer()` (~100 MB heap) if copy or
* mmap fails for any reason.
* Strategy: copy to temp file, then read into memory. The copy avoids
* ETXTBSY (Linux) / AMFI SIGKILL (macOS) issues with reading the running
* binary directly. On CoW filesystems (btrfs, xfs, APFS) the copy is a
* metadata-only reflink (near-instant).
*/
let loadCounter = 0;

Expand All @@ -253,18 +249,15 @@
// silently falls back to regular copy on filesystems that don't support it.
copyFileSync(oldPath, tempCopy, constants.COPYFILE_FICLONE);

// mmap the copy — safe because it's a separate inode, not the running
// binary. MAP_PRIVATE avoids write-back to disk.
const data = Bun.mmap(tempCopy, { shared: false });
return {
data,
data: new Uint8Array(await readFile(tempCopy)),
cleanup: () =>
unlink(tempCopy).catch(() => {
/* Best-effort cleanup — OS will reclaim on reboot */
}),
};
} catch {
// Copy or mmap failed — fall back to reading into JS heap
// Copy failed — read directly into JS heap
await unlink(tempCopy).catch(() => {
/* May not exist if copyFileSync failed */
});
Expand All @@ -280,10 +273,9 @@
/**
* Apply a TRDIFF10 binary patch with streaming I/O for minimal memory usage.
*
* Copies the old file to a temp path and mmaps the copy (0 JS heap), falling
* back to `arrayBuffer()` if mmap fails. Streams diff/extra blocks via
* `DecompressionStream('zstd')`, writes output via `Bun.file().writer()`,
* and computes SHA-256 inline.
* Copies the old file to a temp path and reads it into memory. Streams
* diff/extra blocks via `node:zlib` zstd decompressor, writes output via
* `createWriteStream()`, and computes SHA-256 inline.
*
* @param oldPath - Path to the existing (old) binary file
* @param patchData - Complete TRDIFF10 patch file contents
Expand All @@ -304,7 +296,7 @@
const extraStart = diffStart + diffLen;

// Control block is tiny — decompress fully for random access to tuples
const controlBlock = Bun.zstdDecompressSync(
const controlBlock = zstdDecompressSync(
patchData.subarray(controlStart, diffStart)
);

Expand All @@ -314,14 +306,20 @@
);
const extraReader = createZstdStreamReader(patchData.subarray(extraStart));

// Load old binary via copy-then-mmap (0 JS heap) or arrayBuffer fallback.
// See loadOldBinary() for why direct mmap of the running binary is impossible.
// Load old binary via copy-then-read (or direct read as fallback).
const { data: oldFile, cleanup: cleanupOldFile } =
await loadOldBinary(oldPath);

// Streaming output: write directly to disk, no output buffer in memory
const writer = Bun.file(destPath).writer();
const hasher = new Bun.CryptoHasher("sha256");
// Streaming output: write directly to disk, compute SHA-256 inline
const writer = createWriteStream(destPath);
const hasher = createHash("sha256");

// Capture write errors early — without a listener, Node crashes with
// ERR_UNHANDLED_ERROR if a write fails (ENOSPC, EIO, etc.) during the loop.
let writeError: Error | undefined;
writer.on("error", (err) => {
writeError ??= err;
});

let oldpos = 0;
let newpos = 0;
Expand All @@ -333,6 +331,9 @@
controlPos < controlBlock.byteLength;
controlPos += 24
) {
if (writeError) {
break;
}
const readDiffBy = offtin(controlBlock, controlPos);
const readExtraBy = offtin(controlBlock, controlPos + 8);
const seekBy = offtin(controlBlock, controlPos + 16);
Expand Down Expand Up @@ -367,7 +368,16 @@
}
} finally {
try {
await writer.end();
await new Promise<void>((resolve, reject) => {
writer.end((err?: Error | null) => {
const finalErr = err ?? writeError;
if (finalErr) {
reject(finalErr);
} else {
resolve();
}
});
});
Comment thread
cursor[bot] marked this conversation as resolved.
} finally {
await cleanupOldFile();
}
Comment thread
sentry[bot] marked this conversation as resolved.
Expand All @@ -380,5 +390,5 @@
);
}

return hasher.digest("hex") as string;
return hasher.digest("hex");
}
60 changes: 26 additions & 34 deletions src/lib/telemetry/zstd-transport.ts
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,9 @@
* SDK's default behavior byte-for-byte so there's no regression.
*
* Codec selection is one-shot, performed at factory-construction time.
* No per-request branching: if `Bun.zstdCompress` is available when the
* transport is created, every envelope uses zstd; otherwise every
* envelope uses gzip.
* No per-request branching: if `node:zlib` zstd support is available
* when the transport is created, every envelope uses zstd; otherwise
* every envelope uses gzip.
*
* This mirrors `@sentry/node-core/transports/http.js` `makeNodeTransport`
* — URL parsing, `no_proxy` handling, proxy agent, CA certs, keepAlive,
Expand All @@ -32,7 +32,8 @@ import * as http from "node:http";
import * as https from "node:https";
import { Readable } from "node:stream";
import { promisify } from "node:util";
import { gzip as gzipCb } from "node:zlib";
// biome-ignore lint/performance/noNamespaceImport: needed for feature-detected zstd access
import * as zlib from "node:zlib";
import {
createTransport,
suppressTracing,
Expand Down Expand Up @@ -77,20 +78,22 @@ const ZSTD_THRESHOLD = 1024;
*/
const GZIP_THRESHOLD = 1024 * 32;

/**
* Shape of the globalThis.Bun subset we rely on. Bun's real types
* declare this, but the transport also runs under Node (via the
* feature-detected polyfill in `script/node-polyfills.ts`) where only
* a subset of Bun APIs are installed.
*/
type BunZstdHost = {
zstdCompress?: (
data: Uint8Array | Buffer | string | ArrayBuffer,
options?: { level?: number }
) => Promise<Buffer>;
};

const gzipAsync = promisify(gzipCb);
const gzipAsync = promisify(zlib.gzip);
// zstdCompress is available in Node 22.15+. Feature-detect to avoid crashing
// the npm bundle on older Node versions (e.g., CI runners with Node 20).
// zstdCompress is available in Node 22.15+. Feature-detect to avoid crashing
// the npm bundle on older Node versions (e.g., CI runners with Node 20).
// biome-ignore lint/suspicious/noExplicitAny: zstd types unavailable on older @types/node
const zstdCompressFn = (zlib as any).zstdCompress as
| ((...args: unknown[]) => unknown)
| undefined;
const zstdCompressAsync =
typeof zstdCompressFn === "function"
? (promisify(zstdCompressFn) as (
buf: Buffer,
opts?: unknown
) => Promise<Buffer>)
: undefined;

/**
* Factory for the SDK's `Sentry.init({ transport })` option.
Expand Down Expand Up @@ -285,20 +288,10 @@ export async function maybeCompress(
return { payload: buf, encodingApplied: "none" };
}

if (encoding === "zstd") {
// Belt-and-braces — `Bun` may have been swapped out between
// construction and first send. Fall through to gzip if so, but
// re-apply the gzip threshold so mid-sized bodies (1-32 KiB) don't
// get compressed when the SDK default would have shipped them raw.
const bun = (globalThis as { Bun?: BunZstdHost }).Bun;
if (!bun?.zstdCompress) {
if (buf.length <= GZIP_THRESHOLD) {
return { payload: buf, encodingApplied: "none" };
}
const gz = await gzipAsync(buf);
return { payload: gz, encodingApplied: "gzip" };
}
const out = await bun.zstdCompress(buf, { level: ZSTD_LEVEL });
if (encoding === "zstd" && zstdCompressAsync) {
const out = await zstdCompressAsync(buf, {
params: { [zlib.constants.ZSTD_c_compressionLevel]: ZSTD_LEVEL },
});
return {
payload: Buffer.from(out.buffer, out.byteOffset, out.byteLength),
encodingApplied: "zstd",
Expand All @@ -311,8 +304,7 @@ export async function maybeCompress(

/** Feature-detect zstd support on the current runtime. */
export function hasZstdSupport(): boolean {
const bun = (globalThis as { Bun?: BunZstdHost }).Bun;
return typeof bun?.zstdCompress === "function";
return zstdCompressAsync !== undefined;
}

/**
Expand Down
Loading
Loading