Benchmarks Website Version 3 by connortsui20 · Pull Request #7643 · vortex-data/vortex

connortsui20 · 2026-04-26T18:45:00Z

Summary

Rewrites the benchmarks website. Replaces the static data.json.gz model with a single Rust server binary that owns a DuckDB database and accepts POST /api/ingest from CI.

Design

Single binary: axum + maud (SSR HTML) + DuckDB + Chart.js. All static assets include_bytes!'d.
5 fact tables (compression time, query measurement, vector search, RAG, random access). Backup is a file copy.
Ingest: versioned JSON envelopes, bearer-token gated.
Migrator ports v2 history forward via a classifier that routes each record to a fact table or skips it with a typed reason.
Charts/groups slug-addressed, URL round-trip with no DB lookup.
Routes: /, /chart/{slug}, /group/{slug}, GET /api/chart/{slug}.
Deploy: one binary, one DuckDB file, one INGEST_BEARER_TOKEN.
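
The migrator's classifier (route each v2 record to a fact table, or skip it with a typed reason) could look roughly like the sketch below. It is only an illustration: the v2 record shape (`name`, `value`), the prefix table, and the `SkipReason` variants are all hypothetical, and the real migrator is written in Rust.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class FactTable(Enum):
    # The five fact tables named in the Design list (identifiers are assumed).
    COMPRESSION_TIME = "compression_time"
    QUERY_MEASUREMENT = "query_measurement"
    VECTOR_SEARCH = "vector_search"
    RAG = "rag"
    RANDOM_ACCESS = "random_access"


class SkipReason(Enum):
    # Hypothetical typed reasons; the actual set is not given in this PR text.
    MISSING_VALUE = "missing_value"
    UNKNOWN_BENCHMARK = "unknown_benchmark"


@dataclass(frozen=True)
class Routed:
    table: FactTable


@dataclass(frozen=True)
class Skipped:
    reason: SkipReason


# Hypothetical mapping from a v2 record-name prefix to a fact table.
PREFIX_TO_TABLE = {
    "compress time": FactTable.COMPRESSION_TIME,
    "query": FactTable.QUERY_MEASUREMENT,
    "vector-search": FactTable.VECTOR_SEARCH,
    "rag": FactTable.RAG,
    "random-access": FactTable.RANDOM_ACCESS,
}


def classify(record: dict) -> Union[Routed, Skipped]:
    """Return where a v2 record goes, or why it is skipped."""
    if record.get("value") is None:
        return Skipped(SkipReason.MISSING_VALUE)
    name = record.get("name", "")
    for prefix, table in PREFIX_TO_TABLE.items():
        if name.startswith(prefix):
            return Routed(table)
    return Skipped(SkipReason.UNKNOWN_BENCHMARK)


print(classify({"name": "random-access/taxi", "value": 41.0}))
print(classify({"name": "mystery", "value": 1.0}))
```

Returning a typed `Skipped` value rather than silently dropping a record means the migration can report how many records were skipped for each reason.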

UI/UX is still TBD — the relational backend opens up options we didn't have before.
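
One way to read "slug-addressed, URL round-trip with no DB lookup" from the Design list is that a slug encodes the chart's identifying fields directly, so the server can parse it back without consulting the database. The sketch below assumes that reading. The `ChartKey` fields, the comma separator, and the percent-encoding are hypothetical choices, not the format this PR uses.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote, unquote

SEP = ","  # assumed separator; quote(..., safe="") escapes it inside components


@dataclass(frozen=True)
class ChartKey:
    kind: str
    dataset: str
    dataset_variant: Optional[str] = None


def to_slug(key: ChartKey) -> str:
    """Encode the key's fields into a single URL path segment."""
    parts = [key.kind, key.dataset]
    if key.dataset_variant is not None:
        parts.append(key.dataset_variant)
    return SEP.join(quote(part, safe="") for part in parts)


def from_slug(slug: str) -> ChartKey:
    """Decode a slug back into a key using only the slug itself."""
    parts = [unquote(part) for part in slug.split(SEP)]
    if len(parts) not in (2, 3):
        raise ValueError(f"malformed slug: {slug!r}")
    return ChartKey(*parts)


key = ChartKey("query_measurement", "public-bi", "a,b/c")
slug = to_slug(key)
print(slug)
print(from_slug(slug) == key)
```

This sketch assumes the slug reaches `from_slug` still percent-encoded. A framework that decodes the path first would need a different separator scheme.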

codspeed-hq · 2026-04-26T18:48:45Z

Merging this PR will degrade performance by 26.23%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 5 improved benchmarks
❌ 3 regressed benchmarks
✅ 1161 untouched benchmarks
⏩ 138 skipped benchmarks¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | `patched_take_10k_first_chunk_only` | 302.4 µs | 272 µs | +11.2% |
| ⚡ | Simulation | `take_10k_dispersed` | 284.8 µs | 239.7 µs | +18.8% |
| ⚡ | Simulation | `patched_take_10k_adversarial` | 259 µs | 228.6 µs | +13.32% |
| ⚡ | Simulation | `patched_take_10k_dispersed` | 316 µs | 285.5 µs | +10.69% |
| ⚡ | Simulation | `take_10k_first_chunk_only` | 270.8 µs | 225.8 µs | +19.92% |
| ❌ | Simulation | `bitwise_not_vortex_buffer_mut[1024]` | 307.8 ns | 366.1 ns | -15.93% |
| ❌ | Simulation | `bitwise_not_vortex_buffer_mut[2048]` | 371.4 ns | 429.7 ns | -13.57% |
| ❌ | Simulation | `bitwise_not_vortex_buffer_mut[128]` | 246.1 ns | 333.6 ns | -26.23% |
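
The Efficiency column above is numerically consistent with `BASE / HEAD - 1`, and the headline 26.23% matches the largest regression in the table. The check below is an observation from the reported numbers, not a statement of how CodSpeed defines the metric. Small differences are expected because the displayed times are rounded.

```python
def efficiency_pct(base: float, head: float) -> float:
    """Relative change as base/head - 1, in percent (assumed formula)."""
    return (base / head - 1.0) * 100.0


# (benchmark, BASE, HEAD, reported efficiency %) copied from the table.
# Units (µs or ns) cancel within each row, so they are omitted here.
ROWS = [
    ("patched_take_10k_first_chunk_only", 302.4, 272.0, 11.2),
    ("take_10k_dispersed", 284.8, 239.7, 18.8),
    ("patched_take_10k_adversarial", 259.0, 228.6, 13.32),
    ("patched_take_10k_dispersed", 316.0, 285.5, 10.69),
    ("take_10k_first_chunk_only", 270.8, 225.8, 19.92),
    ("bitwise_not_vortex_buffer_mut[1024]", 307.8, 366.1, -15.93),
    ("bitwise_not_vortex_buffer_mut[2048]", 371.4, 429.7, -13.57),
    ("bitwise_not_vortex_buffer_mut[128]", 246.1, 333.6, -26.23),
]

for name, base, head, reported in ROWS:
    computed = efficiency_pct(base, head)
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")

# The worst (most negative) row corresponds to the headline figure.
worst = min(efficiency_pct(base, head) for _, base, head, _ in ROWS)
print(f"worst regression: {worst:.2f}%")
```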

Comparing ct/benchmarks-v3 (fbb0b39) with develop (d3ff1f1)

¹ 138 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

…erver (#7780)

## Summary

Prototype website: http://ec2-18-219-54-101.us-east-2.compute.amazonaws.com:3000/

This is the first step we should make before we cut over to the new benchmarks website on #7643.

This PR allows the CI actions to additionally post data to a server (on my EC2 instance for now). We want to check that this actually works before we start using this for all of our CI.

Note that this does NOT change how the current benchmarks website works, as this just does a few extra things on top of that.

Also for reviewers, even though this looks like 1k LoC I think the logic here is not that hard to review, a lot of this is boilerplate you can skim over.

Below is a bunch of AI-generated description: read at your own discretion.

<details>

Brings the v3 emitter and CI dual-write plumbing from `ct/benchmarks-v3` onto `develop` **without** the v3 server/website code. CI continues to write v2 results to S3 unchanged; v3 ingest is a side channel that no-ops until the deploy track sets `vars.V3_INGEST_URL`.

This is item 2 ("CI ingestion wiring") of the v3 production-readiness checklist in [`benchmarks-website/planning/README.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/README.md). The v3 website itself ships in a separate PR off `ct/benchmarks-v3` once dual-write is verified healthy in production.

### What's included

**Rust emitter (`vortex-bench`)**

- New `vortex-bench/src/v3.rs`: one record per `kind` (`query_measurement`, `compression_time`, `compression_size`, `random_access_time`, `vector_search_run`) plus a serde-tagged `V3Record` enum, JSONL writer, and `insta` snapshot tests. Field shapes match [`02-contracts.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/02-contracts.md).
- `Dataset::v3_dataset_dims()` (default `(name(), None)`) lets Public-BI map to `(public-bi, <subset>)`.
- `compress` and `runner` capture per-iteration timings and provide `SqlBenchmarkRunner::v3_records()`.

**Benchmark binaries**

- `compress-bench`, `datafusion-bench`, `duckdb-bench`, `lance-bench`, `random-access-bench`, `vector-search-bench` all gain `--gh-json-v3 <path>`. Bare records, no envelope. The legacy `-d gh-json -o ...` flow is untouched.

**`bench-orchestrator`**

- `vx-bench run --gh-json-v3 <path>` plumbs the flag through to the underlying benchmark binary.

**`scripts/post-ingest.py`** (Python 3, stdlib only)

- Reads JSONL, fills the `commit` envelope from `git show`, wraps in `{run_meta, commit, records}`, POSTs to `/api/ingest` with `Authorization: Bearer ${INGEST_BEARER_TOKEN}`. Exits non-zero on 4xx/5xx. No retry/spool (deferred).

**Workflows**

- `.github/workflows/bench.yml` and `sql-benchmarks.yml` add `--gh-json-v3 results.v3.jsonl` to the bench runs and a follow-up "Ingest results to v3 server" step.
- New `.github/workflows/v3-commit-metadata.yml` POSTs an empty envelope on every push to `develop` so the v3 `commits` dim stays populated even when no benchmark ran.

### What's NOT included (intentionally)

- Anything under `benchmarks-website/`: the v2 React/Node app stays in production unchanged.
- Workspace member additions for `benchmarks-website/server` and `benchmarks-website/migrate`: those crates don't exist on `develop` yet.
- `.github/workflows/ci.yml` and `publish-bench-server.yml` changes: they reference `vortex-bench-server`, which is also v3-server-only.

## Risk

**Zero.** The v3 ingest step is gated on `vars.V3_INGEST_URL != ''` and `continue-on-error: true`. If the V3 server is down, the variable is unset, or the bearer secret is missing, the workflow no-ops and the v2 path keeps writing to S3 unchanged. The Rust emitter writes JSONL to a local file only; no network egress from the binaries themselves.

## Verify

A CI run on this branch should show the new "Ingest results to v3 server" step running and POSTing successfully to the EC2 host at `vars.V3_INGEST_URL`.

## Follow-up

The v3 website itself (server, migrator, web UI) ships in a separate PR off `ct/benchmarks-v3` once dual-write is verified healthy in production. Outbox-style retry on failed POSTs is also a follow-up, not built until we observe a failure in the wild.

## Test plan

- [x] `cargo build -p vortex-bench`: clean.
- [x] `cargo nextest run -p vortex-bench`: 49/49 pass, including 7 new v3 snapshot tests.
- [x] `cargo build -p compress-bench -p datafusion-bench -p duckdb-bench -p lance-bench -p random-access-bench -p vector-search-bench`: clean.
- [x] All six benchmark binaries print `--gh-json-v3 <GH_JSON_V3>` in `--help`.
- [x] `python3 scripts/post-ingest.py --help`: clean.
- [x] `pytest bench-orchestrator/tests/test_executor.py`: 5/5 pass, including 2 new `gh_json_v3` tests.
- [x] `cargo +nightly fmt --all`: no diff.
- [x] `cargo clippy --all-targets --all-features -p vortex-bench`: clean.
- [x] `cargo clippy --all-targets -p compress-bench -p datafusion-bench -p lance-bench -p random-access-bench -p vector-search-bench`: clean. `duckdb-bench` skipped (transitively triggers a pre-existing `cognitive_complexity` lint in `vortex-duckdb/src/convert/expr.rs:47`, present on `develop` and unrelated to these changes).
- [x] `yamllint --strict -c .yamllint.yaml` on the three changed/new workflow files: clean.
- [x] `./scripts/public-api.sh`: N/A. All touched Rust crates have `publish = false`.
- [ ] Real round-trip against the EC2 host: verifies once this branch triggers a CI bench run with `V3_INGEST_URL` set.

---

_Generated by [Claude Code](https://claude.ai/code/session_0154XbxhgQztmbrQfJ4ZSxVo)_

</details>

---------

Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
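
For readers who want a concrete picture of the ingest path described for `scripts/post-ingest.py`, here is a minimal stdlib-only sketch: parse bare JSONL records, wrap them in a `{run_meta, commit, records}` envelope, and build a bearer-authenticated `POST`. This is not the actual script. The function names, the contents of `run_meta` and `commit`, the record fields other than `kind`, and the localhost URL are all placeholders.

```python
import json
import urllib.request


def read_jsonl(text: str) -> list:
    """Parse bare JSONL records: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def build_request(url: str, token: str, run_meta: dict, commit: dict,
                  records: list) -> urllib.request.Request:
    """Wrap records in an envelope and prepare (but do not send) the POST."""
    envelope = {"run_meta": run_meta, "commit": commit, "records": records}
    return urllib.request.Request(
        url,
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder inputs; real records follow the shapes in 02-contracts.md.
sample_jsonl = (
    '{"kind": "compression_time", "value_ns": 1200}\n'
    '{"kind": "query_measurement", "value_ns": 3400}\n'
)
request = build_request(
    "http://localhost:3000/api/ingest",  # hypothetical ingest URL
    "example-token",                     # would come from INGEST_BEARER_TOKEN
    {"note": "placeholder run metadata"},
    {"sha": "placeholder"},
    read_jsonl(sample_jsonl),
)
print(request.get_method(), request.full_url)
```

Sending is left out so the sketch runs offline. Passing the request to `urllib.request.urlopen` raises `urllib.error.HTTPError` on a 4xx/5xx response, which a script can turn into a non-zero exit code.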


connortsui20 added the changelog/skip label (Do not list PR in the changelog) on Apr 26, 2026

connortsui20 force-pushed the ct/benchmarks-v3 branch 7 times, most recently from 6444858 to 7113963, on April 29, 2026 21:05

connortsui20 force-pushed the ct/benchmarks-v3 branch from 6b76221 to be65b51 on May 4, 2026 14:37

connortsui20 mentioned this pull request May 4, 2026

[claude] feat(bench): emit v3 JSONL records and dual-write to bench server #7780

Merged

12 tasks

connortsui20 force-pushed the ct/benchmarks-v3 branch from 8a1e63c to da42f9c on May 4, 2026 19:55

lwwmanning approved these changes May 4, 2026

connortsui20 force-pushed the ct/benchmarks-v3 branch from 3888ec0 to fbb0b39 on May 4, 2026 22:15

lwwmanning marked this pull request as ready for review May 4, 2026 22:17

connortsui20 enabled auto-merge (squash) May 4, 2026 22:20

connortsui20 merged commit e0a2bdf into develop May 4, 2026
60 of 62 checks passed

connortsui20 deleted the ct/benchmarks-v3 branch May 4, 2026 22:21

connortsui20 merged 1 commit into develop from ct/benchmarks-v3
