Benchmarks Website Version 3 #7643
Merged
Merging this PR will degrade performance by 26.23%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | patched_take_10k_first_chunk_only | 302.4 µs | 272 µs | +11.2% |
| ⚡ | Simulation | take_10k_dispersed | 284.8 µs | 239.7 µs | +18.8% |
| ⚡ | Simulation | patched_take_10k_adversarial | 259 µs | 228.6 µs | +13.32% |
| ⚡ | Simulation | patched_take_10k_dispersed | 316 µs | 285.5 µs | +10.69% |
| ⚡ | Simulation | take_10k_first_chunk_only | 270.8 µs | 225.8 µs | +19.92% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[1024] | 307.8 ns | 366.1 ns | -15.93% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[2048] | 371.4 ns | 429.7 ns | -13.57% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[128] | 246.1 ns | 333.6 ns | -26.23% |
Comparing ct/benchmarks-v3 (fbb0b39) with develop (d3ff1f1)
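The Efficiency column can be sanity-checked against the BASE and HEAD timings. The sketch below assumes the percentage is `BASE / HEAD - 1`, which is an inference from the numbers in the table rather than a documented formula; `efficiency_pct` and `ROWS` are illustrative names. Because the displayed timings are rounded, the recomputed values only agree with the reported ones to within a few hundredths of a percentage point.

```python
# (BASE, HEAD, reported efficiency in %) copied from the table above.
# Both timings in a row share a unit (µs or ns), so the ratio is unitless.
ROWS = {
    "patched_take_10k_first_chunk_only": (302.4, 272.0, 11.2),
    "take_10k_dispersed": (284.8, 239.7, 18.8),
    "patched_take_10k_adversarial": (259.0, 228.6, 13.32),
    "patched_take_10k_dispersed": (316.0, 285.5, 10.69),
    "take_10k_first_chunk_only": (270.8, 225.8, 19.92),
    "bitwise_not_vortex_buffer_mut[1024]": (307.8, 366.1, -15.93),
    "bitwise_not_vortex_buffer_mut[2048]": (371.4, 429.7, -13.57),
    "bitwise_not_vortex_buffer_mut[128]": (246.1, 333.6, -26.23),
}


def efficiency_pct(base: float, head: float) -> float:
    """Assumed formula: positive when HEAD is faster than BASE."""
    return (base / head - 1.0) * 100.0


for name, (base, head, reported) in ROWS.items():
    computed = efficiency_pct(base, head)
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

Under this assumption the headline figure of 26.23% corresponds to the worst row, `bitwise_not_vortex_buffer_mut[128]`.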
Footnotes
- 138 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
connortsui20 added a commit that referenced this pull request on May 4, 2026:
…erver (#7780)

## Summary

Prototype website: http://ec2-18-219-54-101.us-east-2.compute.amazonaws.com:3000/

This is the first step we should make before we cut over to the new benchmarks website on #7643.

This PR allows the CI actions to additionally post data to a server (on my EC2 instance for now). We want to check that this actually works before we start using this for all of our CI.

Note that this does NOT change how the current benchmarks website works, as this just does a few extra things on top of that.

Also for reviewers, even though this looks like 1k LoC I think the logic here is not that hard to review, a lot of this is boilerplate you can skim over.

Below is a bunch of AI-generated description: read at your own discretion.

<details>

Brings the v3 emitter and CI dual-write plumbing from `ct/benchmarks-v3` onto `develop` **without** the v3 server/website code. CI continues to write v2 results to S3 unchanged; v3 ingest is a side channel that no-ops until the deploy track sets `vars.V3_INGEST_URL`.

This is item 2 ("CI ingestion wiring") of the v3 production-readiness checklist in [`benchmarks-website/planning/README.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/README.md). The v3 website itself ships in a separate PR off `ct/benchmarks-v3` once dual-write is verified healthy in production.

### What's included

**Rust emitter (`vortex-bench`)**

- New `vortex-bench/src/v3.rs`: one record per `kind` (`query_measurement`, `compression_time`, `compression_size`, `random_access_time`, `vector_search_run`) plus a serde-tagged `V3Record` enum, JSONL writer, and `insta` snapshot tests. Field shapes match [`02-contracts.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/02-contracts.md).
- `Dataset::v3_dataset_dims()` (default `(name(), None)`) lets Public-BI map to `(public-bi, <subset>)`.
- `compress` and `runner` capture per-iteration timings and provide `SqlBenchmarkRunner::v3_records()`.

**Benchmark binaries**

- `compress-bench`, `datafusion-bench`, `duckdb-bench`, `lance-bench`, `random-access-bench`, `vector-search-bench` all gain `--gh-json-v3 <path>`. Bare records, no envelope. The legacy `-d gh-json -o ...` flow is untouched.

**`bench-orchestrator`**

- `vx-bench run --gh-json-v3 <path>` plumbs the flag through to the underlying benchmark binary.

**`scripts/post-ingest.py`** (Python 3, stdlib only)

- Reads JSONL, fills the `commit` envelope from `git show`, wraps in `{run_meta, commit, records}`, POSTs to `/api/ingest` with `Authorization: Bearer ${INGEST_BEARER_TOKEN}`. Exits non-zero on 4xx/5xx. No retry/spool (deferred).

**Workflows**

- `.github/workflows/bench.yml` and `sql-benchmarks.yml` add `--gh-json-v3 results.v3.jsonl` to the bench runs and a follow-up "Ingest results to v3 server" step.
- New `.github/workflows/v3-commit-metadata.yml` POSTs an empty envelope on every push to `develop` so the v3 `commits` dim stays populated even when no benchmark ran.

### What's NOT included (intentionally)

- Anything under `benchmarks-website/`: the v2 React/Node app stays in production unchanged.
- Workspace member additions for `benchmarks-website/server` and `benchmarks-website/migrate`: those crates don't exist on `develop` yet.
- `.github/workflows/ci.yml` and `publish-bench-server.yml` changes: they reference `vortex-bench-server`, which is also v3-server-only.

## Risk

**Zero.** The v3 ingest step is gated on `vars.V3_INGEST_URL != ''` and `continue-on-error: true`. If the V3 server is down, the variable is unset, or the bearer secret is missing, the workflow no-ops and the v2 path keeps writing to S3 unchanged. The Rust emitter writes JSONL to a local file only; no network egress from the binaries themselves.

## Verify

A CI run on this branch should show the new "Ingest results to v3 server" step running and POSTing successfully to the EC2 host at `vars.V3_INGEST_URL`.

## Follow-up

The v3 website itself (server, migrator, web UI) ships in a separate PR off `ct/benchmarks-v3` once dual-write is verified healthy in production. Outbox-style retry on failed POSTs is also a follow-up, not built until we observe a failure in the wild.

## Test plan

- [x] `cargo build -p vortex-bench`: clean.
- [x] `cargo nextest run -p vortex-bench`: 49/49 pass, including 7 new v3 snapshot tests.
- [x] `cargo build -p compress-bench -p datafusion-bench -p duckdb-bench -p lance-bench -p random-access-bench -p vector-search-bench`: clean.
- [x] All six benchmark binaries print `--gh-json-v3 <GH_JSON_V3>` in `--help`.
- [x] `python3 scripts/post-ingest.py --help`: clean.
- [x] `pytest bench-orchestrator/tests/test_executor.py`: 5/5 pass, including 2 new `gh_json_v3` tests.
- [x] `cargo +nightly fmt --all`: no diff.
- [x] `cargo clippy --all-targets --all-features -p vortex-bench`: clean.
- [x] `cargo clippy --all-targets -p compress-bench -p datafusion-bench -p lance-bench -p random-access-bench -p vector-search-bench`: clean. `duckdb-bench` skipped (transitively triggers a pre-existing `cognitive_complexity` lint in `vortex-duckdb/src/convert/expr.rs:47`, present on `develop` and unrelated to these changes).
- [x] `yamllint --strict -c .yamllint.yaml` on the three changed/new workflow files: clean.
- [x] `./scripts/public-api.sh`: N/A. All touched Rust crates have `publish = false`.
- [ ] Real round-trip against the EC2 host: verifies once this branch triggers a CI bench run with `V3_INGEST_URL` set.

---

_Generated by [Claude Code](https://claude.ai/code/session_0154XbxhgQztmbrQfJ4ZSxVo)_

</details>

---------

Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
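The ingest flow described for `scripts/post-ingest.py` (read JSONL, wrap it in `{run_meta, commit, records}`, POST to `/api/ingest` with a bearer token) can be sketched with the standard library alone. This is not the script from the commit: the helper names, the fields inside `run_meta`, `commit`, and each record, and the example URL and token are all placeholders chosen for illustration. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_envelope(jsonl_text: str, commit: dict, run_meta: dict) -> dict:
    """Parse one JSON record per non-empty line and wrap them in the envelope."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return {"run_meta": run_meta, "commit": commit, "records": records}


def build_request(base_url: str, token: str, envelope: dict) -> urllib.request.Request:
    """Prepare (but do not send) a bearer-authenticated POST to /api/ingest."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/ingest",
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder records: only the `kind` values come from the description above;
# the other field names are hypothetical.
sample_jsonl = (
    '{"kind": "compression_time", "value_ns": 1200}\n'
    '{"kind": "compression_size", "value_bytes": 4096}\n'
)
envelope = build_envelope(
    sample_jsonl,
    commit={"sha": "abc1234"},        # hypothetical commit shape
    run_meta={"runner": "example"},   # hypothetical run metadata
)
request = build_request("http://localhost:3000", "example-token", envelope)
print(request.get_method(), request.full_url, len(envelope["records"]))
```

Sending the request with `urllib.request.urlopen(request)` raises `urllib.error.HTTPError` on 4xx/5xx responses, which is one way a script like this could turn a rejected ingest into a non-zero exit code.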
lwwmanning approved these changes on May 4, 2026
Rewrites the benchmarks website. Replaces the static `data.json.gz` model
with a single Rust server binary that owns a DuckDB database and accepts
`POST /api/ingest` from CI.
Design:
- Single binary: axum + maud (SSR HTML) + DuckDB + Chart.js. All static
assets `include_bytes!`'d.
- 5 fact tables (compression time, query measurement, vector search, RAG,
random access). Backup is a file copy.
- Ingest: versioned JSON envelopes, bearer-token gated.
- Migrator ports v2 history forward via a classifier that routes each
record to a fact table or skips it with a typed reason.
- Charts/groups slug-addressed, URL round-trip with no DB lookup.
- Routes: `/`, `/chart/{slug}`, `/group/{slug}`, `GET /api/chart/{slug}`.
- Deploy: one binary, one DuckDB file, one `INGEST_BEARER_TOKEN`.
Signed-off-by: Claude <noreply@anthropic.com>
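The migrator's classifier is described as routing each v2 record to a fact table or skipping it with a typed reason. A minimal sketch of that pattern follows, written in Python for brevity rather than in the project's Rust. The prefixes, table names, skip reasons, and the `name` field are all hypothetical stand-ins, since the commit message does not spell out the actual rules.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class SkipReason(Enum):
    """Typed reasons for dropping a record (hypothetical set)."""
    MISSING_NAME = "missing_name"
    UNKNOWN_SHAPE = "unknown_shape"


@dataclass(frozen=True)
class Routed:
    table: str


@dataclass(frozen=True)
class Skipped:
    reason: SkipReason


# Hypothetical mapping from a v2 record-name prefix to a fact table.
PREFIX_TO_TABLE = {
    "compress time/": "compression_time",
    "random-access/": "random_access",
    "query/": "query_measurement",
}


def classify(record: dict) -> Union[Routed, Skipped]:
    """Route a record to a fact table, or skip it with an explicit reason."""
    name = record.get("name")
    if not isinstance(name, str) or not name:
        return Skipped(SkipReason.MISSING_NAME)
    for prefix, table in PREFIX_TO_TABLE.items():
        if name.startswith(prefix):
            return Routed(table)
    return Skipped(SkipReason.UNKNOWN_SHAPE)


for rec in ({"name": "compress time/foo"}, {"name": "mystery"}, {}):
    print(rec, "->", classify(rec))
```

Returning a typed skip reason instead of silently dropping a record makes it possible to count, per reason, how much v2 history was not carried forward.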
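Summary
Rewrites the benchmarks website. Replaces the static `data.json.gz` model with a single Rust server binary that owns a DuckDB database and accepts `POST /api/ingest` from CI.

Design
- Single binary: axum + maud (SSR HTML) + DuckDB + Chart.js. All static assets `include_bytes!`'d.
- 5 fact tables (compression time, query measurement, vector search, RAG, random access). Backup is a file copy.
- Ingest: versioned JSON envelopes, bearer-token gated.
- Migrator ports v2 history forward via a classifier that routes each record to a fact table or skips it with a typed reason.
- Charts/groups slug-addressed, URL round-trip with no DB lookup.
- Routes: `/`, `/chart/{slug}`, `/group/{slug}`, `GET /api/chart/{slug}`.
- Deploy: one binary, one DuckDB file, one `INGEST_BEARER_TOKEN`.

UI/UX is still TBD; the relational backend opens up options we didn't have before.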
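Slug addressing with a URL round trip and no DB lookup means a route such as `/chart/{slug}` can recover the chart's identity from the slug text alone. One way to get that property is a reversible encoding, sketched below in Python. The `ChartKey` fields, the `:` separator, and the percent-encoding scheme are assumptions made for illustration; the PR does not describe the real slug format.

```python
from typing import NamedTuple
from urllib.parse import quote, unquote


class ChartKey(NamedTuple):
    """Hypothetical identity of a chart; the real key fields are not given."""
    kind: str
    dataset: str
    metric: str


def encode_slug(key: ChartKey) -> str:
    # quote(..., safe="") percent-encodes ":" and "%", so an encoded field can
    # never contain the separator and splitting on ":" is unambiguous.
    return ":".join(quote(part, safe="") for part in key)


def decode_slug(slug: str) -> ChartKey:
    parts = slug.split(":")
    if len(parts) != 3:
        raise ValueError(f"malformed slug: {slug!r}")
    return ChartKey(*(unquote(part) for part in parts))


key = ChartKey("compression_time", "public-bi", "ratio")
slug = encode_slug(key)
print(slug)
print(decode_slug(slug) == key)
```

Because decoding is a pure function of the slug, a handler could validate and parse the URL before touching the database, and any chart URL stays shareable without a slug-to-id table.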