[SPARK-57069][INFRA] Share SBT precompile artifact with docker/k8s integration test CI jobs#56110
CI performance: before vs after

Comparing per-job wall time on real CI runs (n=2 BEFORE, n=2 AFTER):

Samples:

* The AFTER-2 k8s total looks ~10m worse, but step-level breakdown shows ~7m of it is unrelated

Reading the result

Net per scheduled run

Docker savings (~16m) are real and consistent. K8s savings exist but are not measurable above CI variance. Even where the wall-clock impact on k8s is small, the change is no-cost (precompile is already running for pyspark / sparkr / build), and the silent fallback means worst case is degraded to the pre-PR behavior.
…t CI jobs

Generated-by: Claude Code (Opus 4.7)
…n precompile
Extends the precompile invocation to also build the kubernetes-integration-tests
submodule and the SparkR Scala bindings. With both included, the k8s
integration test job's SBT call ('build/sbt -Phadoop-3 -Psparkr -Pkubernetes
-Pvolcano -Pkubernetes-integration-tests ... kubernetes-integration-tests/test')
sees compiled classes for every active profile in the extracted target/ and
only runs the test phase rather than compiling those modules first.
Generated-by: Claude Code (Opus 4.7)
The previous followup added -Psparkr to the precompile SBT invocation, but -Psparkr activates 'core/buildRPackage' which shells out to R's install-dev.sh to build the SparkR R package. The precompile runner does not have R installed, so the task fails with 'Nonzero exit value: 1' (see PR run 26493097995).

Keeping the runner R-free is cheaper than installing R for every consumer of the precompile artifact, since the only saving is ~30-60s of Scala compile on the small SparkR module, and the consumers that activate -Psparkr (sparkr, k8s-integration-tests) install R themselves and rebuild that module incrementally on top of the extracted target/.

-Pkubernetes-integration-tests stays in the precompile.

Generated-by: Claude Code (Opus 4.7)
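One way to see which profiles a consumer job activates beyond the precompile set is a set difference over the `-P` flags. The sketch below is illustrative only: the precompile profile list is the one quoted in the PR description, the k8s list is the one quoted in the earlier commit message (its elided `...` part is left out), and the function and variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every -P profile in the consumer invocation ($2)
# that is not present in the precompile invocation ($1).
missing_profiles() {
  local precompile="$1" consumer="$2" p
  for p in $consumer; do
    case "$p" in
      -P*) ;;              # only look at profile flags
      *) continue ;;
    esac
    case " $precompile " in
      *" $p "*) ;;         # profile already covered by the precompile
      *) echo "$p" ;;      # profile the consumer must build itself
    esac
  done
}

# Profiles from the precompile invocation quoted in the PR description.
PRECOMPILE="-Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive -Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver -Pdocker-integration-tests -Pkubernetes-integration-tests -Pvolcano"
# Profiles quoted for the k8s job (the elided part of that command is omitted).
K8S_JOB="-Phadoop-3 -Psparkr -Pkubernetes -Pvolcano -Pkubernetes-integration-tests"

missing_profiles "$PRECOMPILE" "$K8S_JOB"
```

With these two lists the only profile reported is `-Psparkr`, which matches the statement that the k8s job still compiles the SparkR Scala bindings on top of the extracted `target/`.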
…tegration test CI jobs

### What changes were proposed in this pull request?

This PR extends the SBT precompile-sharing pattern (parent: [SPARK-56830](https://issues.apache.org/jira/browse/SPARK-56830); prior sub-tasks: [SPARK-56768](https://issues.apache.org/jira/browse/SPARK-56768) pyspark, [SPARK-56831](https://issues.apache.org/jira/browse/SPARK-56831) sparkr, [SPARK-56943](https://issues.apache.org/jira/browse/SPARK-56943) JVM build) to the two remaining SBT-compiling jobs in `.github/workflows/build_and_test.yml` that still run their own full Spark compile:

- `docker-integration-tests`
- `k8s-integration-tests`

Concretely:

- The existing `precompile` job's `if:` gate is extended to also fire when `docker-integration-tests == 'true'` or `k8s-integration-tests == 'true'` in the precondition output, so the artifact is available whenever either job needs it.
- The precompile SBT invocation adds `-Pkubernetes-integration-tests`, so the integration-tests submodule's `target/` ends up in the shared artifact and the k8s job doesn't have to recompile it.
- `docker-integration-tests`:
  - `needs: precondition` -> `needs: [precondition, precompile]`
  - `if:` extended with `(!cancelled()) &&` so the job still runs if precompile is cancelled.
  - Adds "Download precompiled artifact" + "Extract precompiled artifact" steps between Java setup and `Run tests`, with graceful fallback (`continue-on-error: true`).
  - `Run tests` exports `SKIP_SCALA_BUILD=true` when extraction succeeded; `dev/run-tests.py` already honors this flag and skips `build_apache_spark` + `build_spark_assembly_sbt`.
- `k8s-integration-tests`:
  - Same `needs:` and `if:` change.
  - Adds the same Download/Extract steps after Java setup.
  - The actual test runs via a direct `build/sbt ... "kubernetes-integration-tests/test"` call rather than `dev/run-tests.py`, so no `SKIP_SCALA_BUILD` is set. SBT sees the extracted `target/` and skips compilation of the pre-built modules (Spark Core, SQL, kubernetes, integration-tests, ...); only the small SparkR Scala bindings still compile (the precompile doesn't include `-Psparkr` because that profile activates `core/buildRPackage`, which shells out to R, and the precompile runner doesn't have R installed).

### Optional: graceful fallback if precompile fails

Same pattern as the prior sub-tasks:

- `precompile` keeps `continue-on-error: true`.
- Both consumers' "Download precompiled artifact" step is gated on `needs.precompile.result == 'success'` and has `continue-on-error: true`.
- "Extract precompiled artifact" is gated on the download succeeding and has `continue-on-error: true`.
- For docker, `SKIP_SCALA_BUILD=true` is exported only when `steps.extract-precompiled.outcome == 'success'`; otherwise `dev/run-tests.py` runs the original local SBT build.
- For k8s, if extraction fails, SBT compiles from scratch as before.

Worst case is degraded to the pre-PR behavior, not a workflow failure.

### Profile coverage

The precompile job runs:

```
./build/sbt -Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive \
  -Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver \
  -Pdocker-integration-tests -Pkubernetes-integration-tests -Pvolcano \
  Test/package streaming-kinesis-asl-assembly/assembly connect/assembly assembly/package
```

- `docker-integration-tests`: profile is in the precompile invocation; the module's `target/` is pre-built, so `dev/run-tests --modules docker-integration-tests` only runs the test phase.
- `k8s-integration-tests`: `-Pkubernetes` and `-Pkubernetes-integration-tests` are both in the precompile, so the integration-tests submodule is pre-built. The job's direct SBT call adds `-Psparkr`, which triggers compile of the small SparkR Scala bindings on top of the reused `target/`. Net work in this job drops from "compile all of Spark + integration tests + sparkr" to "compile only the sparkr module".

### Why are the changes needed?

Today every scheduled / dispatched run of `build_and_test.yml` that requires `docker-integration-tests` or `k8s-integration-tests` re-runs the same SBT compile that `precompile` already produced for `pyspark` / `sparkr` / `build`. Wiring these two consumers to the existing artifact removes that duplicate work for free (precompile is already running).

### Does this PR introduce _any_ user-facing change?

No. CI infrastructure change only.

### How was this patch tested?

The change is exercised by the CI run of this PR itself. The Download/Extract steps log artifact size; the Run tests step prints `Reusing precompiled artifact, skipping local SBT build.` for the docker job when the fast path is taken. If the precompile job is forced to fail (or its artifact is missing), both consumers fall back to the original local SBT build.

Measured CI timings before vs after are posted as a comment on this PR.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

Closes #56110 from zhengruifeng/share-precompile-integration-tests-dev5.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
(cherry picked from commit b96b633)
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
thanks, merged to master/4.x/4.2
What changes were proposed in this pull request?
This PR extends the SBT precompile-sharing pattern (parent: SPARK-56830; prior sub-tasks: SPARK-56768 pyspark, SPARK-56831 sparkr, SPARK-56943 JVM build) to the two remaining SBT-compiling jobs in `.github/workflows/build_and_test.yml` that still run their own full Spark compile:

- `docker-integration-tests`
- `k8s-integration-tests`

Concretely:

- The existing `precompile` job's `if:` gate is extended to also fire when `docker-integration-tests == 'true'` or `k8s-integration-tests == 'true'` in the precondition output, so the artifact is available whenever either job needs it.
- The precompile SBT invocation adds `-Pkubernetes-integration-tests`, so the integration-tests submodule's `target/` ends up in the shared artifact and the k8s job doesn't have to recompile it.
- `docker-integration-tests`:
  - `needs: precondition` -> `needs: [precondition, precompile]`
  - `if:` extended with `(!cancelled()) &&` so the job still runs if precompile is cancelled.
  - Adds "Download precompiled artifact" + "Extract precompiled artifact" steps between Java setup and `Run tests`, with graceful fallback (`continue-on-error: true`).
  - `Run tests` exports `SKIP_SCALA_BUILD=true` when extraction succeeded; `dev/run-tests.py` already honors this flag and skips `build_apache_spark` + `build_spark_assembly_sbt`.
- `k8s-integration-tests`:
  - Same `needs:` and `if:` change.
  - Adds the same Download/Extract steps after Java setup.
  - The actual test runs via a direct `build/sbt ... "kubernetes-integration-tests/test"` call rather than `dev/run-tests.py`, so no `SKIP_SCALA_BUILD` is set. SBT sees the extracted `target/` and skips compilation of the pre-built modules (Spark Core, SQL, kubernetes, integration-tests, ...); only the small SparkR Scala bindings still compile (the precompile doesn't include `-Psparkr` because that profile activates `core/buildRPackage`, which shells out to R, and the precompile runner doesn't have R installed).

Optional: graceful fallback if precompile fails
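Same pattern as the prior sub-tasks:

- `precompile` keeps `continue-on-error: true`.
- Both consumers' "Download precompiled artifact" step is gated on `needs.precompile.result == 'success'` and has `continue-on-error: true`.
- "Extract precompiled artifact" is gated on the download succeeding and has `continue-on-error: true`.
- For docker, `SKIP_SCALA_BUILD=true` is exported only when `steps.extract-precompiled.outcome == 'success'`; otherwise `dev/run-tests.py` runs the original local SBT build.
- For k8s, if extraction fails, SBT compiles from scratch as before.

Worst case is degraded to the pre-PR behavior, not a workflow failure.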
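The gating chain described here can be modeled as a small shell function. This is only a sketch for reasoning about the fallback: in the workflow the gates are `if:` conditions on the steps, and the function name, argument order, and the `reuse` / `local-build` labels below are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical model of the fallback chain.
# Args: precompile job result, download step outcome, extract step outcome.
# Prints "reuse" only if every gate passed; any other combination falls
# back to the pre-PR behavior of compiling locally.
resolve_build_mode() {
  local precompile_result="$1" download_outcome="$2" extract_outcome="$3"
  if [ "$precompile_result" = "success" ] \
     && [ "$download_outcome" = "success" ] \
     && [ "$extract_outcome" = "success" ]; then
    echo "reuse"
  else
    echo "local-build"
  fi
}
```

For instance, `resolve_build_mode success success failure` prints `local-build`: a failed extraction degrades to a local compile rather than failing the job.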
Profile coverage
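The precompile job runs `./build/sbt` with the profiles `-Phadoop-3 -Pyarn -Pspark-ganglia-lgpl -Phadoop-cloud -Phive -Pkubernetes -Pjvm-profiler -Pkinesis-asl -Phive-thriftserver -Pdocker-integration-tests -Pkubernetes-integration-tests -Pvolcano` and the targets `Test/package streaming-kinesis-asl-assembly/assembly connect/assembly assembly/package`.

- `docker-integration-tests`: profile is in the precompile invocation; the module's `target/` is pre-built, so `dev/run-tests --modules docker-integration-tests` only runs the test phase.
- `k8s-integration-tests`: `-Pkubernetes` and `-Pkubernetes-integration-tests` are both in the precompile, so the integration-tests submodule is pre-built. The job's direct SBT call adds `-Psparkr`, which triggers compile of the small SparkR Scala bindings on top of the reused `target/`. Net work in this job drops from "compile all of Spark + integration tests + sparkr" to "compile only the sparkr module".

Why are the changes needed?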
Today every scheduled / dispatched run of `build_and_test.yml` that requires `docker-integration-tests` or `k8s-integration-tests` re-runs the same SBT compile that `precompile` already produced for `pyspark` / `sparkr` / `build`. Wiring these two consumers to the existing artifact removes that duplicate work for free (precompile is already running).

Does this PR introduce any user-facing change?
No. CI infrastructure change only.
How was this patch tested?
The change is exercised by the CI run of this PR itself. The Download/Extract steps log artifact size; the Run tests step prints `Reusing precompiled artifact, skipping local SBT build.` for the docker job when the fast path is taken. If the precompile job is forced to fail (or its artifact is missing), both consumers fall back to the original local SBT build.

Measured CI timings before vs after are posted as a comment on this PR.
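The fast-path check in the docker job's Run tests step might look roughly like the following. This is a sketch, not the workflow's actual script: `EXTRACT_OUTCOME` is a hypothetical variable standing in for `steps.extract-precompiled.outcome`, and the function name is made up for illustration. Only the `SKIP_SCALA_BUILD` flag and the log message come from the PR description.

```shell
#!/usr/bin/env bash
# Hypothetical prologue for the Run tests step.
# EXTRACT_OUTCOME stands in for steps.extract-precompiled.outcome.
export_skip_flag() {
  if [ "${EXTRACT_OUTCOME:-}" = "success" ]; then
    # Fast path: tell dev/run-tests.py to skip the local SBT build.
    export SKIP_SCALA_BUILD=true
    echo "Reusing precompiled artifact, skipping local SBT build."
  else
    # Fallback: leave the flag unset so the original local build runs.
    unset SKIP_SCALA_BUILD
  fi
}
```

Grepping the job log for the message is then a quick way to confirm which path a given run took.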
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.7)