diff --git a/.agents/skills/debug-openshell-cluster/SKILL.md b/.agents/skills/debug-openshell-cluster/SKILL.md
index aeaa503f7..9c171bb97 100644
--- a/.agents/skills/debug-openshell-cluster/SKILL.md
+++ b/.agents/skills/debug-openshell-cluster/SKILL.md
@@ -101,7 +101,7 @@ Common findings:
 - Sandbox image missing or pull denied: verify image reference and registry credentials.
 - Docker driver cannot initialize because it cannot find `openshell-sandbox`: verify `OPENSHELL_DOCKER_SUPERVISOR_BIN`, the sibling binary next to `openshell-gateway`, or the configured supervisor image contains `/openshell-sandbox`.
 - Sandbox never registers: check gateway logs and supervisor callback endpoint.
-- Supervisor image exits before printing `openshell-sandbox --version`: the image should be the scratch supervisor image from `deploy/docker/Dockerfile.supervisor` and must contain a static executable at `/openshell-sandbox`.
+- Supervisor image exits before printing `openshell-sandbox --version`: the image should be the scratch supervisor image from `deploy/container/Dockerfile.supervisor` and must contain a static executable at `/openshell-sandbox`.
 - `mise run e2e:docker:gpu` fails with `docker info --format json did not report any discovered NVIDIA CDI GPU devices`: Docker may report `CDISpecDirs` while still having no generated NVIDIA CDI specs. Verify `.DiscoveredDevices` contains entries such as `nvidia.com/gpu=all`, verify `/etc/cdi` or `/var/run/cdi` contains a generated NVIDIA spec, and check that `nvidia-cdi-refresh.service` and `nvidia-cdi-refresh.path` from NVIDIA Container Toolkit are enabled and healthy. The service is a one-shot unit, so `inactive (dead)` can be normal after a successful run; use `systemctl status` and `journalctl` to distinguish success from a skipped or failed refresh. NVIDIA recommends enabling the path and service units, and restarting `nvidia-cdi-refresh.service` to regenerate missing or stale CDI specs. If specs are generated but Docker still reports no discovered devices, restart Docker or reload the daemon and re-check `docker info`.
 
 For source checkout development, restart the local gateway with:
@@ -170,11 +170,11 @@ kubectl -n openshell get statefulset openshell -o jsonpath="{.spec.template.spec
 helm -n openshell get values openshell | grep -E 'repository|tag|supervisorImage'
 ```
 
-The gateway image built from `deploy/docker/Dockerfile.gateway` and the scratch supervisor image built from `deploy/docker/Dockerfile.supervisor` should use the same build tag in branch and E2E deploys. A stale supervisor image can make sandbox behavior lag behind gateway policy or proto changes.
+The gateway image built from `deploy/container/Dockerfile.gateway` and the scratch supervisor image built from `deploy/container/Dockerfile.supervisor` should use the same build tag in branch and E2E deploys. A stale supervisor image can make sandbox behavior lag behind gateway policy or proto changes.
 
 For local/external pull mode (the default local path via `mise run cluster`), local images are tagged to the configured local registry base, pushed to that registry, and pulled by k3s via the `registries.yaml` mirror endpoint. The `cluster` task pushes prebuilt local tags (`openshell/*:dev`, falling back to `localhost:5000/openshell/*:dev` or `127.0.0.1:5000/openshell/*:dev`).
 
-Gateway image builds stage a partial Rust workspace from `deploy/docker/Dockerfile.images`. If cargo fails with a missing manifest under `/build/crates/...`, or an imported symbol exists locally but is missing in the image build, verify that every current gateway dependency crate, including `openshell-driver-docker`, `openshell-driver-kubernetes`, and `openshell-ocsf`, is copied into the staged workspace there.
+Gateway image builds stage a partial Rust workspace from `deploy/container/Dockerfile.images`. If cargo fails with a missing manifest under `/build/crates/...`, or an imported symbol exists locally but is missing in the image build, verify that every current gateway dependency crate, including `openshell-driver-docker`, `openshell-driver-kubernetes`, and `openshell-ocsf`, is copied into the staged workspace there.
 
 For plaintext local evaluation, confirm the chart has:
 
diff --git a/.claude/agent-memory/arch-doc-writer/MEMORY.md b/.claude/agent-memory/arch-doc-writer/MEMORY.md
index 1fce46001..7516707e1 100644
--- a/.claude/agent-memory/arch-doc-writer/MEMORY.md
+++ b/.claude/agent-memory/arch-doc-writer/MEMORY.md
@@ -62,7 +62,7 @@
 - Four runtime images: sandbox (5 stages), gateway (2 stages), cluster (k3s base), pki-job (Alpine)
 - Two build-only images: python-wheels (Linux multi-arch), python-wheels-macos (osxcross cross-compile)
 - CI image: Dockerfile.ci (Ubuntu 24.04, pre-installs docker/buildx/aws/kubectl/helm/mise/uv/sccache/socat)
-- Cross-compilation: `deploy/docker/cross-build.sh` shared by sandbox + gateway Dockerfiles
+- Cross-compilation: `deploy/container/cross-build.sh` shared by sandbox + gateway Dockerfiles
 - Sandbox image has coding-agents stage: Claude CLI (native installer), OpenCode, Codex (npm)
 - Helm chart deploys a StatefulSet (NOT Deployment), PVC 1Gi at /var/openshell
 - Cluster image does NOT bundle image tarballs -- components pulled at runtime from distribution registry
diff --git a/.github/workflows/ci-image.yml b/.github/workflows/ci-image.yml
index d8d3095f0..b1af8fc4f 100644
--- a/.github/workflows/ci-image.yml
+++ b/.github/workflows/ci-image.yml
@@ -4,7 +4,7 @@ on:
   push:
     branches: [main]
     paths:
-      - 'deploy/docker/Dockerfile.ci'
+      - 'deploy/container/Dockerfile.ci'
       - 'mise.toml'
       - 'mise.lock'
       - 'tasks/**'
@@ -72,7 +72,7 @@ jobs:
             --cache-to "type=gha,mode=max,scope=ci-image-${{ matrix.arch }}" \
             --push \
             -t "$ARCH_IMAGE" \
-            -f deploy/docker/Dockerfile.ci \
+            -f deploy/container/Dockerfile.ci \
             .
 
       - name: Smoke check CI image
diff --git a/.github/workflows/docker-build.yml b/.github/workflows/docker-build.yml
index 4ff8f501d..eb28c8438 100644
--- a/.github/workflows/docker-build.yml
+++ b/.github/workflows/docker-build.yml
@@ -222,7 +222,7 @@ jobs:
           set -euo pipefail
           binary="${{ needs.resolve.outputs.binary_name }}"
           download_dir="prebuilt-rust-binary"
-          stage="deploy/docker/.build/prebuilt-binaries/${{ matrix.arch }}"
+          stage="deploy/container/.build/prebuilt-binaries/${{ matrix.arch }}"
           found="$(find "$download_dir" -type f -name "$binary" -print -quit)"
           if [[ -z "$found" ]]; then
             echo "missing downloaded artifact file: $binary" >&2
@@ -238,7 +238,7 @@ jobs:
           DOCKER_BUILDER: openshell
         run: |
           set -euo pipefail
-          mise exec -- tasks/scripts/docker-build-image.sh "${{ inputs.component }}" \
+          mise exec -- tasks/scripts/container-build-image.sh "${{ inputs.component }}" \
             --cache-from "type=gha,scope=${{ inputs.component }}-${{ matrix.arch }}" \
             --cache-to "type=gha,mode=max,scope=${{ inputs.component }}-${{ matrix.arch }}"
 
diff --git a/.github/workflows/driver-vm-macos.yml b/.github/workflows/driver-vm-macos.yml
index a563c972c..ee4715c20 100644
--- a/.github/workflows/driver-vm-macos.yml
+++ b/.github/workflows/driver-vm-macos.yml
@@ -193,7 +193,7 @@ jobs:
         run: |
           set -euo pipefail
           docker buildx build \
-            --file deploy/docker/Dockerfile.driver-vm-macos \
+            --file deploy/container/Dockerfile.driver-vm-macos \
             --build-arg OPENSHELL_CARGO_VERSION="${{ inputs['cargo-version'] }}" \
             --build-arg OPENSHELL_IMAGE_TAG="${{ inputs['image-tag'] }}" \
             --build-arg CARGO_TARGET_CACHE_SCOPE="${{ github.sha }}" \
diff --git a/.github/workflows/release-dev.yml b/.github/workflows/release-dev.yml
index 841d7f6ab..e32de4302 100644
--- a/.github/workflows/release-dev.yml
+++ b/.github/workflows/release-dev.yml
@@ -352,7 +352,7 @@ jobs:
         run: |
           set -euo pipefail
           docker buildx build \
-            --file deploy/docker/Dockerfile.cli-macos \
+            --file deploy/container/Dockerfile.cli-macos \
             --build-arg OPENSHELL_CARGO_VERSION="${{ needs.compute-versions.outputs.cargo_version }}" \
             --build-arg OPENSHELL_IMAGE_TAG=dev \
             --build-arg CARGO_TARGET_CACHE_SCOPE="${{ github.sha }}" \
@@ -512,7 +512,7 @@ jobs:
         run: |
           set -euo pipefail
           docker buildx build \
-            --file deploy/docker/Dockerfile.gateway-macos \
+            --file deploy/container/Dockerfile.gateway-macos \
             --build-arg OPENSHELL_CARGO_VERSION="${{ needs.compute-versions.outputs.cargo_version }}" \
             --build-arg OPENSHELL_IMAGE_TAG=dev \
             --build-arg CARGO_TARGET_CACHE_SCOPE="${{ github.sha }}" \
diff --git a/.github/workflows/release-tag.yml b/.github/workflows/release-tag.yml
index ad546596b..a4bb4eb43 100644
--- a/.github/workflows/release-tag.yml
+++ b/.github/workflows/release-tag.yml
@@ -385,7 +385,7 @@ jobs:
         run: |
           set -euo pipefail
           docker buildx build \
-            --file deploy/docker/Dockerfile.cli-macos \
+            --file deploy/container/Dockerfile.cli-macos \
             --build-arg OPENSHELL_CARGO_VERSION="${{ needs.compute-versions.outputs.cargo_version }}" \
             --build-arg OPENSHELL_IMAGE_TAG="${{ needs.compute-versions.outputs.semver }}" \
             --build-arg CARGO_TARGET_CACHE_SCOPE="${{ github.sha }}" \
@@ -631,7 +631,7 @@ jobs:
         run: |
           set -euo pipefail
           docker buildx build \
-            --file deploy/docker/Dockerfile.gateway-macos \
+            --file deploy/container/Dockerfile.gateway-macos \
             --build-arg OPENSHELL_CARGO_VERSION="${{ needs.compute-versions.outputs.cargo_version }}" \
             --build-arg OPENSHELL_IMAGE_TAG="${{ needs.compute-versions.outputs.semver }}" \
             --build-arg CARGO_TARGET_CACHE_SCOPE="${{ github.sha }}" \
diff --git a/.gitignore b/.gitignore
index fb8679fa7..804324a99 100644
--- a/.gitignore
+++ b/.gitignore
@@ -185,7 +185,7 @@ _build/
 rootfs/
 
 # Docker build artifacts (image tarballs, packaged helm charts)
-deploy/docker/.build/
+deploy/container/.build/
 
 # Helm subchart tarballs (regenerated by `helm dependency build`)
 deploy/helm/openshell/charts/
diff --git a/AGENTS.md b/AGENTS.md
index 2d5f293fc..3de124d74 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -178,9 +178,9 @@ ocsf_emit!(event);
 
 - Always use `uv` for Python commands (e.g., `uv pip install`, `uv run`, `uv venv`)
 
-## Docker
+## Containers
 
-- Always prefer `mise` commands over direct docker builds (e.g., `mise run docker:build` instead of `docker build`)
+- Always prefer `mise` commands over direct container builds (e.g., `mise run build:container` instead of `docker build` or `podman build`)
 
 ## Cluster Infrastructure Changes
 
diff --git a/TESTING.md b/TESTING.md
index 7bcf2d203..5460d1b90 100644
--- a/TESTING.md
+++ b/TESTING.md
@@ -154,7 +154,7 @@ Suites:
 GPU device-selection tests compare OpenShell sandboxes against a plain Docker or
 Podman container that requests `--device nvidia.com/gpu=all`. The probe image
 defaults to the image used by the `gateway` stage in
-`deploy/docker/Dockerfile.images`; set `OPENSHELL_E2E_GPU_PROBE_IMAGE` to
+`deploy/container/Dockerfile.images`; set `OPENSHELL_E2E_GPU_PROBE_IMAGE` to
 override it. Per-device checks run only for NVIDIA CDI device IDs reported by
 the runtime's discovered devices list, so WSL2 hosts that expose only
 `nvidia.com/gpu=all` skip the index-based cases.
diff --git a/architecture/build.md b/architecture/build.md
index 200be8b1e..d85c026a0 100644
--- a/architecture/build.md
+++ b/architecture/build.md
@@ -12,8 +12,7 @@ OpenShell builds these main artifacts:
 |---|---|
 | Gateway binary | `crates/openshell-server` |
 | CLI package and Python SDK | `python/openshell` plus Rust binaries where packaged |
-| Gateway container image | `deploy/docker/Dockerfile.gateway` |
-| Supervisor container image | `deploy/docker/Dockerfile.supervisor` |
+| Gateway and supervisor container images | `deploy/container/Dockerfile.images` |
 | Helm chart | `deploy/helm/openshell` |
 | VM driver/runtime assets | `crates/openshell-driver-vm` |
 | Published docs site | `docs/` rendered by Fern config in `fern/` |
@@ -31,12 +30,10 @@ glibc 2.31 floor.
 
 ## Container Builds
 
-The Docker image pipeline is a two-step flow: build the Rust binary natively
-for the target architecture, then assemble the container image from the
-prebuilt binary. The gateway image is built from `deploy/docker/Dockerfile.gateway`
-and the supervisor image from `deploy/docker/Dockerfile.supervisor`. Neither
-Dockerfile compiles Rust — both copy a staged binary out of
-`deploy/docker/.build/prebuilt-binaries//` into the final image.
+The container image pipeline stages prebuilt Rust binaries, then builds container
+images from `deploy/container/Dockerfile.images`. CI builds native artifacts on the
+target architecture, stages them under `deploy/container/.build/`, and then uses
+Buildx to publish per-architecture images and multi-architecture tags.
 
 Binary staging is driven by `tasks/scripts/stage-prebuilt-binaries.sh`. Gateway
 binaries use `cargo zigbuild` with GNU targets pinned to glibc 2.31, including
@@ -59,7 +56,6 @@ Runtime layout:
 Static linkage is required because the image is mounted/extracted into sandbox
 environments (Docker extraction, Podman image volumes, Kubernetes init-container
 copy-self) and cannot rely on a dynamic loader.
-
 Gateway image builds bake the corresponding supervisor image tag into the
 gateway binary so Docker sandboxes do not depend on `:latest` by default.
 Package formulas also pin Docker supervisor extraction to the matching release
diff --git a/crates/openshell-driver-podman/README.md b/crates/openshell-driver-podman/README.md
index 77b42ba37..a03dcfcbf 100644
--- a/crates/openshell-driver-podman/README.md
+++ b/crates/openshell-driver-podman/README.md
@@ -86,8 +86,8 @@ sequenceDiagram
     C->>C: entrypoint: /opt/openshell/bin/openshell-sandbox
 ```
 
-The supervisor image from `deploy/docker/Dockerfile.supervisor` copies the static
-`openshell-sandbox` binary to `/openshell-sandbox`.
+The `supervisor` target in `deploy/container/Dockerfile.images` copies the
+`openshell-sandbox` binary to `/openshell-sandbox` in the supervisor image.
 Mounting that image at `/opt/openshell/bin` makes the binary available as
 `/opt/openshell/bin/openshell-sandbox`.
 
@@ -346,4 +346,4 @@ matter compared to cluster or rootful runtimes:
   netns, proxy, and relay behavior shared by all drivers.
 - Container engine abstraction: `tasks/scripts/container-engine.sh` for
   build/deploy support across Docker and Podman.
-- Supervisor image build: `deploy/docker/Dockerfile.supervisor`.
+- Supervisor image build: `deploy/container/Dockerfile.images`.
diff --git a/deploy/docker/.dockerignore b/deploy/container/.dockerignore
similarity index 100%
rename from deploy/docker/.dockerignore
rename to deploy/container/.dockerignore
diff --git a/deploy/docker/Dockerfile.ci b/deploy/container/Dockerfile.ci
similarity index 100%
rename from deploy/docker/Dockerfile.ci
rename to deploy/container/Dockerfile.ci
diff --git a/deploy/docker/Dockerfile.cli-macos b/deploy/container/Dockerfile.cli-macos
similarity index 98%
rename from deploy/docker/Dockerfile.cli-macos
rename to deploy/container/Dockerfile.cli-macos
index 7dce3a63d..f4118ecdc 100644
--- a/deploy/docker/Dockerfile.cli-macos
+++ b/deploy/container/Dockerfile.cli-macos
@@ -8,7 +8,7 @@
 # wheel wrapping.
 #
 # Usage:
-#   docker buildx build -f deploy/docker/Dockerfile.cli-macos \
+#   docker buildx build -f deploy/container/Dockerfile.cli-macos \
 #     --build-arg OPENSHELL_CARGO_VERSION=0.6.0 \
 #     --output type=local,dest=out/ .
 
diff --git a/deploy/docker/Dockerfile.driver-vm-macos b/deploy/container/Dockerfile.driver-vm-macos
similarity index 98%
rename from deploy/docker/Dockerfile.driver-vm-macos
rename to deploy/container/Dockerfile.driver-vm-macos
index 58317a52d..06b41127a 100644
--- a/deploy/docker/Dockerfile.driver-vm-macos
+++ b/deploy/container/Dockerfile.driver-vm-macos
@@ -13,7 +13,7 @@
 # include_bytes!().
 #
 # Usage:
-#   docker buildx build -f deploy/docker/Dockerfile.driver-vm-macos \
+#   docker buildx build -f deploy/container/Dockerfile.driver-vm-macos \
 #     --build-arg OPENSHELL_CARGO_VERSION=0.6.0 \
 #     --build-context vm-runtime-compressed=/path/to/compressed-dir \
 #     --output type=local,dest=out/ .
diff --git a/deploy/docker/Dockerfile.gateway b/deploy/container/Dockerfile.gateway
similarity index 100%
rename from deploy/docker/Dockerfile.gateway
rename to deploy/container/Dockerfile.gateway
diff --git a/deploy/docker/Dockerfile.gateway-macos b/deploy/container/Dockerfile.gateway-macos
similarity index 100%
rename from deploy/docker/Dockerfile.gateway-macos
rename to deploy/container/Dockerfile.gateway-macos
diff --git a/deploy/container/Dockerfile.images b/deploy/container/Dockerfile.images
new file mode 100644
index 000000000..0ced71b29
--- /dev/null
+++ b/deploy/container/Dockerfile.images
@@ -0,0 +1,122 @@
+# syntax=docker/dockerfile:1.4
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+# Shared OpenShell image build graph.
+#
+# Targets:
+#   gateway      Final gateway image
+#   supervisor   Final supervisor image (Ubuntu base, supervisor binary)
+#
+# Rust binaries are built natively before the image build and staged at:
+#   deploy/container/.build/prebuilt-binaries//openshell-{gateway,sandbox}
+#
+# For local dev (Skaffold), pass --build-arg BUILD_FROM_SOURCE=1 to compile
+# binaries inside Docker instead. BuildKit only executes the selected binary
+# staging stage, so missing prebuilt files do not cause a build failure.
+
+# Controls binary source: 0 = prebuilt (release), 1 = compile in Docker (local dev).
+# Must be declared here (global scope) so it can be used in FROM instructions below.
+ARG BUILD_FROM_SOURCE=0
+
+# ---------------------------------------------------------------------------
+# Optional in-Docker Rust build (BUILD_FROM_SOURCE=1, local dev only)
+# ---------------------------------------------------------------------------
+FROM rust:1.95.0-slim-bookworm AS rust-builder
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    cmake \
+    pkg-config \
+    libssl-dev \
+    ca-certificates \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /build
+
+COPY Cargo.toml Cargo.lock ./
+COPY crates/ crates/
+COPY proto/ proto/
+COPY providers/ providers/
+
+RUN --mount=type=cache,target=/usr/local/cargo/registry \
+    --mount=type=cache,target=/build/target \
+    cargo build --release \
+    --features "openshell-core/dev-settings" \
+    --bin openshell-gateway \
+    --bin openshell-sandbox \
+    && mkdir -p /build/out \
+    && install -m 0755 target/release/openshell-gateway /build/out/openshell-gateway \
+    && install -m 0755 target/release/openshell-sandbox /build/out/openshell-sandbox
+
+# ---------------------------------------------------------------------------
+# Per-arch binary stages
+# ---------------------------------------------------------------------------
+
+# Prebuilt path (release default, BUILD_FROM_SOURCE=0)
+FROM scratch AS gateway-binary-0
+ARG TARGETARCH
+# --chmod=755 preserves the executable bit through actions/upload-artifact +
+# download-artifact, which strip exec perms during the roundtrip.
+COPY --chmod=755 deploy/container/.build/prebuilt-binaries/${TARGETARCH}/openshell-gateway /build/out/openshell-gateway
+
+# Source-built path (local dev, BUILD_FROM_SOURCE=1)
+FROM rust-builder AS gateway-binary-1
+
+FROM gateway-binary-${BUILD_FROM_SOURCE} AS gateway-binary
+
+# Prebuilt path (release default, BUILD_FROM_SOURCE=0)
+FROM scratch AS supervisor-binary-0
+ARG TARGETARCH
+# --chmod=755 preserves the executable bit through actions/upload-artifact +
+# download-artifact, which strip exec perms during the roundtrip.
+COPY --chmod=755 deploy/container/.build/prebuilt-binaries/${TARGETARCH}/openshell-sandbox /build/out/openshell-sandbox
+
+# Source-built path (local dev, BUILD_FROM_SOURCE=1)
+FROM rust-builder AS supervisor-binary-1
+
+FROM supervisor-binary-${BUILD_FROM_SOURCE} AS supervisor-binary
+
+# ---------------------------------------------------------------------------
+# Final gateway image
+# ---------------------------------------------------------------------------
+FROM nvcr.io/nvidia/base/ubuntu:noble-20251013 AS gateway
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    ca-certificates && \
+    apt-get install -y --only-upgrade gpgv && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN useradd --create-home --user-group openshell
+
+WORKDIR /app
+
+COPY --from=gateway-binary /build/out/openshell-gateway /usr/local/bin/
+
+RUN mkdir -p /build/crates/openshell-server
+COPY --chmod=755 crates/openshell-server/migrations /build/crates/openshell-server/migrations
+
+USER openshell
+EXPOSE 8080
+
+ENTRYPOINT ["openshell-gateway"]
+CMD ["--bind-address", "0.0.0.0", "--port", "8080"]
+
+# ---------------------------------------------------------------------------
+# Final supervisor image
+# ---------------------------------------------------------------------------
+# Supervisor image based on the same NVIDIA Ubuntu base used by the gateway.
+#
+# Used by:
+#   - Docker driver: binary is extracted from the image and run inside the
+#     agent container.
+#   - Podman driver: image is mounted as an OCI volume at /opt/openshell/bin.
+#   - Kubernetes driver: image runs as an init container that invokes the
+#     binary's `copy-self` subcommand to seed an emptyDir volume.
+#
+# An Ubuntu base provides glibc and the dynamic loader needed to exec the
+# dynamically linked binary. `FROM scratch` would be smaller but cannot run
+# the binary, breaking the Kubernetes init-container path.
+FROM nvcr.io/nvidia/base/ubuntu:noble-20251013 AS supervisor
+COPY --from=supervisor-binary /build/out/openshell-sandbox /openshell-sandbox
diff --git a/deploy/docker/Dockerfile.python-wheels b/deploy/container/Dockerfile.python-wheels
similarity index 99%
rename from deploy/docker/Dockerfile.python-wheels
rename to deploy/container/Dockerfile.python-wheels
index e93bf8f22..5d361b23b 100644
--- a/deploy/docker/Dockerfile.python-wheels
+++ b/deploy/container/Dockerfile.python-wheels
@@ -24,7 +24,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain 1.95.0
 RUN pip install --no-cache-dir maturin
 
-COPY deploy/docker/cross-build.sh /usr/local/bin/
+COPY deploy/container/cross-build.sh /usr/local/bin/
 
 FROM base AS builder
 
diff --git a/deploy/docker/Dockerfile.python-wheels-macos b/deploy/container/Dockerfile.python-wheels-macos
similarity index 100%
rename from deploy/docker/Dockerfile.python-wheels-macos
rename to deploy/container/Dockerfile.python-wheels-macos
diff --git a/deploy/docker/Dockerfile.supervisor b/deploy/container/Dockerfile.supervisor
similarity index 100%
rename from deploy/docker/Dockerfile.supervisor
rename to deploy/container/Dockerfile.supervisor
diff --git a/deploy/docker/cross-build.sh b/deploy/container/cross-build.sh
similarity index 99%
rename from deploy/docker/cross-build.sh
rename to deploy/container/cross-build.sh
index bb4e4eb14..108faaaf7 100755
--- a/deploy/docker/cross-build.sh
+++ b/deploy/container/cross-build.sh
@@ -6,7 +6,7 @@
 # Shared Rust cross-compilation helpers for multi-arch Docker builds.
 #
 # Source this script in Dockerfile RUN layers:
-#   COPY deploy/docker/cross-build.sh /usr/local/bin/
+#   COPY deploy/container/cross-build.sh /usr/local/bin/
 #   RUN . cross-build.sh && install_cross_toolchain && add_rust_target
 #   RUN . cross-build.sh && cargo_cross_build --release -p my-crate
 #
diff --git a/deploy/helm/openshell/skaffold.yaml b/deploy/helm/openshell/skaffold.yaml
index 0e91db505..750e4c8cb 100644
--- a/deploy/helm/openshell/skaffold.yaml
+++ b/deploy/helm/openshell/skaffold.yaml
@@ -1,7 +1,7 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 
-# Local dev: builds gateway + supervisor images via tasks/scripts/docker-build-image.sh,
+# Local dev: builds gateway + supervisor images via tasks/scripts/container-build-image.sh,
 # which first stages Rust binaries natively on the host (using cargo / cargo-zigbuild
 # when cross-compiling) and then builds the image from the prebuilt binary. This
 # mirrors CI and is faster than compiling inside Docker on every rebuild because
@@ -28,15 +28,16 @@ build:
         buildCommand: |
           IMAGE_NAME="${IMAGE%:*}" \
           IMAGE_TAG="${IMAGE##*:}" \
-          tasks/scripts/docker-build-image.sh gateway
+          tasks/scripts/container-build-image.sh gateway
         dependencies:
           paths:
             - Cargo.toml
             - Cargo.lock
             - crates/**
             - proto/**
-            - deploy/docker/Dockerfile.gateway
-            - tasks/scripts/docker-build-image.sh
+            - deploy/container/Dockerfile.images
+            - crates/openshell-server/migrations/**
+            - tasks/scripts/container-build-image.sh
             - tasks/scripts/stage-prebuilt-binaries.sh
     - image: openshell/supervisor
       context: ../../..
@@ -44,15 +45,15 @@ build:
         buildCommand: |
           IMAGE_NAME="${IMAGE%:*}" \
           IMAGE_TAG="${IMAGE##*:}" \
-          tasks/scripts/docker-build-image.sh supervisor
+          tasks/scripts/container-build-image.sh supervisor
         dependencies:
           paths:
             - Cargo.toml
             - Cargo.lock
             - crates/**
             - proto/**
-            - deploy/docker/Dockerfile.supervisor
-            - tasks/scripts/docker-build-image.sh
+            - deploy/container/Dockerfile.images
+            - tasks/scripts/container-build-image.sh
             - tasks/scripts/stage-prebuilt-binaries.sh
 deploy:
   helm:
diff --git a/e2e/with-docker-gateway.sh b/e2e/with-docker-gateway.sh
index 2ef5495b8..4b92154d7 100755
--- a/e2e/with-docker-gateway.sh
+++ b/e2e/with-docker-gateway.sh
@@ -373,7 +373,7 @@ else
     CONTAINER_ENGINE=docker \
     DOCKER_PLATFORM="linux/${DAEMON_ARCH}" \
     DOCKER_OUTPUT="type=local,dest=${SUPERVISOR_OUT_DIR}" \
-    bash "${ROOT}/tasks/scripts/docker-build-image.sh" supervisor-output
+    bash "${ROOT}/tasks/scripts/container-build-image.sh" supervisor-output
 fi
 
 if [ ! -f "${SUPERVISOR_BIN}" ]; then
diff --git a/e2e/with-podman-gateway.sh b/e2e/with-podman-gateway.sh
index 3fea3e53a..4b3704b8c 100755
--- a/e2e/with-podman-gateway.sh
+++ b/e2e/with-podman-gateway.sh
@@ -254,7 +254,7 @@ ensure_podman_supervisor_image() {
     && [ -z "${CI:-}" ]; then
     echo "Building local Podman supervisor image ${image}..."
     with_podman_config env CONTAINER_ENGINE=podman IMAGE_TAG=dev \
-      bash "${ROOT}/tasks/scripts/docker-build-image.sh" supervisor
+      bash "${ROOT}/tasks/scripts/container-build-image.sh" supervisor
     if podman_cmd image exists "${image}" 2>/dev/null; then
       return 0
     fi
diff --git a/scripts/docker-cleanup.sh b/scripts/container-cleanup.sh
similarity index 98%
rename from scripts/docker-cleanup.sh
rename to scripts/container-cleanup.sh
index d847e9b0d..344996e67 100755
--- a/scripts/docker-cleanup.sh
+++ b/scripts/container-cleanup.sh
@@ -3,7 +3,7 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 
-# Clean up stale Docker images, volumes, and build cache that are not in use
+# Clean up stale container images, volumes, and build cache that are not in use
 # by the currently deployed OpenShell cluster.
 #
 # Preserves:
@@ -13,7 +13,7 @@
 #   - Volumes attached to running containers
 #
 # Usage:
-#   ./scripts/docker-cleanup.sh [options]
+#   ./scripts/container-cleanup.sh [options]
 #
 # Options:
 #   --dry-run   Show what would be removed without deleting anything
diff --git a/tasks/ci.toml b/tasks/ci.toml
index 9696517af..7026da342 100644
--- a/tasks/ci.toml
+++ b/tasks/ci.toml
@@ -5,7 +5,7 @@
 
 [build]
 description = "Build the whole project"
-depends = ["build:rust:workspace", "build:docker", "build:python:wheel"]
+depends = ["build:rust:workspace", "build:container", "build:python:wheel"]
 
 ["build:rust"]
 description = "Alias for build:rust:workspace"
diff --git a/tasks/container.toml b/tasks/container.toml
new file mode 100644
index 000000000..86aaebd3a
--- /dev/null
+++ b/tasks/container.toml
@@ -0,0 +1,54 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+# Container image build tasks (engine-neutral: Docker or Podman)
+# build:docker:* and docker:* names are kept as aliases for backwards compatibility.
+
+["build:container"]
+description = "Build all container images"
+alias = "build:docker"
+depends = [
+    "build:container:gateway",
+    "build:container:supervisor",
+]
+hide = true
+
+["build:container:ci"]
+description = "Build the CI container image"
+alias = "build:docker:ci"
+run = "tasks/scripts/container-build-ci.sh"
+hide = true
+
+["build:container:gateway"]
+description = "Build the gateway container image"
+alias = ["build:docker:gateway", "docker:build:gateway"]
+run = "tasks/scripts/container-build-image.sh gateway"
+hide = true
+
+["build:container:supervisor"]
+description = "Build the standalone supervisor container image (Ubuntu-based, for K8s pods)"
+alias = ["build:docker:supervisor", "docker:build:supervisor"]
+run = "tasks/scripts/container-build-image.sh supervisor"
+hide = true
+
+["build:container:multiarch"]
+description = "Build multi-arch gateway and supervisor images and push to a registry"
+alias = ["build:docker:multiarch", "docker:build:multiarch"]
+run = "tasks/scripts/container-publish-multiarch.sh"
+hide = true
+
+["build:container:prebuilt"]
+description = "Build and stage Rust binaries consumed by container image builds"
+alias = "build:docker:prebuilt"
+run = "tasks/scripts/stage-prebuilt-binaries.sh all"
+hide = true
+
+["container:cleanup"]
+description = "Remove stale images, volumes, and build cache not used by the current deployments"
+alias = "docker:cleanup"
+run = "scripts/container-cleanup.sh --force"
+
+["container:cleanup:dry-run"]
+description = "Preview what container:cleanup would remove without deleting anything"
+alias = "docker:cleanup:dry-run"
+run = "scripts/container-cleanup.sh --dry-run"
diff --git a/tasks/docker.toml b/tasks/docker.toml
deleted file mode 100644
index 502b2363c..000000000
--- a/tasks/docker.toml
+++ /dev/null
@@ -1,60 +0,0 @@
-# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-# SPDX-License-Identifier: Apache-2.0
-
-# Docker image build tasks
-
-["build:docker"]
-description = "Build all Docker images"
-depends = [
-    "build:docker:gateway",
-    "build:docker:supervisor",
-]
-hide = true
-
-["build:docker:ci"]
-description = "Build the CI Docker image"
-run = "tasks/scripts/docker-build-ci.sh"
-hide = true
-
-["build:docker:prebuilt"]
-description = "Build and stage Rust binaries consumed by Docker image builds"
-run = "tasks/scripts/stage-prebuilt-binaries.sh all"
-hide = true
-
-["build:docker:gateway"]
-description = "Build the gateway Docker image"
-run = "tasks/scripts/docker-build-image.sh gateway"
-hide = true
-
-["build:docker:supervisor"]
-description = "Build the supervisor image"
-run = "tasks/scripts/docker-build-image.sh supervisor"
-hide = true
-
-["docker:build:gateway"]
-description = "Alias for build:docker:gateway"
-depends = ["build:docker:gateway"]
-hide = true
-
-["docker:build:supervisor"]
-description = "Alias for build:docker:supervisor"
-depends = ["build:docker:supervisor"]
-hide = true
-
-["build:docker:multiarch"]
-description = "Build multi-arch gateway and supervisor images and push to a registry"
-run = "tasks/scripts/docker-publish-multiarch.sh"
-hide = true
-
-["docker:build:multiarch"]
-description = "Alias for build:docker:multiarch"
-depends = ["build:docker:multiarch"]
-hide = true
-
-["docker:cleanup"]
-description = "Remove stale images, volumes, and build cache not used by current deployments"
-run = "scripts/docker-cleanup.sh --force"
-
-["docker:cleanup:dry-run"]
-description = "Preview what docker:cleanup would remove without deleting anything"
-run = "scripts/docker-cleanup.sh --dry-run"
diff --git a/tasks/python.toml b/tasks/python.toml
index b95d96671..439a2baa3 100644
--- a/tasks/python.toml
+++ b/tasks/python.toml
@@ -174,8 +174,10 @@ CARGO_TARGET_CACHE_SCOPE=$(printf '%s' "$CACHE_SCOPE_INPUT" | sha256_16_stdin)
 
 mkdir -p target/wheels
 
+CONTAINERFILE=$(ce_resolve_containerfile deploy/container python-wheels-macos)
+
 ce build \
-  -f deploy/docker/Dockerfile.python-wheels-macos \
+  -f "${CONTAINERFILE}" \
   --target wheels \
   --build-arg "OSXCROSS_IMAGE=${OSXCROSS_IMAGE_REF}" \
   --build-arg "CARGO_TARGET_CACHE_SCOPE=${CARGO_TARGET_CACHE_SCOPE}" \
diff --git a/tasks/scripts/docker-build-ci.sh b/tasks/scripts/container-build-ci.sh
similarity index 55%
rename from tasks/scripts/docker-build-ci.sh
rename to tasks/scripts/container-build-ci.sh
index 56ff7148f..337a07ceb 100755
--- a/tasks/scripts/docker-build-ci.sh
+++ b/tasks/scripts/container-build-ci.sh
@@ -3,7 +3,7 @@
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 
-# Build the CI Docker image (deploy/docker/Dockerfile.ci).
+# Build the CI container image (deploy/container/Dockerfile.ci or Containerfile.ci).
 # This is a standalone build, separate from the main image build graph.
 
 set -euo pipefail
@@ -11,10 +11,17 @@ set -euo pipefail
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 source "${SCRIPT_DIR}/container-engine.sh"
 
+# Backwards-compatible env var fallbacks: accept CONTAINER_* or DOCKER_*
+CONTAINER_BUILDER="${CONTAINER_BUILDER:-${DOCKER_BUILDER:-}}"
+CONTAINER_PLATFORM="${CONTAINER_PLATFORM:-${DOCKER_PLATFORM:-}}"
+CONTAINER_PUSH="${CONTAINER_PUSH:-${DOCKER_PUSH:-}}"
+
+CONTAINERFILE=$(ce_resolve_containerfile deploy/container ci)
+
 OUTPUT_ARGS=(--load)
-if [[ "${DOCKER_PUSH:-}" == "1" ]]; then
+if [[ "${CONTAINER_PUSH}" == "1" ]]; then
   OUTPUT_ARGS=(--push)
-elif [[ "${DOCKER_PLATFORM:-}" == *","* ]]; then
+elif [[ "${CONTAINER_PLATFORM}" == *","* ]]; then
   OUTPUT_ARGS=(--push)
 fi
 
@@ -25,11 +32,11 @@ elif [[ -n "${GITHUB_TOKEN:-}" ]]; then
   SECRET_ARGS=(--secret id=MISE_GITHUB_TOKEN,env=GITHUB_TOKEN)
 fi
 
-exec ce_build \
-  ${DOCKER_BUILDER:+--builder ${DOCKER_BUILDER}} \
-  ${DOCKER_PLATFORM:+--platform ${DOCKER_PLATFORM}} \
+ce_build \
+  ${CONTAINER_BUILDER:+--builder ${CONTAINER_BUILDER}} \
+  ${CONTAINER_PLATFORM:+--platform ${CONTAINER_PLATFORM}} \
   ${SECRET_ARGS[@]+"${SECRET_ARGS[@]}"} \
-  -f deploy/docker/Dockerfile.ci \
+  -f "${CONTAINERFILE}" \
   -t "openshell/ci:${IMAGE_TAG:-dev}" \
   --provenance=false \
   "$@" \
diff --git a/tasks/scripts/container-build-image.sh b/tasks/scripts/container-build-image.sh
new file mode 100755
index 000000000..a3360bdd0
--- /dev/null
+++ b/tasks/scripts/container-build-image.sh
@@ -0,0 +1,207 @@
+#!/usr/bin/env bash
+
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+source "${SCRIPT_DIR}/container-engine.sh"
+
+# Backwards-compatible env var fallbacks: accept CONTAINER_* or DOCKER_*
+CONTAINER_BUILD_CACHE_DIR="${CONTAINER_BUILD_CACHE_DIR:-${DOCKER_BUILD_CACHE_DIR:-.cache/buildkit}}"
+CONTAINER_BUILDER="${CONTAINER_BUILDER:-${DOCKER_BUILDER:-}}"
+CONTAINER_PLATFORM="${CONTAINER_PLATFORM:-${DOCKER_PLATFORM:-}}"
+CONTAINER_OUTPUT="${CONTAINER_OUTPUT:-${DOCKER_OUTPUT:-}}"
+CONTAINER_PUSH="${CONTAINER_PUSH:-${DOCKER_PUSH:-}}"
+
+sha256_16() {
+  if command -v sha256sum >/dev/null 2>&1; then
+    sha256sum "$1" | awk '{print substr($1, 1, 16)}'
+  else
+    shasum -a 256 "$1" | awk '{print substr($1, 1, 16)}'
+  fi
+}
+
+sha256_16_stdin() {
+  if command -v sha256sum >/dev/null 2>&1; then
+    sha256sum | awk '{print substr($1, 1, 16)}'
+  else
+    shasum -a 256 | awk '{print substr($1, 1, 16)}'
+  fi
+}
+
+detect_rust_scope() {
+  local dockerfile="$1"
+  local rust_from
+  rust_from=$(grep -E '^FROM --platform=\$BUILDPLATFORM rust:[^ ]+' "$dockerfile" | head -n1 | sed -E 's/^FROM --platform=\$BUILDPLATFORM rust:([^ ]+).*/\1/' || true)
+  if [[ -n "${rust_from}" ]]; then
+    echo "rust-${rust_from}"
+    return
+  fi
+
+  if grep -q "rustup.rs" "$dockerfile"; then
+    echo "rustup-stable"
+    return
+  fi
+
+  echo "no-rust"
+}
+
+TARGET=${1:?"Usage: container-build-image.sh [extra-args...]"}
+shift
+
+CONTAINERFILE=$(ce_resolve_containerfile deploy/container images)
+
+IS_FINAL_IMAGE=0
+IMAGE_NAME=""
+BUILD_TARGET=""
+case "${TARGET}" in
+  gateway)
+    IS_FINAL_IMAGE=1
+    IMAGE_NAME="openshell/gateway"
+    BUILD_TARGET="gateway"
+    ;;
+  supervisor)
+    IS_FINAL_IMAGE=1
+    IMAGE_NAME="openshell/supervisor"
+    BUILD_TARGET="supervisor"
+    ;;
+  supervisor-builder)
+    BUILD_TARGET="supervisor-builder"
+    ;;
+  supervisor-output)
+    # Backward-compat alias: same as "supervisor".
+    IS_FINAL_IMAGE=1
+    IMAGE_NAME="openshell/supervisor"
+    BUILD_TARGET="supervisor"
+    ;;
+  *)
+    echo "Error: unsupported target '${TARGET}'" >&2
+    exit 1
+    ;;
+esac
+
+if [[ -n "${IMAGE_REGISTRY:-}" && "${IS_FINAL_IMAGE}" == "1" ]]; then
+  IMAGE_NAME="${IMAGE_REGISTRY}/${IMAGE_NAME#openshell/}"
+fi
+
+IMAGE_TAG=${IMAGE_TAG:-dev}
+CACHE_PATH="${CONTAINER_BUILD_CACHE_DIR}/images"
+mkdir -p "${CACHE_PATH}"
+
+BUILDER_ARGS=()
+if ce_is_docker; then
+  if [[ -n "${CONTAINER_BUILDER}" ]]; then
+    BUILDER_ARGS=(--builder "${CONTAINER_BUILDER}")
+  elif [[ -z "${CONTAINER_PLATFORM}" && -z "${CI:-}" ]]; then
+    _ctx=$(ce_context_name)
+    BUILDER_ARGS=(--builder "${_ctx}")
+  fi
+fi
+
+CACHE_ARGS=()
+if [[ -z "${CI:-}" ]]; then
+  if ce_is_docker; then
+    if ce_buildx_inspect ${BUILDER_ARGS[@]+"${BUILDER_ARGS[@]}"} 2>/dev/null | grep -q "Driver: docker-container"; then
+      CACHE_ARGS=(
+        --cache-from "type=local,src=${CACHE_PATH}"
+        --cache-to "type=local,dest=${CACHE_PATH},mode=max"
+      )
+    fi
+  fi
+fi
+
+SCCACHE_ARGS=()
+if [[ -n "${SCCACHE_MEMCACHED_ENDPOINT:-}" ]]; then
+  SCCACHE_ARGS=(--build-arg "SCCACHE_MEMCACHED_ENDPOINT=${SCCACHE_MEMCACHED_ENDPOINT}")
+fi
+
+VERSION_ARGS=()
+if [[ -n "${OPENSHELL_CARGO_VERSION:-}" ]]; then
+  VERSION_ARGS=(--build-arg "OPENSHELL_CARGO_VERSION=${OPENSHELL_CARGO_VERSION}")
+elif [[ -n "${CI:-}" ]]; then
+  CARGO_VERSION=$(uv run python tasks/scripts/release.py get-version --cargo 2>/dev/null || true)
+  if [[ -n "${CARGO_VERSION}" ]]; then
+ VERSION_ARGS=(--build-arg "OPENSHELL_CARGO_VERSION=${CARGO_VERSION}") + fi +fi + +LOCK_HASH=$(sha256_16 Cargo.lock) +RUST_SCOPE=${RUST_TOOLCHAIN_SCOPE:-$(detect_rust_scope "${CONTAINERFILE}")} +CACHE_SCOPE_INPUT="v2|shared|release|${LOCK_HASH}|${RUST_SCOPE}" +CARGO_TARGET_CACHE_SCOPE=$(printf '%s' "${CACHE_SCOPE_INPUT}" | sha256_16_stdin) + +# CI builds use codegen-units=1 for maximum optimization; local builds omit +# the arg so cargo uses the Cargo.toml default (parallel codegen, fast links). +CODEGEN_ARGS=() +if [[ -n "${CI:-}" ]]; then + CODEGEN_ARGS=(--build-arg "CARGO_CODEGEN_UNITS=1") +fi + +# OS-128 Phase 4: opt in to consuming pre-built Rust binaries instead of +# compiling inside Docker. Default path (`build`) is unchanged. When +# USE_PREBUILT_BINARIES=true, the Dockerfile's BINARY_SOURCE=prebuilt stages +# are selected, which COPY from deploy/container/.build/prebuilt-binaries// +# in the build context. Callers must stage the binaries before invoking. +BINARY_SOURCE_ARGS=() +if [[ "${USE_PREBUILT_BINARIES:-}" == "true" ]]; then + case "${TARGET}" in + gateway|supervisor|supervisor-output) + if [[ ! 
-d deploy/container/.build/prebuilt-binaries ]]; then + echo "Error: USE_PREBUILT_BINARIES=true but deploy/container/.build/prebuilt-binaries/ does not exist" >&2 + echo " Stage binaries at deploy/container/.build/prebuilt-binaries//openshell-{gateway,sandbox}" >&2 + exit 1 + fi + BINARY_SOURCE_ARGS=(--build-arg "BINARY_SOURCE=prebuilt") + ;; + esac +fi + +TAG_ARGS=() +if [[ "${IS_FINAL_IMAGE}" == "1" ]]; then + TAG_ARGS=(-t "${IMAGE_NAME}:${IMAGE_TAG}") +fi + +OUTPUT_ARGS=() +if [[ -n "${CONTAINER_OUTPUT}" ]]; then + OUTPUT_ARGS=(--output "${CONTAINER_OUTPUT}") +elif [[ "${IS_FINAL_IMAGE}" == "1" ]]; then + if [[ "${CONTAINER_PUSH}" == "1" ]]; then + OUTPUT_ARGS=(--push) + elif [[ "${CONTAINER_PLATFORM}" == *","* ]]; then + OUTPUT_ARGS=(--push) + else + OUTPUT_ARGS=(--load) + fi +else + echo "Error: CONTAINER_OUTPUT must be set when building target '${TARGET}'" >&2 + exit 1 +fi + +# Default to dev-settings so local builds include test-only settings +# (dummy_bool, dummy_int) that e2e tests depend on, matching CI behaviour. +EXTRA_CARGO_FEATURES="${EXTRA_CARGO_FEATURES:-openshell-core/dev-settings}" + +FEATURE_ARGS=() +if [[ -n "${EXTRA_CARGO_FEATURES}" ]]; then + FEATURE_ARGS=(--build-arg "EXTRA_CARGO_FEATURES=${EXTRA_CARGO_FEATURES}") +fi + +ce_build \ + ${BUILDER_ARGS[@]+"${BUILDER_ARGS[@]}"} \ + ${CONTAINER_PLATFORM:+--platform ${CONTAINER_PLATFORM}} \ + ${CACHE_ARGS[@]+"${CACHE_ARGS[@]}"} \ + ${SCCACHE_ARGS[@]+"${SCCACHE_ARGS[@]}"} \ + ${VERSION_ARGS[@]+"${VERSION_ARGS[@]}"} \ + ${CODEGEN_ARGS[@]+"${CODEGEN_ARGS[@]}"} \ + ${BINARY_SOURCE_ARGS[@]+"${BINARY_SOURCE_ARGS[@]}"} \ + ${FEATURE_ARGS[@]+"${FEATURE_ARGS[@]}"} \ + --build-arg "CARGO_TARGET_CACHE_SCOPE=${CARGO_TARGET_CACHE_SCOPE}" \ + -f "${CONTAINERFILE}" \ + --target "${BUILD_TARGET}" \ + ${TAG_ARGS[@]+"${TAG_ARGS[@]}"} \ + --provenance=false \ + "$@" \ + ${OUTPUT_ARGS[@]+"${OUTPUT_ARGS[@]}"} \ + . 
diff --git a/tasks/scripts/container-engine.sh b/tasks/scripts/container-engine.sh index 64c644d0b..c9ff0417c 100755 --- a/tasks/scripts/container-engine.sh +++ b/tasks/scripts/container-engine.sh @@ -300,7 +300,7 @@ ce_imagetools_create() { # Podman fallback: parse -t and the trailing source image, then # use skopeo or podman tag. This is a best-effort shim for simple # re-tagging; full multi-arch manifest manipulation should use the - # podman-native code path in docker-publish-multiarch.sh. + # podman-native code path in container-publish-multiarch.sh. # # Argument parsing uses a sentinel ("__next__") to capture the value # that follows a two-token -t / --tag flag. --prefer-index is accepted @@ -335,6 +335,32 @@ ce_imagetools_create() { fi } +# --------------------------------------------------------------------------- +# ce_resolve_containerfile — find the container build file for a given target. +# +# Probes for Containerfile.{suffix} first (Podman/OCI convention), then +# Dockerfile.{suffix} (Docker convention). Returns the first match. +# +# Usage: ce_resolve_containerfile <dir> <suffix> +# dir — directory containing the build file (e.g. deploy/container) +# suffix — file suffix (e.g. images, ci, python-wheels-macos) +# +# Prints the resolved path on stdout. Returns 1 if neither file exists. +# --------------------------------------------------------------------------- +ce_resolve_containerfile() { + local dir="${1:?Usage: ce_resolve_containerfile <dir> <suffix>}" + local suffix="${2:?Usage: ce_resolve_containerfile <dir> <suffix>}" + + if [[ -f "${dir}/Containerfile.${suffix}" ]]; then + echo "${dir}/Containerfile.${suffix}" + elif [[ -f "${dir}/Dockerfile.${suffix}" ]]; then + echo "${dir}/Dockerfile.${suffix}" + else + echo "Error: no Containerfile.${suffix} or Dockerfile.${suffix} found in ${dir}" >&2 + return 1 + fi +} + # --------------------------------------------------------------------------- # Log the detected engine so developers always know which tool is active.
# Emitted once per script invocation (the double-source guard at the top diff --git a/tasks/scripts/docker-publish-multiarch.sh b/tasks/scripts/container-publish-multiarch.sh similarity index 79% rename from tasks/scripts/docker-publish-multiarch.sh rename to tasks/scripts/container-publish-multiarch.sh index 2f23c5106..4ec2dc400 100755 --- a/tasks/scripts/docker-publish-multiarch.sh +++ b/tasks/scripts/container-publish-multiarch.sh @@ -4,23 +4,29 @@ # SPDX-License-Identifier: Apache-2.0 # Build multi-arch gateway + supervisor images and push to a container registry. -# Requires DOCKER_REGISTRY to be set (e.g. ghcr.io/myorg). +# Requires CONTAINER_REGISTRY (or DOCKER_REGISTRY) to be set (e.g. ghcr.io/myorg). set -euo pipefail SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" source "${SCRIPT_DIR}/container-engine.sh" -REGISTRY=${DOCKER_REGISTRY:?Set DOCKER_REGISTRY to push multi-arch images (e.g. ghcr.io/myorg)} +# Backwards-compatible env var fallbacks: accept CONTAINER_* or DOCKER_* +CONTAINER_REGISTRY="${CONTAINER_REGISTRY:-${DOCKER_REGISTRY:-}}" +CONTAINER_PLATFORMS="${CONTAINER_PLATFORMS:-${DOCKER_PLATFORMS:-linux/amd64,linux/arm64}}" +CONTAINER_BUILDER="${CONTAINER_BUILDER:-${DOCKER_BUILDER:-}}" +CONTAINER_PUSH="${CONTAINER_PUSH:-${DOCKER_PUSH:-}}" +EXTRA_CONTAINER_TAGS="${EXTRA_CONTAINER_TAGS:-${EXTRA_DOCKER_TAGS:-}}" + +REGISTRY=${CONTAINER_REGISTRY:?Set CONTAINER_REGISTRY (or DOCKER_REGISTRY) to push multi-arch images (e.g. 
ghcr.io/myorg)} IMAGE_TAG=${IMAGE_TAG:-dev} -PLATFORMS=${DOCKER_PLATFORMS:-linux/amd64,linux/arm64} +PLATFORMS=${CONTAINER_PLATFORMS} TAG_LATEST=${TAG_LATEST:-false} -EXTRA_DOCKER_TAGS_RAW=${EXTRA_DOCKER_TAGS:-} EXTRA_TAGS=() -if [[ -n "${EXTRA_DOCKER_TAGS_RAW}" ]]; then - EXTRA_DOCKER_TAGS_RAW=${EXTRA_DOCKER_TAGS_RAW//,/ } - for tag in ${EXTRA_DOCKER_TAGS_RAW}; do +if [[ -n "${EXTRA_CONTAINER_TAGS}" ]]; then + EXTRA_TAGS_RAW=${EXTRA_CONTAINER_TAGS//,/ } + for tag in ${EXTRA_TAGS_RAW}; do [[ -n "${tag}" ]] && EXTRA_TAGS+=("${tag}") done fi @@ -29,7 +35,7 @@ fi # Docker path: use buildx builders + imagetools for multi-arch # --------------------------------------------------------------------------- _publish_multiarch_docker() { - BUILDER_NAME=${DOCKER_BUILDER:-multiarch} + BUILDER_NAME=${CONTAINER_BUILDER:-multiarch} if ce buildx inspect "${BUILDER_NAME}" >/dev/null 2>&1; then echo "Using existing buildx builder: ${BUILDER_NAME}" ce buildx use "${BUILDER_NAME}" @@ -38,17 +44,17 @@ _publish_multiarch_docker() { ce buildx create --name "${BUILDER_NAME}" --use --bootstrap fi - export DOCKER_BUILDER="${BUILDER_NAME}" - export DOCKER_PLATFORM="${PLATFORMS}" - export DOCKER_PUSH=1 + export CONTAINER_BUILDER="${BUILDER_NAME}" + export CONTAINER_PLATFORM="${PLATFORMS}" + export CONTAINER_PUSH=1 export IMAGE_REGISTRY="${REGISTRY}" echo "Building multi-arch gateway image..." - tasks/scripts/docker-build-image.sh gateway + tasks/scripts/container-build-image.sh gateway echo echo "Building multi-arch supervisor image..." - tasks/scripts/docker-build-image.sh supervisor + tasks/scripts/container-build-image.sh supervisor TAGS_TO_APPLY=("${EXTRA_TAGS[@]}") if [[ "${TAG_LATEST}" == "true" ]]; then @@ -91,12 +97,12 @@ _publish_multiarch_podman() { for platform in "${PLATFORM_LIST[@]}"; do echo " Building ${component} for ${platform}..." # Build for each platform and add to the manifest list. 
- # docker-build-image.sh sources container-engine.sh itself, + # container-build-image.sh sources container-engine.sh itself, # so ce_build is used internally. - DOCKER_PLATFORM="${platform}" \ - DOCKER_PUSH="" \ + CONTAINER_PLATFORM="${platform}" \ + CONTAINER_PUSH="" \ IMAGE_TAG="${IMAGE_TAG}" \ - tasks/scripts/docker-build-image.sh "${component}" + tasks/scripts/container-build-image.sh "${component}" # Tag with a platform-specific suffix for manifest assembly. local platform_tag="${IMAGE_TAG}-${platform//\//-}" diff --git a/tasks/scripts/docker-build-image.sh b/tasks/scripts/docker-build-image.sh deleted file mode 100755 index 8570180f1..000000000 --- a/tasks/scripts/docker-build-image.sh +++ /dev/null @@ -1,197 +0,0 @@ -#!/usr/bin/env bash - -# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -# SPDX-License-Identifier: Apache-2.0 - -set -euo pipefail - -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -source "${SCRIPT_DIR}/container-engine.sh" - -normalize_arch() { - case "$1" in - x86_64|amd64) echo "amd64" ;; - aarch64|arm64) echo "arm64" ;; - *) echo "$1" ;; - esac -} - -prebuilt_arches() { - if [[ -n "${DOCKER_PLATFORM:-}" ]]; then - local raw_platforms=${DOCKER_PLATFORM//[[:space:]]/} - local platform - IFS=',' read -r -a platforms <<< "${raw_platforms}" - for platform in "${platforms[@]}"; do - case "${platform}" in - linux/amd64) echo "amd64" ;; - linux/arm64) echo "arm64" ;; - *) - echo "Error: unsupported DOCKER_PLATFORM '${platform}'" >&2 - echo "Supported platforms: linux/amd64, linux/arm64" >&2 - exit 1 - ;; - esac - done - return - fi - - normalize_arch "$(ce_info_arch)" -} - -required_prebuilt_binaries() { - case "$1" in - gateway) - echo "openshell-gateway" - ;; - supervisor|supervisor-sideload|supervisor-output) - echo "openshell-sandbox" - ;; - esac -} - -missing_prebuilt_paths() { - local target=$1 - local arch - local binary - local path - - local arches=() - while IFS= read 
-r _a; do arches+=("$_a"); done < <(prebuilt_arches) - read -r -a binaries <<< "$(required_prebuilt_binaries "${target}")" - - for arch in "${arches[@]}"; do - for binary in "${binaries[@]}"; do - path="deploy/docker/.build/prebuilt-binaries/${arch}/${binary}" - if [[ ! -f "${path}" ]]; then - echo "${path}" - fi - done - done -} - -ensure_prebuilt_binaries() { - local target=$1 - local missing - local arch - - if [[ -z "${CI:-}" && "${PREBUILT_AUTO_STAGE:-1}" != "0" ]]; then - echo "Staging prebuilt Rust binaries for Docker target '${target}'..." - local arches=() - while IFS= read -r _a; do arches+=("$_a"); done < <(prebuilt_arches) - for arch in "${arches[@]}"; do - PREBUILT_ARCH="${arch}" "${SCRIPT_DIR}/stage-prebuilt-binaries.sh" "${target}" - done - fi - - missing="$(missing_prebuilt_paths "${target}")" - if [[ -n "${missing}" ]]; then - echo "Error: missing prebuilt Rust binaries required by Docker target '${target}':" >&2 - printf ' %s\n' ${missing} >&2 - echo "Stage binaries at deploy/docker/.build/prebuilt-binaries// before building." >&2 - exit 1 - fi -} - -TARGET=${1:?"Usage: docker-build-image.sh [extra-args...]"} -shift - -IS_FINAL_IMAGE=0 -IMAGE_NAME="" -DOCKER_TARGET="" -DOCKERFILE="" -case "${TARGET}" in - gateway) - IS_FINAL_IMAGE=1 - IMAGE_NAME="openshell/gateway" - DOCKER_TARGET="gateway" - DOCKERFILE="deploy/docker/Dockerfile.gateway" - ;; - supervisor) - IS_FINAL_IMAGE=1 - IMAGE_NAME="openshell/supervisor" - DOCKER_TARGET="supervisor" - DOCKERFILE="deploy/docker/Dockerfile.supervisor" - ;; - supervisor-output) - # Backward-compat alias: same as "supervisor". - IS_FINAL_IMAGE=1 - IMAGE_NAME="openshell/supervisor" - DOCKER_TARGET="supervisor" - DOCKERFILE="deploy/docker/Dockerfile.supervisor" - ;; - *) - echo "Error: unsupported target '${TARGET}'" >&2 - exit 1 - ;; -esac - -if [[ ! 
-f "${DOCKERFILE}" ]]; then - echo "Error: Dockerfile not found: ${DOCKERFILE}" >&2 - exit 1 -fi - -if [[ -n "${IMAGE_REGISTRY:-}" && "${IS_FINAL_IMAGE}" == "1" ]]; then - IMAGE_NAME="${IMAGE_REGISTRY}/${IMAGE_NAME#openshell/}" -fi - -IMAGE_TAG=${IMAGE_TAG:-dev} -DOCKER_BUILD_CACHE_DIR=${DOCKER_BUILD_CACHE_DIR:-.cache/buildkit} -CACHE_PATH="${DOCKER_BUILD_CACHE_DIR}/images" -mkdir -p "${CACHE_PATH}" - -BUILDER_ARGS=() -if ce_is_docker; then - if [[ -n "${DOCKER_BUILDER:-}" ]]; then - BUILDER_ARGS=(--builder "${DOCKER_BUILDER}") - elif [[ -z "${DOCKER_PLATFORM:-}" && -z "${CI:-}" ]]; then - _ctx=$(ce_context_name) - BUILDER_ARGS=(--builder "${_ctx}") - fi -fi - -CACHE_ARGS=() -if [[ -z "${CI:-}" ]]; then - if ce_is_docker; then - if ce_buildx_inspect ${BUILDER_ARGS[@]+"${BUILDER_ARGS[@]}"} 2>/dev/null | grep -q "Driver: docker-container"; then - CACHE_ARGS=( - --cache-from "type=local,src=${CACHE_PATH}" - --cache-to "type=local,dest=${CACHE_PATH},mode=max" - ) - fi - fi -fi - -ensure_prebuilt_binaries "${TARGET}" - -TAG_ARGS=() -if [[ "${IS_FINAL_IMAGE}" == "1" ]]; then - TAG_ARGS=(-t "${IMAGE_NAME}:${IMAGE_TAG}") -fi - -OUTPUT_ARGS=() -if [[ -n "${DOCKER_OUTPUT:-}" ]]; then - OUTPUT_ARGS=(--output "${DOCKER_OUTPUT}") -elif [[ "${IS_FINAL_IMAGE}" == "1" ]]; then - if [[ "${DOCKER_PUSH:-}" == "1" ]]; then - OUTPUT_ARGS=(--push) - elif [[ "${DOCKER_PLATFORM:-}" == *","* ]]; then - OUTPUT_ARGS=(--push) - else - OUTPUT_ARGS=(--load) - fi -else - echo "Error: DOCKER_OUTPUT must be set when building target '${TARGET}'" >&2 - exit 1 -fi - -ce_build \ - ${BUILDER_ARGS[@]+"${BUILDER_ARGS[@]}"} \ - ${DOCKER_PLATFORM:+--platform ${DOCKER_PLATFORM}} \ - ${CACHE_ARGS[@]+"${CACHE_ARGS[@]}"} \ - -f "${DOCKERFILE}" \ - --target "${DOCKER_TARGET}" \ - ${TAG_ARGS[@]+"${TAG_ARGS[@]}"} \ - --provenance=false \ - "$@" \ - ${OUTPUT_ARGS[@]+"${OUTPUT_ARGS[@]}"} \ - . 
diff --git a/tasks/scripts/gateway-docker.sh b/tasks/scripts/gateway-docker.sh index 0c7148969..4f93393f0 100644 --- a/tasks/scripts/gateway-docker.sh +++ b/tasks/scripts/gateway-docker.sh @@ -154,7 +154,7 @@ else CONTAINER_ENGINE=docker \ DOCKER_PLATFORM="linux/${DAEMON_ARCH}" \ DOCKER_OUTPUT="type=local,dest=${SUPERVISOR_OUT_DIR}" \ - bash "${ROOT}/tasks/scripts/docker-build-image.sh" supervisor-output + bash "${ROOT}/tasks/scripts/container-build-image.sh" supervisor-output fi if [[ ! -f "${SUPERVISOR_BIN}" ]]; then diff --git a/tasks/scripts/gateway.sh b/tasks/scripts/gateway.sh index 9c31295d6..f3072c639 100644 --- a/tasks/scripts/gateway.sh +++ b/tasks/scripts/gateway.sh @@ -38,7 +38,7 @@ Environment: OPENSHELL_SERVER_PORT Gateway port. Defaults to 8080 for Kubernetes, 18080 for Podman/Docker, and 18081 for VM. OPENSHELL_SUPERVISOR_IMAGE - Podman supervisor sideload image. Defaults to + Podman supervisor image. Defaults to openshell/supervisor:dev and is built on demand. Docker and VM runs delegate to gateway:docker and gateway:vm setup scripts. @@ -175,9 +175,9 @@ ensure_podman_supervisor_image() { exit 1 fi - echo "Building Podman supervisor sideload image (${supervisor_image})..." + echo "Building Podman supervisor image (${supervisor_image})..." require_mise - CONTAINER_ENGINE=podman IMAGE_TAG=dev mise run build:docker:supervisor + CONTAINER_ENGINE=podman IMAGE_TAG=dev mise run build:container:supervisor if ! 
podman image exists "${supervisor_image}" >/dev/null 2>&1; then echo "ERROR: expected supervisor image '${supervisor_image}' after build" >&2 diff --git a/tasks/scripts/stage-prebuilt-binaries.sh b/tasks/scripts/stage-prebuilt-binaries.sh index 130c47d27..bf5bacd7d 100755 --- a/tasks/scripts/stage-prebuilt-binaries.sh +++ b/tasks/scripts/stage-prebuilt-binaries.sh @@ -147,7 +147,7 @@ build_component_for_arch() { resolve_component "$component" target="$(target_triple "$arch" "$target_libc")" - stage="${ROOT}/deploy/docker/.build/prebuilt-binaries/${arch}" + stage="${ROOT}/deploy/container/.build/prebuilt-binaries/${arch}" features="${EXTRA_CARGO_FEATURES:-openshell-core/dev-settings}" if [[ "$component" == "gateway" && " ${features} " != *" bundled-z3 "* ]]; then features="${features} bundled-z3" @@ -172,7 +172,7 @@ build_component_for_arch() { else echo "Error: cannot build ${binary} for linux/${arch} on ${current_host_os}/${current_host_arch}." >&2 echo "Install cargo-zigbuild + zig, build on a matching Linux host, or provide prebuilt binaries in:" >&2 - echo " deploy/docker/.build/prebuilt-binaries/${arch}/" >&2 + echo " deploy/container/.build/prebuilt-binaries/${arch}/" >&2 exit 1 fi fi