feat: enable mTLS for Podman compute driver in e2e test harness #1428

@russellb

Problem Statement

Podman deployments have no end-to-end-tested path for encrypted gateway-to-sandbox communication (mTLS), which is the single biggest functional gap between the Podman and Docker compute drivers and limits secure deployment on Podman. The Docker driver's mTLS support is exercised end-to-end; the Podman e2e harness explicitly runs plaintext-only gateways, even though the driver code already implements the mTLS capability.

Technical Context

The Podman compute driver (crates/openshell-driver-podman/) already has complete code-level support for mTLS certificate injection into sandbox containers. The container spec construction, config validation, driver initialization, environment variable injection, and TOML config file inheritance are all implemented and unit-tested. The gap is not in the driver code — it is in the e2e test harness (e2e/with-podman-gateway.sh) which explicitly rejects https:// endpoints and forces plaintext operation.

Affected Components

| Component | Key Files | Role |
| --- | --- | --- |
| Podman e2e harness | e2e/with-podman-gateway.sh | Test harness that configures and launches the Podman gateway; currently forces plaintext |
| Podman container spec | crates/openshell-driver-podman/src/container.rs | Builds the container create payload; already handles TLS mounts and env vars |
| Podman config | crates/openshell-driver-podman/src/config.rs | TLS config fields and validation; already implemented |
| Podman driver | crates/openshell-driver-podman/src/driver.rs | Auto-detects the https:// scheme and validates TLS; already implemented |
| Gateway config | crates/openshell-server/src/config_file.rs | Inherits guest_tls_{ca,cert,key} into the Podman driver table; already implemented |
| Sandbox supervisor | crates/openshell-sandbox/src/grpc_client.rs | Reads TLS env vars and configures the mTLS client; driver-agnostic, already works |

Technical Investigation

Architecture Overview

The mTLS flow works identically for all compute drivers:

  1. Gateway side: The gateway generates or is provided a PKI (CA, server cert, client cert). It starts with --tls-cert, --tls-key, --tls-client-ca for its gRPC server.
  2. Config inheritance: guest_tls_{ca,cert,key} paths are set at the [openshell.gateway] level in TOML and inherited into driver-specific tables via config_file.rs:260-266.
  3. Driver injection: The driver bind-mounts the host-side cert files into the container at /etc/openshell/tls/client/{ca.crt,tls.crt,tls.key} and injects OPENSHELL_TLS_CA, OPENSHELL_TLS_CERT, OPENSHELL_TLS_KEY env vars.
  4. Supervisor connection: grpc_client.rs:30-80 reads the TLS env vars, builds a ClientTlsConfig with CA cert and client identity, and connects via tonic over https://.
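
Step 3 can be illustrated with the equivalent podman CLI arguments. This is only a sketch: the driver builds a container create payload in Rust rather than shelling out, the helper name tls_container_args is hypothetical, and it is an assumption that the env var values point at the in-container paths. The mount targets and variable names are the ones listed above.

```shell
#!/bin/sh
# Illustrative only: prints podman-run style arguments corresponding to the
# TLS bind mounts and env vars from step 3.
# tls_container_args is a hypothetical helper, not part of the repository.
tls_container_args() {
    ca=$1 cert=$2 key=$3
    dir=/etc/openshell/tls/client

    # Read-only bind mounts of the host-side client material.
    printf '%s\n' "--volume=$ca:$dir/ca.crt:ro"
    printf '%s\n' "--volume=$cert:$dir/tls.crt:ro"
    printf '%s\n' "--volume=$key:$dir/tls.key:ro"

    # Env vars the sandbox supervisor reads to configure its mTLS client
    # (assumed here to carry the in-container paths).
    printf '%s\n' "--env=OPENSHELL_TLS_CA=$dir/ca.crt"
    printf '%s\n' "--env=OPENSHELL_TLS_CERT=$dir/tls.crt"
    printf '%s\n' "--env=OPENSHELL_TLS_KEY=$dir/tls.key"
}
```

Calling tls_container_args with three host paths prints six arguments, one per line: three mounts followed by three env vars.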

Code References

| Location | Description |
| --- | --- |
| e2e/with-podman-gateway.sh:14 | Comment stating "Podman e2e currently uses plaintext gateway traffic" |
| e2e/with-podman-gateway.sh:277-288 | Guard that rejects https:// endpoints with an error message |
| e2e/with-podman-gateway.sh:385 | Gateway started with the --disable-tls flag |
| e2e/with-podman-gateway.sh:404-405 | Endpoint registered as http://, uses e2e_register_plaintext_gateway |
| crates/openshell-driver-podman/src/container.rs:287-300 | TLS env var injection when config.tls_enabled() (already implemented) |
| crates/openshell-driver-podman/src/container.rs:533-565 | TLS bind mount generation with Mount structs (already implemented) |
| crates/openshell-driver-podman/src/container.rs:22-30 | SELinux detection for the :z relabel option on mounts (already implemented) |
| crates/openshell-driver-podman/src/container.rs:1055-1139 | Unit tests validating TLS mount and env var injection (already passing) |
| crates/openshell-driver-podman/src/config.rs:98-108 | guest_tls_{ca,cert,key} config fields (already defined) |
| crates/openshell-driver-podman/src/config.rs:124-150 | validate_tls_config() rejecting partial TLS configs (already implemented) |
| crates/openshell-driver-podman/src/driver.rs:137-152 | Auto-detection of the https:// scheme based on tls_enabled() (already implemented) |
| crates/openshell-driver-docker/src/lib.rs:917-933 | Docker's build_binds() TLS mount implementation (reference pattern) |
| crates/openshell-driver-docker/src/lib.rs:976-989 | Docker's TLS env var injection (reference pattern) |
| e2e/with-docker-gateway.sh:392-425 | Docker's PKI generation with openssl (reusable pattern) |
| crates/openshell-server/src/config_file.rs:260-266 | TOML config inheritance of TLS keys to the Podman driver table |
| crates/openshell-sandbox/src/grpc_client.rs:30-80 | Supervisor's driver-agnostic TLS client setup |
| architecture/compute-runtimes.md:14 | States driver responsibilities include "Supplying TLS or secret material" |
| architecture/compute-runtimes.md:64-65 | States driver-controlled env vars must override for TLS paths |

Current Behavior

The Podman e2e harness:

  • Explicitly checks for https:// and exits with an error message
  • Starts the gateway with --disable-tls
  • Registers the CLI with a plaintext http:// endpoint
  • Uses e2e_register_plaintext_gateway instead of e2e_register_mtls_gateway

Meanwhile, the driver code already supports mTLS. It is covered by unit tests but is never exercised end-to-end.

What Would Need to Change

Only e2e/with-podman-gateway.sh needs modification:

  1. Remove the https:// rejection guard (lines 277-288)
  2. Add openssl PKI generation block (reuse pattern from Docker e2e with-docker-gateway.sh:392-425, adjusting SANs to include host.containers.internal)
  3. Add guest_tls_ca, guest_tls_cert, guest_tls_key to the TOML config block
  4. Replace --disable-tls with --tls-cert, --tls-key, --tls-client-ca
  5. Change endpoint from http:// to https://
  6. Replace e2e_register_plaintext_gateway with e2e_register_mtls_gateway
  7. Update the health check curl to handle HTTPS (use --cacert or adjust health port)
  8. Remove the plaintext comment at line 14

No Rust code changes required. The Podman driver, config, and container spec already handle TLS correctly.
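
Steps 3 and 4 might look roughly like the following. This is a sketch under assumptions: gateway_tls_toml and gateway_tls_flags are hypothetical helpers, and the file names (ca.crt, server.crt, client.crt, and their keys) are placeholders rather than the names the Docker harness uses. Only the TOML keys, the [openshell.gateway] table, and the three flag names come from this issue.

```shell
#!/bin/sh
# Hypothetical helpers; file names under $pki_dir are placeholders.

# Print the TOML fragment that points the gateway at the client material
# the driver will inject into sandbox containers. Merge the three keys
# into the existing [openshell.gateway] table rather than repeating it.
gateway_tls_toml() {
    pki_dir=$1
    cat <<EOF
[openshell.gateway]
guest_tls_ca = "$pki_dir/ca.crt"
guest_tls_cert = "$pki_dir/client.crt"
guest_tls_key = "$pki_dir/client.key"
EOF
}

# Print the gateway TLS flags that replace --disable-tls, one word per line.
gateway_tls_flags() {
    pki_dir=$1
    printf '%s\n' \
        --tls-cert "$pki_dir/server.crt" \
        --tls-key "$pki_dir/server.key" \
        --tls-client-ca "$pki_dir/ca.crt"
}
```

The same CA file appears twice on purpose: the gateway uses it to verify client certificates, and the sandbox uses it to verify the gateway's server certificate.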

Alternative Approaches Considered

None. There is only one practical implementation path, because the driver code is already written. The only question is how to generate the PKI in the test harness, and the Docker e2e pattern is directly reusable.

Patterns to Follow

The Docker e2e harness (e2e/with-docker-gateway.sh) is the exact pattern to follow:

  • Ephemeral PKI generation with openssl (CA → server cert → client cert)
  • SANs covering the host aliases the supervisor will connect through
  • TOML config with guest_tls_{ca,cert,key} paths
  • Gateway started with --tls-cert, --tls-key, --tls-client-ca
  • CLI registered via e2e_register_mtls_gateway
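
The ephemeral PKI step could be sketched as below. This is not copied from with-docker-gateway.sh: generate_e2e_pki is a hypothetical helper, and the file names, subject names, key size, one-day validity, and the extra localhost and 127.0.0.1 SANs are all assumptions. The two host-alias SANs are the ones this issue calls for.

```shell
#!/bin/sh
# Hypothetical helper: creates a throwaway CA, server cert, and client cert.
generate_e2e_pki() {
    pki_dir=$1
    mkdir -p "$pki_dir" || return 1

    # Minimal config so the sketch does not depend on a system openssl.cnf.
    cat >"$pki_dir/openssl.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
CN = openshell-e2e-ca
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,keyCertSign,cRLSign
EOF

    # Server cert extensions: SANs for the aliases the supervisor dials.
    cat >"$pki_dir/server.ext" <<'EOF'
subjectAltName = DNS:host.containers.internal,DNS:host.openshell.internal,DNS:localhost,IP:127.0.0.1
extendedKeyUsage = serverAuth
EOF
    printf 'extendedKeyUsage = clientAuth\n' >"$pki_dir/client.ext"

    # 1. Self-signed CA.
    openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
        -config "$pki_dir/openssl.cnf" \
        -keyout "$pki_dir/ca.key" -out "$pki_dir/ca.crt" 2>/dev/null || return 1

    # 2. Server and client: generate key + CSR, then sign with the CA.
    for name in server client; do
        openssl req -new -newkey rsa:2048 -nodes \
            -config "$pki_dir/openssl.cnf" -subj "/CN=openshell-e2e-$name" \
            -keyout "$pki_dir/$name.key" -out "$pki_dir/$name.csr" \
            2>/dev/null || return 1
        openssl x509 -req -days 1 -in "$pki_dir/$name.csr" \
            -CA "$pki_dir/ca.crt" -CAkey "$pki_dir/ca.key" -CAcreateserial \
            -extfile "$pki_dir/$name.ext" \
            -out "$pki_dir/$name.crt" 2>/dev/null || return 1
    done
}
```

After running it, openssl verify -CAfile ca.crt server.crt client.crt is a quick way to confirm both leaf certificates chain to the CA before starting the gateway.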

Proposed Approach

Mirror the Docker e2e's mTLS setup in the Podman e2e script. Generate ephemeral PKI with openssl, wire the certs into the TOML config and gateway flags, switch to https:// endpoints, and use the existing e2e_register_mtls_gateway helper. The Podman driver code is already complete — this is purely an e2e harness enablement change. The main nuance is ensuring the certificate SANs cover host.containers.internal (Podman's host alias) alongside host.openshell.internal.

Scope Assessment

  • Complexity: Low
  • Confidence: High — the driver code is already implemented and unit-tested; this is strictly e2e harness wiring
  • Estimated files to change: 1 (e2e/with-podman-gateway.sh)
  • Issue type: feat

Risks & Open Questions

  • Rootless Podman host-gateway routing with TLS: The server certificate SANs must cover both host.containers.internal and host.openshell.internal. This is a minor openssl.cnf adjustment in the e2e script, matching how Docker handles host.docker.internal.
  • macOS Podman machine: On macOS, Podman runs inside a VM. The host-gateway alias routes through the VM. TLS hostname verification depends on the SAN matching host.openshell.internal. This should work identically to Docker Desktop since both use host aliases.
  • File accessibility in rootless Podman: Cert files need to be readable by the Podman user. In rootless mode, the e2e script runs as the same user creating the certs, so they should be accessible. The :ro,rbind mount options and optional :z SELinux relabeling (already implemented in container.rs) handle the rest.
  • Health check endpoint: Need to confirm whether the gateway health endpoint also switches to HTTPS when TLS is enabled, or if it stays on a separate plaintext port. The Docker e2e pattern should clarify this.
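
Two of these risks lend themselves to small preflight checks in the harness. The sketch below is hypothetical throughout: certs_readable and health_probe_cmd are illustrative helpers, the /healthz path is a placeholder for whatever the gateway actually serves, and the file names are assumptions. Keeping the scheme as a parameter leaves the open question about the health port unresolved on purpose.

```shell
#!/bin/sh
# Hypothetical helpers for the harness; names and paths are placeholders.

# Fail early if any cert or key is not readable by the current user, which
# is the user rootless Podman runs as in the e2e script.
certs_readable() {
    for f in "$@"; do
        [ -r "$f" ] || { echo "not readable: $f" >&2; return 1; }
    done
}

# Print the curl command for the health probe. With https, curl verifies the
# gateway against the ephemeral CA and presents the client certificate.
health_probe_cmd() {
    scheme=$1 host_port=$2 pki_dir=$3
    case $scheme in
        https)
            printf '%s\n' "curl --fail --silent --cacert $pki_dir/ca.crt --cert $pki_dir/client.crt --key $pki_dir/client.key https://$host_port/healthz"
            ;;
        http)
            printf '%s\n' "curl --fail --silent http://$host_port/healthz"
            ;;
        *)
            echo "unsupported scheme: $scheme" >&2
            return 1
            ;;
    esac
}
```

If the health endpoint turns out to stay on a separate plaintext port, the http branch is the one the harness would keep using.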

Test Considerations

  • Primary validation: Run the full Podman e2e suite (mise run e2e with Podman driver) with mTLS enabled. The existing shared e2e tests should pass over encrypted connections without modification.
  • Test levels: e2e only — the unit tests for TLS mount/env injection already exist and pass (container.rs:1055-1139).
  • No new test infrastructure needed: The e2e_register_mtls_gateway helper and openssl cert generation pattern from the Docker script are directly reusable.
  • CI validation: The rust-podman CI suite in e2e-test.yml will exercise the mTLS path once the harness change lands.

Created by spike investigation. Use build-from-issue to plan and implement.

Labels: state:triage-needed (Opened without agent diagnostics and needs triage)
