feat(policy): attested policy delivery over openshell.policy.v1alpha1.Engine wire #1669

Closed
dvavili wants to merge 4 commits into NVIDIA:main from dvavili:dvavili/attested-policy-provider

dvavili commented Jun 1, 2026

Summary

Adds AttestedPolicyProvider driver that fetches per-sandbox policy from an out-of-process attested engine over a new OpenShell-owned wire contract.

Stacks on #1668 — includes the PolicyProvider trait + permits_mutation gate from that PR; the new commits here are the two on top.

  • Wire openshell.policy.v1alpha1.Engine (proto/policy.proto): four RPCs (Health, AcquireHandle, GetProjection, ReleaseHandle). Owned by OpenShell; any conforming engine is valid. Enforcement-shaped subset — no per-request Authorize, no GetPolicyDigest (the gateway enforces locally via OPA + Landlock + seccomp + nft once it has the projection).
  • Driver AttestedPolicyProvider: configured via [openshell.policy] type = "attested" + source_uds_path + trust_store_path. Mutators inherit Unsupported defaults, so the permits_mutation gate plus the three RPC mutators all refuse uniformly under the attested type.
  • Abstraction: driver speaks a PolicySource trait; GrpcPolicySource (the production impl) is the only file that touches the generated proto types. Driver is unit-testable with a MockPolicySource — no tonic server stood up in tests.
  • Trust store: multi-key Ed25519 JSON loader; distinct error variant per malformed shape (missing, unreadable, malformed JSON, zero keys, duplicate id, empty id, empty/malformed PEM).
  • Cleanup: the PolicyProviderRegistry from #1668 is replaced with a direct match in the resolver. The registry was paying a complexity cost for only two providers with divergent construction shapes (sync Store vs. async UDS + trust-store load).
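The PolicySource seam described in the Abstraction bullet could look roughly like the sketch below. It is illustrative only: the real trait is presumably async and returns generated proto types through GrpcPolicySource, whereas this version is synchronous and std-only. `Handle`, `Envelope`, `SourceError`, the `signature` and `body` fields, and the call-recording helpers are assumptions; only the four method names mirror the RPCs listed above.

```rust
use std::cell::RefCell;

/// Opaque per-sandbox handle issued by the engine (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Handle(pub String);

/// Projection envelope; `signature` and `body` are assumed field names.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Envelope {
    pub signing_key_id: String,
    pub signature: Vec<u8>,
    pub body: Vec<u8>,
}

/// Hypothetical error type for the sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SourceError(pub String);

/// One method per RPC on the Engine wire (sync here for brevity).
pub trait PolicySource {
    fn health(&self) -> Result<(), SourceError>;
    fn acquire_handle(&self, runtime_context: &str) -> Result<Handle, SourceError>;
    fn get_projection(&self, handle: &Handle, name: &str) -> Result<Envelope, SourceError>;
    fn release_handle(&self, handle: &Handle) -> Result<(), SourceError>;
}

/// Test double: returns a canned envelope and records every call.
pub struct MockPolicySource {
    envelope: Envelope,
    calls: RefCell<Vec<String>>,
}

impl MockPolicySource {
    pub fn new(envelope: Envelope) -> Self {
        Self { envelope, calls: RefCell::new(Vec::new()) }
    }

    /// Snapshot of the calls seen so far, in order.
    pub fn calls(&self) -> Vec<String> {
        self.calls.borrow().clone()
    }

    fn record(&self, call: String) {
        self.calls.borrow_mut().push(call);
    }
}

impl PolicySource for MockPolicySource {
    fn health(&self) -> Result<(), SourceError> {
        self.record("health".to_string());
        Ok(())
    }

    fn acquire_handle(&self, runtime_context: &str) -> Result<Handle, SourceError> {
        self.record(format!("acquire_handle({runtime_context})"));
        Ok(Handle("h-1".to_string()))
    }

    fn get_projection(&self, handle: &Handle, name: &str) -> Result<Envelope, SourceError> {
        self.record(format!("get_projection({}, {name})", handle.0));
        Ok(self.envelope.clone())
    }

    fn release_handle(&self, handle: &Handle) -> Result<(), SourceError> {
        self.record(format!("release_handle({})", handle.0));
        Ok(())
    }
}
```

Because the driver only sees the trait, a test can assert on the recorded call order without standing up a tonic server.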

Admission flow: build RuntimeContext → acquire_handle → get_projection("openshell.sandbox.v1") → verify signature against trust store by signing_key_id → decode body → release_handle. Verification fails closed on unknown key id or bad signature; unsigned envelopes admit with a warning (v0 fallback that auto-disables once the engine starts signing). Handle release runs on every exit path so engine-side state never leaks.
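The "release on every exit path" property of the admission flow can be sketched as follows. This is a minimal, synchronous illustration under stated assumptions, not the driver's actual code: the `Engine` trait, `AdmitError`, `FakeEngine`, the `u64` handle, and the closure-based `verify` hook are all hypothetical, and the body is decoded as UTF-8 purely for demonstration.

```rust
use std::cell::RefCell;

/// Hypothetical error type covering the fallible admission steps.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AdmitError {
    Acquire,
    Projection,
    Verify,
    Decode,
}

/// Minimal stand-in for the policy source used during admission.
pub trait Engine {
    fn acquire_handle(&self) -> Result<u64, AdmitError>;
    fn get_projection(&self, handle: u64, name: &str) -> Result<Vec<u8>, AdmitError>;
    fn release_handle(&self, handle: u64);
}

/// acquire -> get_projection -> verify -> decode -> release.
/// The fallible middle steps run inside a closure so that the release
/// call below executes whether they succeed or return early.
pub fn admit<E: Engine>(
    engine: &E,
    verify: impl Fn(&[u8]) -> bool,
) -> Result<String, AdmitError> {
    let handle = engine.acquire_handle()?;
    let outcome = (|| -> Result<String, AdmitError> {
        let body = engine.get_projection(handle, "openshell.sandbox.v1")?;
        if !verify(&body) {
            return Err(AdmitError::Verify);
        }
        String::from_utf8(body).map_err(|_| AdmitError::Decode)
    })();
    // Runs on success and on every error path above.
    engine.release_handle(handle);
    outcome
}

/// Fake engine that records which handles were released.
pub struct FakeEngine {
    pub projection: Result<Vec<u8>, AdmitError>,
    pub released: RefCell<Vec<u64>>,
}

impl Engine for FakeEngine {
    fn acquire_handle(&self) -> Result<u64, AdmitError> {
        Ok(7)
    }

    fn get_projection(&self, _handle: u64, _name: &str) -> Result<Vec<u8>, AdmitError> {
        self.projection.clone()
    }

    fn release_handle(&self, handle: u64) {
        self.released.borrow_mut().push(handle);
    }
}
```

A `Drop`-based guard would additionally cover panics; the closure form is used here only because it keeps the control flow visible in a few lines.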

Test plan

  • cargo test -p openshell-server --lib -- --test-threads=1 — 763 passing (29 new: 12 trust-store loader, 4 source, 9 driver via MockPolicySource, 4 resolver/config wiring)
  • cargo build -p openshell-server clean
  • End-to-end against a real engine pending the server-side adapter that serves the Engine surface

Deferred follow-ups

Auth-mode gate (LocalDev rejection), audit tagging on admission/lifecycle events, gateway-side handle + signing-key persistence across restarts, release_handle on sandbox deletion (driver releases immediately today), removing the shadow seam at rpv_shadow.rs.

dvavili added 4 commits June 1, 2026 11:17

Promote policy delivery to a pluggable subsystem in the gateway. The
gRPC handlers route sandbox-scoped set, merge-ops, and global delete
through a registered PolicyProvider; the existing in-tree store is
exposed as LocalPolicyProvider, preserving today's behavior.

- PolicyProvider trait: id, get_effective_policy, plus three mutators
  (set_policy, update_policy, delete_policy) with default impls
  returning PolicyError::Unsupported { policy_type, operation }.
- LocalPolicyProvider wraps Store. set_policy fully delegates;
  update_policy and delete_policy gate the operation through the
  trait while the handler retains the merge-with-retry loop and the
  global-settings-map mutation respectively (asymmetry documented in
  the trait module doc).
- [openshell.policy] type = "local" | "attested" added to the config
  file, aligning with the existing `type`-selector convention used by
  ProviderPlugin. "attested" is accepted as a known-but-not-yet-
  available value (clear startup error rather than panic or unknown-
  type error).
- PolicyError::Unsupported maps to tonic::Status::unimplemented, so
  any provider that does not override the three mutators (e.g. the
  forthcoming AttestedPolicyProvider) refuses set/update/delete
  without per-handler conditionals.
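The default-refusal design in the bullets above can be sketched like this. It is a simplified, synchronous illustration: the method signatures, the `Policy` payload type, the `unsupported` helper, and `ReadOnlyProvider` are assumptions, while the method names and the `PolicyError::Unsupported { policy_type, operation }` shape come from the commit message.

```rust
/// Only the variant named in the commit message is sketched here.
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyError {
    Unsupported { policy_type: String, operation: &'static str },
}

/// Hypothetical stand-in for the real policy payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Policy(pub String);

pub trait PolicyProvider {
    /// Policy-type id, e.g. "local" or "attested".
    fn id(&self) -> &str;

    fn get_effective_policy(&self, sandbox_id: &str) -> Result<Policy, PolicyError>;

    // The three mutators default to Unsupported, so a read-only provider
    // refuses them without overriding anything.
    fn set_policy(&mut self, _sandbox_id: &str, _policy: Policy) -> Result<(), PolicyError> {
        Err(self.unsupported("set_policy"))
    }

    fn update_policy(&mut self, _sandbox_id: &str, _policy: Policy) -> Result<(), PolicyError> {
        Err(self.unsupported("update_policy"))
    }

    fn delete_policy(&mut self, _sandbox_id: &str) -> Result<(), PolicyError> {
        Err(self.unsupported("delete_policy"))
    }

    /// Helper introduced for this sketch only.
    fn unsupported(&self, operation: &'static str) -> PolicyError {
        PolicyError::Unsupported { policy_type: self.id().to_string(), operation }
    }
}

/// A provider that implements only the read path.
pub struct ReadOnlyProvider;

impl PolicyProvider for ReadOnlyProvider {
    fn id(&self) -> &str {
        "attested"
    }

    fn get_effective_policy(&self, _sandbox_id: &str) -> Result<Policy, PolicyError> {
        Ok(Policy("deny-all".to_string()))
    }
}
```

With this shape, the handler needs a single error-to-status mapping rather than a conditional per provider type.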

Covered by 12 new unit tests (policy_provider trait defaults + Local
flows over a fresh sqlite store) and 4 new handler-level integration
tests (Local set succeeds; refusing provider yields unimplemented on
set, merge-ops, and global delete).

The agentic approval loop introduced in NVIDIA#1528 added seven chunk-mutation
RPCs (submit_policy_analysis,
approve/reject/approve_all/edit/undo/clear_draft_chunk) that mutate policy
state through the draft-chunk store
without going through the three set/update/delete mutators. The
PolicyProvider trait gated only the latter, leaving the chunk surface
as a bypass for any read-only driver.

Add `permits_mutation()` to the trait with a default `Unsupported`
return; override in LocalPolicyProvider to `Ok(())`. Thread the gate
through every mutation entry point in grpc/policy.rs as the first
post-authz action, so a refusing provider fails fast without
touching the store:

- handle_submit_policy_analysis
- handle_approve_draft_chunk
- handle_reject_draft_chunk
- handle_approve_all_draft_chunks
- handle_edit_draft_chunk
- handle_undo_draft_chunk
- handle_clear_draft_chunks
- handle_update_config sandbox-set, sandbox-merge, global-delete arms
- handle_update_config global-set inline arm (separate from the
  per-op trait path)

The per-op set_policy/update_policy/delete_policy provider calls
remain in place as the natural seam for a future driver that wants
to override one mutation without refusing all of them; permits_mutation
is the coarse gate that runs first.
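A gate-first handler might look like the sketch below. The `DraftChunkStore` type, the handler signature, and `RefusingProvider` are hypothetical; the default-refuse `permits_mutation`, the `Ok(())` override for the local provider, and the `mutation` operation name follow the commit message.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyError {
    Unsupported { policy_type: &'static str, operation: &'static str },
}

pub trait PolicyProvider {
    fn id(&self) -> &'static str;

    /// Coarse gate: refuses unless a provider opts in.
    fn permits_mutation(&self) -> Result<(), PolicyError> {
        Err(PolicyError::Unsupported { policy_type: self.id(), operation: "mutation" })
    }
}

pub struct LocalProvider;

impl PolicyProvider for LocalProvider {
    fn id(&self) -> &'static str {
        "local"
    }

    fn permits_mutation(&self) -> Result<(), PolicyError> {
        Ok(())
    }
}

/// Inherits the refusing default.
pub struct RefusingProvider;

impl PolicyProvider for RefusingProvider {
    fn id(&self) -> &'static str {
        "attested"
    }
}

/// Hypothetical in-memory stand-in for the draft-chunk store.
#[derive(Default)]
pub struct DraftChunkStore {
    pub approved: Vec<String>,
}

/// Gate first, store second: a refusal returns before any state changes.
pub fn handle_approve_draft_chunk(
    provider: &dyn PolicyProvider,
    store: &mut DraftChunkStore,
    chunk_id: &str,
) -> Result<(), PolicyError> {
    // Authorization would already have run at this point.
    provider.permits_mutation()?;
    store.approved.push(chunk_id.to_string());
    Ok(())
}
```

The refusing case leaves the store untouched, which is the property the seven chunk-handler tests are described as asserting.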

Tests: 7 new chunk-handler refusing_provider integration tests (each
asserts the draft-chunk store is unchanged after the refused call),
plus permits_mutation_is_ok_for_local_provider and
stub_provider_permits_mutation_returns_unsupported. 3 pre-existing
refusing_provider tests updated to expect `mutation` instead of the
per-op operation name.

handle_report_policy_status is intentionally not gated — it is the
sandbox-side load-result pong, not a control-plane mutation; gating
would break legitimate sandbox status reports under attested mode.

…solver

Phase A modeled PolicyProviderRegistry after openshell-providers'
ProviderRegistry to "promote policy to a first-class subsystem" on par
with AI providers. The trait + types half of that mirror earned its
keep. The registry half did not.

ProviderRegistry is valuable when (a) there are many implementations
(~10+) so name-based dispatch is the main concern, (b) all impls share
a uniform construction shape (same config record), and (c) no impl
needs external resources or async setup at construction. Policy
providers fail all three: two impls only (local, attested), divergent
construction shapes (LocalPolicyProvider::new takes a sync Store; the
forthcoming AttestedPolicyProvider needs file-loaded trust store + UDS
dial + async health round-trip), and the attested case requires
external resources at construction.

The forced abstraction had already manifested as a special-case branch
in resolve_policy_provider that fell through the registry lookup for
the `attested` type, with a defensive comment explaining why the
registry only held one entry. That comment was the tell.

Replace with a direct `match policy_type { ... }` in
resolve_policy_provider. The `attested` arm still returns the same
"policy type 'attested' is not yet available" startup error pending
its implementation. Trait, error types, contexts, and policy-type-id
constants stay. Mirror the trait, not the registry.
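As a rough sketch of the direct match at this commit (before the attested driver lands), assuming a string-typed selector: `ResolvedProvider`, `ResolveError`, and the unknown-type message are hypothetical, while the `attested` error text is quoted from the commit message.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    NotYetAvailable(String),
    UnknownType(String),
}

/// Stand-in for the constructed provider; only `local` exists at this commit.
#[derive(Debug, PartialEq, Eq)]
pub enum ResolvedProvider {
    Local,
}

/// Direct dispatch on the configured policy type, with no registry lookup.
pub fn resolve_policy_provider(policy_type: &str) -> Result<ResolvedProvider, ResolveError> {
    match policy_type {
        "local" => Ok(ResolvedProvider::Local),
        // Known value, but its driver is not implemented yet.
        "attested" => Err(ResolveError::NotYetAvailable(
            "policy type 'attested' is not yet available".to_string(),
        )),
        other => Err(ResolveError::UnknownType(format!("unknown policy type '{other}'"))),
    }
}
```

Each arm is free to construct its provider differently, which is the point of dropping the registry: the attested arm can later perform its own trust-store load and UDS dial without a shared constructor signature.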

Tests: registry_lookup_returns_registered_provider removed with the
registry. Pre-existing 735 lib tests minus that one = 734 pass
single-threaded.

Define an out-of-process policy delivery contract owned by OpenShell
and a driver that consumes it. With `[openshell.policy] type =
"attested"` plus a UDS path and a multi-key Ed25519 trust store, the
gateway fetches per-sandbox policy from whichever process serves the
wire; mutation RPCs and the chunk-approval surface continue to refuse
via the trait defaults and the `permits_mutation` gate.

Wire: `openshell.policy.v1alpha1.Engine` (Health, AcquireHandle,
GetProjection, ReleaseHandle). The driver consumes a `PolicySource`
trait; `GrpcPolicySource` is the only file touching proto types so the
driver stays implementation-agnostic and unit-testable with a mock
source. Multi-key trust store rejects each malformed shape with a
distinct error variant.
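The "distinct error variant per malformed shape" idea can be illustrated on already-parsed entries. This sketch skips file I/O, JSON parsing, and real Ed25519 key decoding, so the missing, unreadable, and malformed-JSON cases are not shown. `KeyEntry`, its field names, the variant payloads, and the PEM marker check are assumptions standing in for the real loader.

```rust
use std::collections::HashSet;

/// One variant per malformed shape, so callers can tell failures apart.
#[derive(Debug, PartialEq, Eq)]
pub enum TrustStoreError {
    ZeroKeys,
    EmptyId { index: usize },
    DuplicateId(String),
    EmptyPem(String),
    MalformedPem(String),
}

/// Hypothetical parsed trust-store entry.
pub struct KeyEntry {
    pub id: String,
    pub public_key_pem: String,
}

pub fn validate_entries(entries: &[KeyEntry]) -> Result<(), TrustStoreError> {
    if entries.is_empty() {
        return Err(TrustStoreError::ZeroKeys);
    }
    let mut seen = HashSet::new();
    for (index, entry) in entries.iter().enumerate() {
        if entry.id.is_empty() {
            return Err(TrustStoreError::EmptyId { index });
        }
        // `insert` returns false when the id was already present.
        if !seen.insert(entry.id.as_str()) {
            return Err(TrustStoreError::DuplicateId(entry.id.clone()));
        }
        let pem = entry.public_key_pem.trim();
        if pem.is_empty() {
            return Err(TrustStoreError::EmptyPem(entry.id.clone()));
        }
        // Shape check only; a real loader would decode the key material.
        if !pem.starts_with("-----BEGIN PUBLIC KEY-----")
            || !pem.ends_with("-----END PUBLIC KEY-----")
        {
            return Err(TrustStoreError::MalformedPem(entry.id.clone()));
        }
    }
    Ok(())
}
```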

Envelope signature verification flips closed automatically when the
wire starts emitting signed envelopes; empty signatures admit with a
warning today. The gateway's runtime-context signing key is currently
per-process; persisting it, together with handle persistence, is left
to the follow-up.
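The verification decision can be sketched as a small function. Everything here is illustrative: `Verdict`, `VerifyError`, the generic key type, and the injected `verify` closure are assumptions, and the sketch does not perform real Ed25519 verification. The order of checks (empty signature first, then key lookup) is also an assumption about how the v0 fallback is applied.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Verified,
    /// v0 fallback: the envelope carried no signature.
    AdmittedUnsigned,
}

#[derive(Debug, PartialEq, Eq)]
pub enum VerifyError {
    UnknownKeyId(String),
    BadSignature,
}

/// Decide whether to admit an envelope. `verify(key, body, signature)` is
/// injected so the sketch stays free of any crypto dependency.
pub fn check_envelope<K>(
    trust_store: &HashMap<String, K>,
    signing_key_id: &str,
    signature: &[u8],
    body: &[u8],
    verify: impl Fn(&K, &[u8], &[u8]) -> bool,
) -> Result<Verdict, VerifyError> {
    if signature.is_empty() {
        // Unsigned envelopes are admitted, but loudly.
        eprintln!("warning: admitting unsigned policy envelope");
        return Ok(Verdict::AdmittedUnsigned);
    }
    // From here on every failure rejects: fail closed.
    let key = trust_store
        .get(signing_key_id)
        .ok_or_else(|| VerifyError::UnknownKeyId(signing_key_id.to_string()))?;
    if verify(key, body, signature) {
        Ok(Verdict::Verified)
    } else {
        Err(VerifyError::BadSignature)
    }
}
```

Once the engine signs every envelope, the first branch is never taken and the function behaves as strict verification without a configuration change.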

Tests: 29 new (12 trust-store loader, 4 source, 9 attested via mock,
4 resolver/config wiring). 763 lib tests pass single-threaded.

Deferred: auth-mode gate (LocalDev rejection), audit tagging, handle
persistence, releasing handles on sandbox deletion, removing the
shadow seam at rpv_shadow.rs.

copy-pr-bot Bot commented Jun 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions Bot commented Jun 1, 2026

Thank you for your interest in contributing to OpenShell, @dvavili.

This project uses a vouch system for first-time contributors. Before submitting a pull request, you need to be vouched by a maintainer.

To get vouched:

  1. Open a Vouch Request discussion.
  2. Describe what you want to change and why.
  3. Write in your own words — do not have an AI generate the request.
  4. A maintainer will comment /vouch if approved.
  5. Once vouched, open a new PR (preferred) or reopen this one after a few minutes.

See CONTRIBUTING.md for details.

github-actions Bot closed this Jun 1, 2026

github-actions Bot commented Jun 1, 2026

Thank you for your submission! We ask that you sign our Developer Certificate of Origin before we can accept your contribution. You can sign the DCO by adding a comment below using this text:


I have read the DCO document and I hereby sign the DCO.


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the DCO Assistant Lite bot.
