feat(providers): add credential refresh foundation#1349

Open
johntmyers wants to merge 14 commits into main from feat/1306-provider-credential-refresh/johntmyers

@johntmyers
Collaborator

@johntmyers johntmyers commented May 13, 2026

Summary

Adds the first provider credential refresh implementation slice for #1306. This PR wires refresh metadata through profiles, providers, gateway APIs, provider environment resolution, and sandbox placeholder resolution, and now includes gateway-owned token minting for OAuth2 refresh-token, OAuth2 client-credentials, and Google service-account JWT credentials.

Related Issue

Refs #1306

Changes

  • Adds provider profile credential refresh metadata and provider credential expiry fields to the proto model.
  • Adds refresh status/configure/rotate/delete RPCs and CLI commands:
    • openshell provider refresh status NAME [--credential-key KEY]
    • openshell provider refresh configure NAME --credential-key KEY --strategy STRATEGY --material KEY=VALUE --secret-material-key KEY --credential-expires-at TIMESTAMP_MS
    • openshell provider refresh rotate NAME --credential-key KEY
    • openshell provider refresh delete NAME --credential-key KEY
    • openshell provider update NAME --credential-expires-at KEY=TIMESTAMP_MS
  • Stores provider refresh runtime state as scoped objects in the existing objects table using scope = provider_id.
  • Adds a gateway refresh worker and immediate provider refresh rotate path that mint short-lived access tokens and write the current token back to the provider record.
  • Adds secret-safe gateway logs for refresh worker startup, sweep summaries, each watched provider credential, due/rotation status, next refresh timing, refresh starts, refresh completions, and refresh failures.
  • Supports gateway-owned minting for:
    • OAuth2 refresh-token exchange, including rotated refresh tokens.
    • Microsoft/O365-style OAuth2 client credentials.
    • Google service-account JWT exchange, including optional delegated subject.
  • Keeps token endpoints profile-owned: refresh material cannot override token_url/token_uri, and profile-required refresh material is validated at configure time.
  • Enforces provider profile max_lifetime_seconds as a cap on minted token lifetime even when the provider token endpoint returns a longer expires_in.
  • Adds built-in refresh-backed provider profiles for outlook and google-drive.
  • Allows creating provider records without an initial static credential only when all required profile credentials are gateway-mintable refresh credentials.
  • Skips expired credentials during gateway provider env resolution and carries expiry metadata through to sandbox credential snapshots.
  • Makes sandbox placeholder resolution fail closed for expired retained provider credential generations.
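
The max_lifetime_seconds cap described in the list above can be illustrated with a minimal sketch. The function names and the choice to model the cap as an Option are assumptions made for illustration, not the PR's actual code:

```rust
/// Lifetime (in seconds) to record for a freshly minted token.
/// `endpoint_expires_in` is the `expires_in` value returned by the token
/// endpoint; `max_lifetime_seconds` is the profile cap. Modelling the cap as
/// optional is an assumption of this sketch.
pub fn effective_lifetime_seconds(
    endpoint_expires_in: u64,
    max_lifetime_seconds: Option<u64>,
) -> u64 {
    match max_lifetime_seconds {
        // The shorter of the two values wins, so the endpoint can never
        // extend a token past the profile cap.
        Some(cap) => endpoint_expires_in.min(cap),
        None => endpoint_expires_in,
    }
}

/// Absolute expiry in Unix milliseconds, matching the TIMESTAMP_MS unit
/// used by the CLI flags. Saturating arithmetic avoids overflow panics.
pub fn expires_at_ms(
    now_ms: u64,
    endpoint_expires_in: u64,
    max_lifetime_seconds: Option<u64>,
) -> u64 {
    let lifetime = effective_lifetime_seconds(endpoint_expires_in, max_lifetime_seconds);
    now_ms.saturating_add(lifetime.saturating_mul(1000))
}
```

With a 900-second cap, an endpoint that returns expires_in = 3600 yields a 900-second lifetime, while a 600-second response is left unchanged.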

UX Changes

Static credential flows continue to work. Users can still create providers with injected current credentials and optionally annotate expiration timestamps:

openshell provider update github-work \
  --credential-expires-at GITHUB_TOKEN=1767225600000
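
The millisecond timestamp in this command is what expiry checks compare against. A rough sketch of how environment resolution might skip expired credentials follows. The Credential struct, the function names, and the choice to treat a credential as expired at exactly its expiry instant are assumptions of this sketch:

```rust
use std::collections::BTreeMap;

/// A credential value with an optional expiry in Unix milliseconds.
/// Hypothetical shape; the real provider record lives in the proto model.
#[derive(Debug, Clone, PartialEq)]
pub struct Credential {
    pub value: String,
    pub expires_at_ms: Option<u64>,
}

/// True when the credential has an expiry at or before `now_ms`.
/// Credentials without an expiry never expire.
pub fn is_expired(cred: &Credential, now_ms: u64) -> bool {
    matches!(cred.expires_at_ms, Some(t) if t <= now_ms)
}

/// Builds the injectable environment map, leaving out expired credentials
/// instead of injecting a stale value.
pub fn resolve_env(
    creds: &BTreeMap<String, Credential>,
    now_ms: u64,
) -> BTreeMap<String, String> {
    creds
        .iter()
        .filter(|(_, cred)| !is_expired(cred, now_ms))
        .map(|(key, cred)| (key.clone(), cred.value.clone()))
        .collect()
}
```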

Refresh-backed providers can now be created before a current access token exists when the profile's required credentials are gateway-mintable. Profiles that still require a static credential continue to require that credential at create time.

OAuth2 refresh-token providers can be configured with refresh material and then rotated immediately:

openshell provider create --name microsoft-work --type outlook

openshell provider refresh configure microsoft-work \
  --credential-key MS_GRAPH_ACCESS_TOKEN \
  --strategy oauth2_refresh_token \
  --material client_id="$MS_CLIENT_ID" \
  --material refresh_token="$MS_REFRESH_TOKEN" \
  --secret-material-key refresh_token

openshell provider refresh rotate microsoft-work \
  --credential-key MS_GRAPH_ACCESS_TOKEN

OAuth2 client-credentials providers use the same flow:

openshell provider refresh configure microsoft-work \
  --credential-key MS_GRAPH_ACCESS_TOKEN \
  --strategy oauth2_client_credentials \
  --material tenant_id="$MS_TENANT_ID" \
  --material client_id="$MS_CLIENT_ID" \
  --material client_secret="$MS_CLIENT_SECRET" \
  --secret-material-key client_secret

Google Drive service-account refresh follows the same provider lifecycle:

openshell provider create --name drive-work --type google-drive

openshell provider refresh configure drive-work \
  --credential-key GOOGLE_DRIVE_ACCESS_TOKEN \
  --strategy google_service_account_jwt \
  --material client_email="$GOOGLE_CLIENT_EMAIL" \
  --material private_key="$GOOGLE_PRIVATE_KEY" \
  --secret-material-key private_key

openshell provider refresh rotate drive-work \
  --credential-key GOOGLE_DRIVE_ACCESS_TOKEN

Refresh status now exposes operational state without printing secrets:

openshell provider refresh status microsoft-work
PROVIDER                CREDENTIAL_KEY                STRATEGY                     STATUS              EXPIRES_AT            NEXT_REFRESH          LAST_REFRESH          LAST_ERROR
microsoft-work          MS_GRAPH_ACCESS_TOKEN         oauth2_refresh_token         ready               2026-01-01 00:00:00   2025-12-31 23:50:00   2025-12-31 23:00:00   -

Empty status output now distinguishes whole-provider checks from single-credential checks:

No refresh configurations found for provider 'microsoft-work'.
No refresh configuration found for provider 'microsoft-work' credential 'MS_GRAPH_ACCESS_TOKEN'.
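
One way to produce the two messages above is to branch on whether a credential key was supplied. This is a sketch only; the function name is hypothetical, and the message text is copied from the output shown above:

```rust
/// Formats the empty-result message for `provider refresh status`.
/// `credential_key` is `Some` when `--credential-key` was passed.
pub fn empty_status_message(provider: &str, credential_key: Option<&str>) -> String {
    match credential_key {
        // Single-credential check: singular "configuration".
        Some(key) => format!(
            "No refresh configuration found for provider '{provider}' credential '{key}'."
        ),
        // Whole-provider check: plural "configurations".
        None => format!("No refresh configurations found for provider '{provider}'."),
    }
}
```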

openshell provider refresh delete NAME --credential-key KEY removes the refresh state. It clears credential_expires_at_ms only when that expiry was owned by the refresh state; manually set expiry from openshell provider update --credential-expires-at is preserved.
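
The ownership rule for delete can be sketched as follows. How the PR actually tracks ownership is not stated here; this sketch assumes the refresh state remembers the expiry it last wrote and treats a matching provider expiry as refresh-owned. The function and parameter names are hypothetical:

```rust
/// Returns the provider's expiry for a credential after its refresh state
/// is deleted.
///
/// Assumption of this sketch: the expiry is refresh-owned when it equals
/// the value the refresh state last wrote. A different value is taken to
/// have been set manually and is preserved.
pub fn expiry_after_delete(
    provider_expires_at_ms: Option<u64>,
    refresh_owned_expires_at_ms: Option<u64>,
) -> Option<u64> {
    match (provider_expires_at_ms, refresh_owned_expires_at_ms) {
        // Refresh wrote this expiry, so it goes away with the refresh state.
        (Some(current), Some(owned)) if current == owned => None,
        // Anything else, including a manual override, is left untouched.
        (current, _) => current,
    }
}
```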

When providers_v2_enabled=true, these profiles also contribute provider policy layers and profile-backed credential injection for attached/created sandboxes:

openshell settings set --global --key providers_v2_enabled --value true --yes

openshell sandbox create --name provider-refresh-smoke \
  --provider microsoft-work \
  --provider drive-work \
  -- echo ok

Current Behavior

  • The current provider credential value remains the injectable source of truth.
  • Gateway-owned refresh writes minted short-lived tokens back to the provider record and updates credential expiry metadata.
  • provider refresh configure accepts gateway-mintable strategies only: oauth2_refresh_token, oauth2_client_credentials, and google_service_account_jwt.
  • Static and external refresh-style updates are handled through openshell provider update, not provider refresh configure.
  • provider refresh rotate performs an immediate gateway-managed refresh for supported strategies.
  • The background refresh worker checks persisted refresh state and refreshes due credentials.
  • Gateway logs identify which provider credentials are watched, whether they are due, their status, and seconds until expiry/next refresh without logging token values or refresh material.
  • Token endpoints come from provider profiles only; configure requests cannot inject endpoint URLs through refresh material.
  • Empty provider creation is allowed only when every required credential in the profile can be gateway-minted by refresh.
  • Expired credentials are skipped by gateway provider environment resolution.
  • Existing sandbox placeholder resolution rejects expired retained credential generations.
  • External refresh systems can still push new current credentials through openshell provider update.
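
The rule that token endpoints come from profiles only can be pictured as a configure-time check on the submitted material. The sketch below is illustrative; the function name, the error strings, and the use of a plain string map are assumptions, while the token_url and token_uri key names come from the description above:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Material keys that would redirect the token exchange. Only the provider
/// profile may set these.
const PROFILE_OWNED_KEYS: [&str; 2] = ["token_url", "token_uri"];

/// Validates refresh material at configure time: rejects endpoint
/// overrides, then checks that every profile-required key is present.
pub fn validate_refresh_material(
    material: &BTreeMap<String, String>,
    required: &BTreeSet<String>,
) -> Result<(), String> {
    for key in PROFILE_OWNED_KEYS {
        if material.contains_key(key) {
            return Err(format!(
                "refresh material cannot override profile-owned key '{key}'"
            ));
        }
    }
    for key in required {
        if !material.contains_key(key) {
            return Err(format!("missing required refresh material '{key}'"));
        }
    }
    Ok(())
}
```

Rejecting the override outright, instead of silently ignoring it, keeps a misconfigured request from appearing to succeed.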

Testing

  • RUSTC_WRAPPER= cargo test -p openshell-server -p openshell-cli -p openshell-providers -p openshell-sandbox --no-run
  • RUSTC_WRAPPER= cargo test -p openshell-server provider_refresh -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-providers -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-server provider_validation_errors -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-cli provider_create_allows_empty_credentials_for_gateway_refresh_profiles -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-cli provider_refresh_cli_run_functions_wire_requests -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-cli --lib -j1 refresh_status_table_includes_operational_fields -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-server --lib -j1 configure_provider_refresh_stores_scoped_status_and_provider_expiry -- --nocapture
  • RUSTC_WRAPPER= cargo test -p openshell-server -j1 delete_provider_refresh_preserves_manually_updated_expiry -- --nocapture
  • RUSTC_WRAPPER= cargo clippy -p openshell-server --all-targets -- -D warnings
  • RUSTC_WRAPPER= cargo clippy -p openshell-cli --lib --tests -j1 -- -D warnings
  • RUSTC_WRAPPER= cargo test -p openshell-cli http_health_check_supports_plain_http_endpoints -- --nocapture
  • mise run pre-commit

Checklist

  • Tests pass locally
  • Documentation updated for user-facing behavior
  • No secrets or credentials committed

@copy-pr-bot

copy-pr-bot Bot commented May 13, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@johntmyers johntmyers force-pushed the feat/1306-provider-credential-refresh/johntmyers branch from 043004c to 1b9aea7 Compare May 13, 2026 15:08
@johntmyers johntmyers force-pushed the feat/1306-provider-credential-refresh/johntmyers branch 7 times, most recently from 49554d0 to 67e675d Compare May 18, 2026 17:30
@johntmyers johntmyers marked this pull request as ready for review May 18, 2026 17:30
@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label May 18, 2026
@github-actions

Label test:e2e applied for 67e675d. Open the existing run and click Re-run all jobs to execute with the label set. The E2E Gate check on this PR will flip green automatically once the run finishes.

@TaylorMutch TaylorMutch self-assigned this May 18, 2026
@TaylorMutch
Collaborator

TaylorMutch commented May 18, 2026

(From Codex)

Finding 1/3: New refresh write RPCs need provider admin authorization.

The new RPCs in proto/openshell.proto (ConfigureProviderRefresh, RotateProviderCredential, DeleteProviderRefresh) can store refresh material and rotate provider credentials, but crates/openshell-server/src/auth/authz.rs still only marks create/update/delete provider as admin methods and only maps those three to provider:write. In RBAC mode these new write RPCs fall through to normal user access.

Please add refresh status as provider:read and the refresh writes as admin/provider:write before this lands.
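
A sketch of the requested mapping is below. It does not reflect the real contents of authz.rs. The three write RPC names come from the finding above, the status RPC name GetProviderRefreshStatus is a hypothetical placeholder, and the enum and function names are illustrative:

```rust
/// Permission scopes named in the finding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderScope {
    Read,
    Write,
}

impl ProviderScope {
    pub fn as_str(self) -> &'static str {
        match self {
            ProviderScope::Read => "provider:read",
            ProviderScope::Write => "provider:write",
        }
    }
}

/// Maps a refresh RPC method name to the scope it should require.
/// Returns `None` for methods this sketch does not cover.
pub fn required_scope(method: &str) -> Option<ProviderScope> {
    match method {
        // Hypothetical name for the status RPC; the real name may differ.
        "GetProviderRefreshStatus" => Some(ProviderScope::Read),
        "ConfigureProviderRefresh"
        | "RotateProviderCredential"
        | "DeleteProviderRefresh" => Some(ProviderScope::Write),
        _ => None,
    }
}

/// Write-scoped refresh RPCs are treated as admin methods.
pub fn is_admin_method(method: &str) -> bool {
    required_scope(method) == Some(ProviderScope::Write)
}
```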

@TaylorMutch
Collaborator

TaylorMutch commented May 18, 2026

(From Codex)

Finding 2/3: Refresh-bootstrap providers can bypass credential-key collision checks before the first token is minted.

The uniqueness check only considers currently materialized provider.credentials keys (active_provider_credential_keys in crates/openshell-server/src/grpc/provider.rs). Empty refresh-backed providers have no current credential yet, so two such providers can be attached to the same sandbox and configure the same env key, then only fail later when rotation tries to write the minted token.

Please treat configured refresh credential_key values as reserved active keys for sandbox create/attach and provider refresh configuration validation.
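
The suggested fix amounts to checking collisions against the union of materialized keys and configured refresh keys. The sketch below uses a hypothetical ProviderKeys type and does not mirror active_provider_credential_keys:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Credential keys a provider claims. Hypothetical shape for illustration.
pub struct ProviderKeys {
    pub name: String,
    /// Keys with a current credential value on the provider record.
    pub materialized: BTreeSet<String>,
    /// Keys that only have refresh configuration so far.
    pub refresh_configured: BTreeSet<String>,
}

impl ProviderKeys {
    pub fn new(name: &str, materialized: &[&str], refresh_configured: &[&str]) -> Self {
        Self {
            name: name.to_string(),
            materialized: materialized.iter().map(|k| k.to_string()).collect(),
            refresh_configured: refresh_configured.iter().map(|k| k.to_string()).collect(),
        }
    }

    /// Materialized keys plus keys reserved by refresh configuration.
    pub fn reserved_keys(&self) -> BTreeSet<String> {
        self.materialized
            .union(&self.refresh_configured)
            .cloned()
            .collect()
    }
}

/// Returns the first collision as (key, first owner, second owner).
pub fn find_collision(providers: &[ProviderKeys]) -> Option<(String, String, String)> {
    let mut owners: BTreeMap<String, String> = BTreeMap::new();
    for provider in providers {
        for key in provider.reserved_keys() {
            if let Some(first) = owners.get(&key) {
                return Some((key, first.clone(), provider.name.clone()));
            }
            owners.insert(key, provider.name.clone());
        }
    }
    None
}
```

Because reserved keys are counted before any token exists, two empty refresh-backed providers that target the same env key would be rejected at attach time instead of at first rotation.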

@TaylorMutch
Collaborator

TaylorMutch commented May 18, 2026

(From Codex)

Finding 3/3: CLI refresh bootstrap eligibility does not match the server rule.

The CLI helper in crates/openshell-cli/src/run.rs allows empty credentials if any profile credential has client-credentials or Google refresh, and it excludes oauth2_refresh_token. The server rule allows empty credentials only when all required profile credentials are gateway-mintable, including OAuth refresh-token profiles.

That makes delegated refresh-token profiles fail too early in the CLI and mixed static/refresh profiles fail only after the request reaches the server. Please mirror the server predicate in the CLI.
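
The server rule described here reads as an "all required credentials" predicate, not an "any credential" one. The sketch below uses hypothetical types; the three strategy names come from the PR description, and the treatment of a profile with no required credentials is an assumption:

```rust
/// Gateway-mintable refresh strategies listed in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RefreshStrategy {
    Oauth2RefreshToken,
    Oauth2ClientCredentials,
    GoogleServiceAccountJwt,
}

/// One credential declared by a provider profile. Hypothetical shape.
pub struct ProfileCredential {
    pub key: String,
    pub required: bool,
    /// `None` means the credential is static and must be supplied directly.
    pub refresh: Option<RefreshStrategy>,
}

/// True when a provider may be created without any initial credential:
/// every required credential must be gateway-mintable.
///
/// Assumption of this sketch: a profile with no required credentials also
/// returns true, because `all` on an empty iterator is true.
pub fn allows_empty_credentials(credentials: &[ProfileCredential]) -> bool {
    credentials
        .iter()
        .filter(|c| c.required)
        .all(|c| c.refresh.is_some())
}
```

Under this predicate an OAuth refresh-token profile passes and a mixed static/refresh profile fails, which matches the server behaviour the finding describes.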

Comment thread providers/google-drive.yaml Outdated
Comment thread providers/outlook.yaml Outdated
Comment thread crates/openshell-cli/src/main.rs Outdated
TaylorMutch previously approved these changes May 18, 2026
Collaborator

@TaylorMutch TaylorMutch left a comment

Approving; follow-ons after this core PR lands seem good to me.

@maxamillion
Collaborator

+1 what @TaylorMutch said

@johntmyers johntmyers force-pushed the feat/1306-provider-credential-refresh/johntmyers branch from ba2aff0 to 79eae79 Compare May 18, 2026 23:31
Labels

test:e2e Requires end-to-end coverage

3 participants