docs: clarify BYOK + Custom Inference request path and data flow#138

Open: hongyi-chen wants to merge 7 commits into main from hongyichen/clarify-byok-data-flow

hongyi-chen (Collaborator) commented May 26, 2026

Summary

Clarifies the request path and data-flow framing for both BYOK and Custom Inference endpoints in our docs, in response to warpdotdev/warp#11681 and the follow-up triage thread.

The previous wording on both pages combined two claims that aren't equivalent:

  1. Storage: API keys are stored locally (true and unchanged).
  2. Transit: requests are routed "directly" to the model provider (misleading). The Warp Agent harness is server-hosted, so requests do transit Warp's backend; the key is passed in flight with each request and is used to authenticate the provider call from warp-server, not from the client.

Both the recent r/warpdotdev complaint and issue #11681 (Custom Inference) traced back to this framing.

Per Daniel's clarification in the linked thread, the narrow durable claim Warp wants to stand behind is: API keys are never synced or stored on Warp's servers.

Confirmed against warp-server:

  • logic/ai/llm/anthropic/util/util.go:1032-1034 — server overrides the Anthropic SDK API key with the user-provided one per request.
  • logic/ai/llm/user_api_keys/util.go:7 — keys arrive in the request payload as Request_Settings_ApiKeys.
  • logic/ai/llm/llm_role.go:723 — server-side model routing applies BYOK preferences via WithApiKeyConfigApplied.
  • logic/ai/llm/custom_endpoint/client.go:14-21 — the OpenAI-compatible client is constructed server-side with option.WithAPIKey(hostConfig.CustomEndpointAPIKey()) and option.WithBaseURL(...CustomEndpointBaseURL()); both come in on the request.
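
The per-request override described in these references can be pictured with a minimal Go sketch. This is an illustration under stated assumptions, not warp-server code: `ClientConfig`, `RequestSettings`, and `withRequestOverrides` are hypothetical names, and the real server uses SDK options such as `option.WithAPIKey` rather than a plain struct copy.

```go
package main

import "fmt"

// ClientConfig is a hypothetical stand-in for the server's default
// provider client settings (not warp-server's actual type).
type ClientConfig struct {
	APIKey  string
	BaseURL string
}

// RequestSettings is a hypothetical stand-in for the user-provided values
// that arrive on each request payload.
type RequestSettings struct {
	APIKey  string
	BaseURL string
}

// withRequestOverrides returns a copy of base with any user-provided
// values applied. base is passed by value, so the shared server config is
// never mutated and the user's key lives only in the returned copy.
func withRequestOverrides(base ClientConfig, s RequestSettings) ClientConfig {
	cfg := base
	if s.APIKey != "" {
		cfg.APIKey = s.APIKey
	}
	if s.BaseURL != "" {
		cfg.BaseURL = s.BaseURL
	}
	return cfg
}

func main() {
	base := ClientConfig{APIKey: "server-default", BaseURL: "https://provider.example.com/v1"}
	perRequest := withRequestOverrides(base, RequestSettings{APIKey: "sk-user-example"})
	fmt.Println(perRequest.APIKey) // the user's key is used for this call only
	fmt.Println(base.APIKey)       // the shared config is unchanged
}
```

Copying the config per request, rather than mutating a shared client, is one way to keep a user-provided key scoped to a single call.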

Changes

src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx

  • Intro paragraph: dropped "access models directly" wording.
  • How BYOK works section: replaced the single "directly route your agent requests" line with an explicit 3-step data flow (harness assembles request → key authenticates the call in-flight → response streams back), and clarified that keys live in-memory only for the duration of each request.
  • Headline storage claim now uses Daniel's wording: "never synced or stored on Warp's servers".
  • Added a Why does the request route through Warp's backend? note explaining the server-side harness (same runtime as Agent Mode with Warp-billed models).
  • ZDR section: added a sentence noting BYOK request bodies transit Warp's backend but are not retained, used for training, or logged for analytics — same posture as Warp-billed traffic. Scoped the existing "data retention policies depend on..." bullet to be explicit it's about the provider side.
  • Tightened the diagram alt text from "directly through your provider API key" → "authenticates BYOK agent requests with your provider API key".
  • Fixed a stale anchor: the ZDR section linked to #how-does-byok-work but the heading on main is now How BYOK works (slug #how-byok-works).
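
For the stale-anchor fix, the relationship between a heading and its slug can be approximated as below. This is a rough sketch of GitHub-style slugging, not the docs site's actual slugger; `slugify` is a hypothetical helper, and the old heading text in the second call is an assumption inferred from the old anchor.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// slugify approximates GitHub-style heading slugs: lowercase the text,
// keep letters, digits, hyphens, and underscores, turn whitespace into
// hyphens, and drop all other punctuation.
func slugify(heading string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(strings.TrimSpace(heading)) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r) || r == '-' || r == '_':
			b.WriteRune(r)
		case unicode.IsSpace(r):
			b.WriteRune('-')
		}
	}
	return b.String()
}

func main() {
	fmt.Println("#" + slugify("How BYOK works"))      // #how-byok-works
	fmt.Println("#" + slugify("How does BYOK work?")) // #how-does-byok-work
}
```

Because the slug is derived from the heading text, renaming a heading silently invalidates any link that still uses the old anchor.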

src/content/docs/agent-platform/inference/custom-inference-endpoint.mdx

  • Key features: rewrote the "Local configuration" bullet — API keys are stored locally on the device and never synced or stored on Warp's servers.
  • How it works: replaced the blanket "never synced to Warp's servers" wording with an explicit 3-step request flow mirroring the BYOK rewrite.
  • Added a Why does the request route through Warp's backend? callout matching the BYOK page, explicitly cross-linking to BYOK so readers see the consistent posture.
  • ZDR section: added the same "request bodies transit Warp's backend but aren't used for training" framing as BYOK, and scoped the existing retention bullets to the provider side.

Additional context

  • See the implementation plan for the research summary and the breakdown that informed this update.
  • Conversation thread.
  • No structural changes (no new top-level sections, no terminology drift), no sidebar config edits, no redirects required.
  • Internal link checker: 0 broken links (python3 .agents/skills/check_for_broken_links/check_links.py --internal-only → 2702 internal links checked, 0 broken).

Conversation: https://staging.warp.dev/conversation/c3a085dc-4658-47c2-9908-e7f56672872f
Run: https://oz.staging.warp.dev/runs/019e665f-0f94-780b-9764-04bdbd28a24b
Plans:

This PR was generated with Oz.

hongyi-chen and others added 2 commits May 26, 2026 11:05
The BYOK doc previously said keys are "stored locally" (true) and that
Warp "directly routes" requests to the provider (misleading — the Warp
Agent harness is server-hosted, so traffic does transit Warp's backend
while the key is used in-flight per request).

This commit:
- Replaces "directly route" language with an explicit 3-step data flow.
- Adds a "Why does the request route through Warp's backend?" note
  explaining the server-side harness.
- Adds a sentence to the ZDR section noting BYOK request bodies are not
  retained, used for training, or logged for analytics.
- Tightens the diagram alt text and intro paragraph to remove the same
  "directly" ambiguity.

Co-Authored-By: Oz <oz-agent@warp.dev>
…ok-data-flow

# Conflicts:
#	src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx
cla-bot Bot added the cla-signed label May 26, 2026

vercel Bot commented May 26, 2026

The latest updates on your projects:

Project: docs | Deployment: Ready | Updated (UTC): May 26, 2026 11:08pm

oz-for-oss Bot (Contributor) commented May 26, 2026

@hongyi-chen

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

oz-for-oss Bot (Contributor) left a comment

Overview

This PR clarifies the BYOK documentation by explaining that BYOK agent requests transit Warp's backend and that the user-provided provider key is used in flight. The added data-flow description is useful, but one new privacy/retention statement is broader than the existing privacy documentation supports.

Concerns

  • The new ZDR section says Warp does not retain the BYOK request body or log it for analytics, but the privacy documentation describes account-level telemetry and plan settings that can affect AI interaction collection. This should be scoped before merge.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment thread src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx Outdated
…ey.mdx

Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
Replace specific feature list (Codebase Context, Rules, Secret Redaction,
multi-step tool orchestration) with a more general 'Warp's agent harness'
reference. Keeps the explanation accurate without enumerating internals
that may evolve.

Co-Authored-By: Oz <oz-agent@warp.dev>
Issue #11681 reported that the privacy framing on the Custom Inference
endpoint page was misleading: the agent harness is hosted in
warp-server, so traffic does transit Warp's backend even though the
endpoint URL and API key are stored locally on the client.

This commit narrows and corrects the privacy claim on the Custom
Inference endpoint page, mirroring the BYOK rewrite already in this PR:

- Replace the blanket 'never synced to the cloud' wording for endpoint
  URLs with a narrower, accurate claim: API keys are never synced or
  stored on Warp's servers; endpoint URLs and model identifiers may
  appear in Warp's usage telemetry, but API keys never do.
- Add an explicit 3-step request flow (harness assembles -> in-flight
  key authenticates the call -> response streams back) so the
  server-side path is no longer surprising.
- Add a 'Why does the request route through Warp's backend?' callout
  matching the BYOK page.
- Tighten the ZDR section to note that prompts/responses transit Warp's
  backend without being used for training, and scope the existing
  retention bullets to the provider side.

Also align the BYOK headline claim with the same wording ('never synced
or stored on Warp's servers') so both pages converge on a single
phrasing.

Confirmed against warp-server:
- logic/ai/llm/custom_endpoint/client.go:14-21 - the OpenAI-compatible
  client is constructed server-side using
  hostConfig.CustomEndpointAPIKey() and hostConfig.CustomEndpointBaseURL()
  from the request, not from persistent server config.
- logic/ai/llm/user_api_keys/util.go:7 - keys arrive per-request via
  Request_Settings_ApiKeys.

Co-Authored-By: Oz <oz-agent@warp.dev>
hongyi-chen changed the title from "docs: clarify BYOK request path and data flow" to "docs: clarify BYOK + Custom Inference request path and data flow" on May 26, 2026
Per review feedback, simplify the Custom Inference endpoint privacy
framing to a single durable claim — API keys are never synced or stored
on Warp's servers — without adding a separate caveat about endpoint
URL or model identifier telemetry.

Co-Authored-By: Oz <oz-agent@warp.dev>