docs: clarify BYOK + Custom Inference request path and data flow#138
docs: clarify BYOK + Custom Inference request path and data flow#138hongyi-chen wants to merge 7 commits into
Conversation
The BYOK doc previously said keys are "stored locally" (true) and that Warp "directly routes" requests to the provider (misleading — the Warp Agent harness is server-hosted, so traffic does transit Warp's backend while the key is used in-flight per request). This commit: - Replaces "directly route" language with an explicit 3-step data flow. - Adds a "Why does the request route through Warp's backend?" note explaining the server-side harness. - Adds a sentence to the ZDR section noting BYOK request bodies are not retained, used for training, or logged for analytics. - Tightens the diagram alt text and intro paragraph to remove the same "directly" ambiguity. Co-Authored-By: Oz <oz-agent@warp.dev>
…ok-data-flow # Conflicts: # src/content/docs/agent-platform/inference/bring-your-own-api-key.mdx
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
This PR clarifies the BYOK documentation by explaining that BYOK agent requests transit Warp's backend and that the user-provided provider key is used in flight. The added data-flow description is useful, but one new privacy/retention statement is broader than the existing privacy documentation supports.
Concerns
- The new ZDR section says Warp does not retain the BYOK request body or log it for analytics, but the privacy documentation describes account-level telemetry and plan settings that can affect AI interaction collection. This should be scoped before merge.
Verdict
Found: 0 critical, 1 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
…ey.mdx Co-authored-by: oz-for-oss[bot] <277970191+oz-for-oss[bot]@users.noreply.github.com>
Replace specific feature list (Codebase Context, Rules, Secret Redaction, multi-step tool orchestration) with a more general 'Warp's agent harness' reference. Keeps the explanation accurate without enumerating internals that may evolve. Co-Authored-By: Oz <oz-agent@warp.dev>
Issue #11681 reported that the privacy framing on the Custom Inference
endpoint page was misleading: requests are server-hosted through
warp-server, so traffic does transit Warp's backend even though the
endpoint URL and API key are stored locally on the client.
This commit narrows and corrects the privacy claim on the Custom
Inference endpoint page, mirroring the BYOK rewrite already in this PR:
- Replace the blanket 'never synced to the cloud' wording for endpoint
URLs with a narrower, accurate claim: API keys are never synced or
stored on Warp's servers; endpoint URLs and model identifiers may
appear in Warp's usage telemetry, but API keys never do.
- Add an explicit 3-step request flow (harness assembles -> in-flight
key authenticates the call -> response streams back) so the
server-side path is no longer surprising.
- Add a 'Why does the request route through Warp's backend?' callout
matching the BYOK page.
- Tighten the ZDR section to note that prompts/responses transit Warp's
backend without being used for training, and scope the existing
retention bullets to the provider side.
Also align the BYOK headline claim with the same wording ('never synced
or stored on Warp's servers') so both pages converge on a single
phrasing.
Confirmed against warp-server:
- logic/ai/llm/custom_endpoint/client.go:14-21 - the OpenAI-compatible
client is constructed server-side using
hostConfig.CustomEndpointAPIKey() and hostConfig.CustomEndpointBaseURL()
from the request, not from persistent server config.
- logic/ai/llm/user_api_keys/util.go:7 - keys arrive per-request via
Request_Settings_ApiKeys.
Co-Authored-By: Oz <oz-agent@warp.dev>
Per review feedback, simplify the Custom Inference endpoint privacy framing to a single durable claim — API keys are never synced or stored on Warp's servers — without adding a separate caveat about endpoint URL or model identifier telemetry. Co-Authored-By: Oz <oz-agent@warp.dev>
Summary
Clarifies the request path and data-flow framing for both BYOK and Custom Inference endpoints in our docs, in response to warpdotdev/warp#11681 and the follow-up triage thread.
The previous wording on both pages combined two claims that aren't equivalent:
warp-server, not from the client.Both the recent r/warpdotdev complaint and issue #11681 (Custom Inference) traced back to this framing.
Per Daniel's clarification in the linked thread, the narrow durable claim Warp wants to stand behind is: API keys are never synced or stored on Warp's servers.
Confirmed against
warp-server:logic/ai/llm/anthropic/util/util.go:1032-1034— server overrides the Anthropic SDK API key with the user-provided one per request.logic/ai/llm/user_api_keys/util.go:7— keys arrive in the request payload asRequest_Settings_ApiKeys.logic/ai/llm/llm_role.go:723— server-side model routing applies BYOK preferences viaWithApiKeyConfigApplied.logic/ai/llm/custom_endpoint/client.go:14-21— the OpenAI-compatible client is constructed server-side withoption.WithAPIKey(hostConfig.CustomEndpointAPIKey())andoption.WithBaseURL(...CustomEndpointBaseURL()); both come in on the request.Changes
src/content/docs/agent-platform/inference/bring-your-own-api-key.mdxHow BYOK workssection: replaced the single "directly route your agent requests" line with an explicit 3-step data flow (harness assembles request → key authenticates the call in-flight → response streams back), and clarified that keys live in-memory only for the duration of each request.Why does the request route through Warp's backend?note explaining the server-side harness (same runtime as Agent Mode with Warp-billed models).#how-does-byok-workbut the heading onmainis nowHow BYOK works(slug#how-byok-works).src/content/docs/agent-platform/inference/custom-inference-endpoint.mdxKey features: rewrote the "Local configuration" bullet — API keys are stored locally on the device and never synced or stored on Warp's servers.How it works: replaced the blanket "never synced to Warp's servers" wording with an explicit 3-step request flow mirroring the BYOK rewrite.Why does the request route through Warp's backend?callout matching the BYOK page, explicitly cross-linking to BYOK so readers see the consistent posture.Additional context
python3 .agents/skills/check_for_broken_links/check_links.py --internal-only→ 2702 internal links checked, 0 broken).Conversation: https://staging.warp.dev/conversation/c3a085dc-4658-47c2-9908-e7f56672872f
Run: https://oz.staging.warp.dev/runs/019e665f-0f94-780b-9764-04bdbd28a24b
Plans:
This PR was generated with Oz.