feat(knowledge): expose Cohere reranker controls #4429

Merged
waleedlatif1 merged 4 commits into staging from waleedlatif1/check-reranker-logs
May 4, 2026

@waleedlatif1
Collaborator

@waleedlatif1 waleedlatif1 commented May 4, 2026

Summary

  • Add self-hosted Cohere API key field on the knowledge block, mirroring the agent block's hosted-key pattern (getCohereRerankerApiKeyCondition) so BYOK works for self-hosted while staying invisible on hosted
  • Add rerankerInputCount (1–100) advanced field to control how many vector results are sent to the Cohere reranker (defaults to 4× topK, capped at 100)
  • Surface meta.warnings from Cohere /v2/rerank responses via logger.warn
  • All new contract fields are optional + nullable — fully backwards compatible with existing hosted users
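
The pool-size rule in the second bullet can be sketched as follows. This is a minimal illustration, not the code from the PR: `resolveCandidateTopK`, `CandidatePool`, and the two constants are hypothetical names, and the sketch assumes `topK` itself never exceeds 100.

```typescript
// Hypothetical helper: names and return shape are illustrative, not copied from the PR diff.
const MAX_RERANK_INPUT = 100 // ceiling stated in the PR description
const DEFAULT_OVERSAMPLE = 4 // "defaults to 4x topK"

interface CandidatePool {
  candidateTopK: number
  raisedToTopK: boolean // true when the requested count was below topK
}

function resolveCandidateTopK(topK: number, rerankerInputCount?: number | null): CandidatePool {
  // No explicit value: fall back to the automatic 4x oversample, capped at 100.
  if (rerankerInputCount === undefined || rerankerInputCount === null) {
    return {
      candidateTopK: Math.min(topK * DEFAULT_OVERSAMPLE, MAX_RERANK_INPUT),
      raisedToTopK: false,
    }
  }
  // Explicit value: clamp to [1, 100], then never go below topK
  // (assumes topK <= 100, which the PR text does not state explicitly).
  const clamped = Math.min(Math.max(Math.trunc(rerankerInputCount), 1), MAX_RERANK_INPUT)
  const raisedToTopK = clamped < topK
  return { candidateTopK: Math.max(clamped, topK), raisedToTopK }
}
```

Returning `raisedToTopK` alongside the size lets a caller decide whether to log that the user's setting was overridden.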

Test plan

  • Hosted: reranker still works without an API key field appearing
  • Self-hosted: Cohere API key field appears when reranker is enabled
  • rerankerInputCount clamps to [1, 100] and respects the topK floor
  • Cohere meta warnings show up in logs when present
  • bun run check:api-validation passes
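
To make the "Cohere meta warnings show up in logs" check concrete, here is a small sketch of the surfacing step. The response shape (`meta.warnings` as an optional string array) and the `warn` callback signature are assumptions for illustration; the PR's actual `CohereRerankResponse` interface and logger may differ.

```typescript
// Assumed, simplified response shape; only the fields used here are modeled.
interface RerankResponseLike {
  results: Array<{ index: number; relevance_score: number }>
  meta?: { warnings?: string[] }
}

// Stand-in for a logger.warn-style function (hypothetical signature).
type WarnFn = (message: string, context?: Record<string, unknown>) => void

// Emits one warn call per warning and returns how many were emitted.
function surfaceRerankWarnings(response: RerankResponseLike, warn: WarnFn): number {
  const warnings = response.meta?.warnings ?? []
  for (const warning of warnings) {
    warn('Cohere rerank warning', { warning })
  }
  return warnings.length
}
```

Passing the logger in as a parameter keeps the function easy to exercise in a test with a stub that records calls.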

Add a self-hosted Cohere API key field (mirroring the agent block's hosted-key
pattern), a configurable reranker input pool size (1-100), and surface
meta.warnings from Cohere rerank responses via logger.warn. All new contract
fields are optional and nullable for full backwards compatibility.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@vercel

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | May 4, 2026 6:11pm |

@cursor

cursor Bot commented May 4, 2026

PR Summary

Medium Risk
Moderate risk: changes touch search request validation and reranker API-key handling, which could affect rerank behavior/costs or accidentally expose/require credentials in self-hosted setups.

Overview
Adds new Cohere reranker controls to Knowledge search, including an advanced rerankerInputCount to tune how many vector candidates are sent for reranking (clamped 1–100 and never below topK).

Supports self-hosted Cohere usage by adding an optional per-request rerankerApiKey (surfaced as a conditional Knowledge block field) while keeping the field hidden on hosted or when NEXT_PUBLIC_COHERE_CONFIGURED is set; reranker key resolution now prefers user key on self-hosted, otherwise BYOK/env/rotation. Also logs Cohere /v2/rerank meta.warnings and documents the new env/Helm settings for Cohere configuration.
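
The key resolution order described above (user key first on self-hosted, then BYOK, env, and rotation) might look roughly like this. All names are hypothetical, and taking the first rotation key is a simplification; the real `resolveCohereKey` and its rotation logic are not shown in this conversation.

```typescript
// Hypothetical input shape; the PR's resolveCohereKey signature is not shown here.
interface RerankerKeySources {
  isHosted: boolean
  userApiKey?: string // per-request rerankerApiKey from the Knowledge block
  byokKey?: string
  envKey?: string // e.g. the value of COHERE_API_KEY
  rotationKeys?: string[]
}

function resolveRerankerKey(sources: RerankerKeySources): string | undefined {
  const candidates: Array<string | undefined> = []
  // The user-supplied key is only honored on self-hosted deployments.
  if (!sources.isHosted) candidates.push(sources.userApiKey)
  // Assumption: "rotation" is reduced to "first available key" for this sketch.
  candidates.push(sources.byokKey, sources.envKey, ...(sources.rotationKeys ?? []))
  // Skip missing and blank entries so an empty field falls through to the next source.
  return candidates.find((key) => key !== undefined && key.trim() !== '')
}
```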

Reviewed by Cursor Bugbot for commit 5803229. Configure here.

@greptile-apps
Contributor

greptile-apps Bot commented May 4, 2026

Greptile Summary

This PR exposes three new Cohere reranker controls on the Knowledge block: a BYOK apiKey field (hidden on hosted and when NEXT_PUBLIC_COHERE_CONFIGURED=true), a rerankerInputCount advanced field to control oversample size (1–100, defaults to topK × 4), and meta.warnings surfacing from Cohere rerank responses. All new contract fields are optional/nullable and fully backwards-compatible.

Confidence Score: 5/5

Safe to merge — no bugs found; previous review feedback was addressed and all new fields are backwards-compatible.

All changed files reviewed thoroughly. The key resolution chain, field visibility conditions, Zod schema, client-side clamping, and server-side guard logic are all correct. The required: true re-introduction in commit 232b1ca is intentional and correctly paired with the isCohereConfigured hide-field mechanism. No P0 or P1 issues found.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/lib/knowledge/reranker.ts | Adds userApiKey param to resolveCohereKey, adds meta.warnings surfacing via logger.warn, and extends CohereRerankResponse interface. Logic is correct: self-hosted key takes priority, BYOK and env fallbacks are preserved. |
| apps/sim/blocks/blocks/knowledge.ts | Adds rerankerInputCount (advanced short-input) and apiKey (Cohere key, hidden via getCohereRerankerApiKeyCondition). Both inputs registered in inputs schema. required: true on apiKey is intentional: the field is hidden by the condition when isCohereConfigured. |
| apps/sim/blocks/utils.ts | Adds getCohereRerankerApiKeyCondition following the exact same pattern as getApiKeyCondition. Returns a thunk that hides the field on hosted or when isCohereConfigured, otherwise shows it when rerankerEnabled for search ops. |
| apps/sim/lib/api/contracts/knowledge/search.ts | Adds rerankerInputCount (integer 1–100, optional/nullable with null→undefined transform) and rerankerApiKey (optional string with empty→undefined transform). Both fields are backwards-compatible. |
| apps/sim/app/api/knowledge/search/route.ts | Honors rerankerInputCount for candidateTopK with topK floor and 100 ceiling; logs a warning when the value is raised; passes rerankerApiKey through to rerank(). Logic is correct. |
| apps/sim/tools/knowledge/search.ts | Client-side parsing of rerankerInputCount (string→integer, clamped 1–100) and apiKey; both conditionally spread into the request body only when reranking is enabled. |
| apps/sim/lib/core/config/feature-flags.ts | Adds isCohereConfigured flag from NEXT_PUBLIC_COHERE_CONFIGURED env var, following the same isTruthy(getEnv(...)) pattern as isAzureConfigured. |
| apps/sim/lib/core/config/env.ts | Adds NEXT_PUBLIC_COHERE_CONFIGURED to the createEnv public schema. Consistent with existing NEXT_PUBLIC_* optional string pattern. |
| apps/sim/.env.example | Documents both COHERE_API_KEY and NEXT_PUBLIC_COHERE_CONFIGURED, clearly pairing them as a required two-var setup for self-hosted deployments with pre-configured Cohere keys. |
| helm/sim/values.yaml | Adds NEXT_PUBLIC_COHERE_CONFIGURED with documentation comments, mirroring the existing Azure pattern. |
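
The overview for apps/sim/lib/api/contracts/knowledge/search.ts describes two transforms (null→undefined and empty→undefined). The real contract is a Zod schema; the dependency-free sketch below only mirrors the described behavior, and `normalizeRerankerFields` and its types are hypothetical names.

```typescript
// Hypothetical, dependency-free stand-in for the contract's validation and transforms.
interface RawRerankerFields {
  rerankerInputCount?: number | null
  rerankerApiKey?: string
}

interface RerankerFields {
  rerankerInputCount?: number
  rerankerApiKey?: string
}

function normalizeRerankerFields(raw: RawRerankerFields): RerankerFields {
  const out: RerankerFields = {}
  const count = raw.rerankerInputCount
  // null and undefined both mean "not provided" (null -> undefined).
  if (count !== undefined && count !== null) {
    if (!Number.isInteger(count) || count < 1 || count > 100) {
      throw new RangeError('rerankerInputCount must be an integer between 1 and 100')
    }
    out.rerankerInputCount = count
  }
  // Empty string is treated as "no key" (empty -> undefined).
  if (raw.rerankerApiKey !== undefined && raw.rerankerApiKey !== '') {
    out.rerankerApiKey = raw.rerankerApiKey
  }
  return out
}
```

Requests that omit both fields normalize to an empty object, which is what keeps existing callers unaffected.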

Reviews (3): Last reviewed commit: "fix(knowledge): treat empty rerankerInpu..."

Comment thread apps/sim/blocks/blocks/knowledge.ts
Comment thread apps/sim/lib/api/contracts/knowledge/search.ts
Comment thread apps/sim/app/api/knowledge/search/route.ts
- Drop required:true on apiKey field — server has BYOK→env→rotation fallback
  chain, so self-hosted users with COHERE_API_KEY env should not be blocked
- Drop .min(1) on rerankerApiKey contract field so empty strings coerce to
  undefined via the transform (matches the existing query field pattern)
- Log a warning when rerankerInputCount is clamped up to topK so users notice
  their setting was overridden

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/check-reranker-logs branch from 1cfdfaa to 3fcd053 Compare May 4, 2026 17:57
…anker

Restore required:true on the Cohere API Key field and hide it server-side via
a new NEXT_PUBLIC_COHERE_CONFIGURED public env flag — same pattern the Agent
block uses for Azure (NEXT_PUBLIC_AZURE_CONFIGURED). Self-hosters who set
COHERE_API_KEY in their environment also set NEXT_PUBLIC_COHERE_CONFIGURED=true,
which removes the field from the UI; everyone else sees a required field.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
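
The visibility rule from this commit can be sketched as a small thunk-style predicate. This is an illustration under assumptions: the flag and value shapes, the `'search'` operation literal, and the boolean return type are hypothetical, and the real `getCohereRerankerApiKeyCondition` may return a different structure.

```typescript
// Hypothetical shapes; the real block condition type is not shown in this conversation.
interface DeploymentFlags {
  isHosted: boolean
  isCohereConfigured: boolean // derived from NEXT_PUBLIC_COHERE_CONFIGURED
}

interface KnowledgeBlockValues {
  operation?: string
  rerankerEnabled?: boolean
}

// Returns a function that decides whether the Cohere API Key field is shown.
function makeCohereKeyVisibility(flags: DeploymentFlags): (values: KnowledgeBlockValues) => boolean {
  return (values) => {
    // Hidden on hosted, and hidden when the operator pre-configured a key.
    if (flags.isHosted || flags.isCohereConfigured) return false
    // Otherwise shown only for search operations with reranking turned on.
    return values.operation === 'search' && values.rerankerEnabled === true
  }
}
```

Because the field is hidden whenever a key is already configured, marking it required only affects users who actually need to enter one.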
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/blocks/blocks/knowledge.ts
Comment thread apps/sim/tools/knowledge/search.ts
An empty string from the Documents Sent to Reranker input passed the
undefined/null guard, so Number('') = 0 → clamped to 1, sending only 1
document to the reranker instead of falling back to the 4× topK auto
default. Add the empty-string check to the guard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
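
The bug fixed in this commit comes down to `Number('')` evaluating to `0`. A minimal sketch of the corrected client-side parsing follows; `parseRerankerInputCount` is a hypothetical name, and treating whitespace-only and non-numeric input as "not set" goes slightly beyond what the commit message describes.

```typescript
// Hypothetical parser for the "Documents Sent to Reranker" input value.
function parseRerankerInputCount(raw: unknown): number | undefined {
  // Guard: undefined, null, and the empty string all mean "use the automatic default".
  if (raw === undefined || raw === null || raw === '') return undefined
  // Extra assumption: whitespace-only strings are also treated as unset.
  if (typeof raw === 'string' && raw.trim() === '') return undefined
  const parsed = Number(raw)
  // Extra assumption: non-numeric input falls back to the default instead of clamping NaN.
  if (!Number.isFinite(parsed)) return undefined
  // Clamp to the documented [1, 100] range.
  return Math.min(Math.max(Math.trunc(parsed), 1), 100)
}
```

Without the empty-string guard, a cleared input would parse to `0` and be clamped up to `1`, which is the single-document behavior the commit describes.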
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 5803229. Configure here.

@waleedlatif1 waleedlatif1 merged commit 57dc745 into staging May 4, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/check-reranker-logs branch May 4, 2026 18:33