Add Baidu Qianfan rerank retriever #6439

Open
jimmyzhuu wants to merge 5 commits into FlowiseAI:main from jimmyzhuu:feature/baidu-qianfan-rerank-retriever

@jimmyzhuu commented May 27, 2026

Summary

This PR adds a Baidu Qianfan Rerank Retriever node for reranking vector store retrieval results with Qianfan rerank models.

Changes

  • Add Baidu Qianfan Rerank Retriever under Retrievers.
  • Add a dedicated Baidu Qianfan API Key credential for Qianfan bearer API keys.
  • Call Qianfan v2 rerank API directly at /v2/rerank.
  • Use bce-reranker-base as the default rerank model.
  • Preserve Flowise output modes:
    • Retriever
    • Document
    • Text
  • Attach relevance_score metadata to reranked documents.
  • Fall back to original documents when the Qianfan API call fails or returns invalid indexes.
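
As a rough illustration of the last two bullets, mapping rerank results back onto the retrieved documents might look like the sketch below. It assumes each result carries an `index` into the submitted documents and a numeric `relevance_score`. The `Doc` and `RerankResult` types and the `applyRerankResults` helper are hypothetical names for this example, not code from the PR.

```typescript
// Minimal document shape for this sketch (Flowise itself uses LangChain's Document class).
interface Doc {
    pageContent: string
    metadata: Record<string, unknown>
}

// Assumed shape of one entry in the rerank response.
interface RerankResult {
    index: number
    relevance_score: number
}

// Hypothetical helper: reorder documents according to the reranker's results
// and attach each score as `relevance_score` metadata.
function applyRerankResults(documents: Doc[], results: RerankResult[]): Doc[] {
    const reranked: Doc[] = []
    for (const result of results) {
        const doc = documents[result.index]
        // A non-integer or out-of-range index means the response cannot be
        // trusted, so fall back to the original documents unchanged.
        if (!Number.isInteger(result.index) || doc === undefined) return documents
        reranked.push({
            ...doc,
            metadata: { ...doc.metadata, relevance_score: result.relevance_score }
        })
    }
    return reranked
}
```

Returning the original array on any invalid index keeps the fallback all-or-nothing, so downstream nodes never see a partially reranked list.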

Why

Flowise already supports Baidu Qianfan chat and embeddings. Adding Qianfan rerank support completes a common RAG workflow for Baidu users: retrieve candidates from a vector store, then rerank them with a dedicated cross-encoder reranker before sending context downstream.

Verification

  • node packages/components/node_modules/jest/bin/jest.js --config packages/components/jest.config.js BaiduQianfanRerank.test.ts BaiduQianfanRerankRetriever.test.ts
    • 2 test suites passed
    • 13 tests passed
  • git diff --check origin/main...HEAD
    • passed
  • Manual smoke test against Qianfan /v2/rerank
    • returned 2 reranked documents
    • ranked the weather-related document first for a weather query
    • returned numeric relevance scores


@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces the Baidu Qianfan Rerank Retriever component, which includes credential configuration, unit tests, and an SVG icon. The retriever integrates with the Baidu Qianfan Rerank API to reorder documents based on semantic relevance. The feedback recommends validating the presence of the API key and parsing topN as an integer to prevent silent failures. Additionally, it is suggested to log errors in BaiduQianfanRerank instead of swallowing them silently, and to use handleDocumentLoaderOutput to format the retriever's output and reduce code duplication.

Comment on lines +112 to +114
const qianfanApiKey =
getCredentialParam('qianfanApiKey', credentialData, nodeData) || getCredentialParam('qianfanAccessKey', credentialData, nodeData)
const k = topN ? parseFloat(topN) : (baseRetriever as VectorStoreRetriever).k ?? 4
high

If both qianfanApiKey and qianfanAccessKey are missing, the node will proceed with an empty API key, leading to a failed API call that silently returns the original documents. We should throw an error if no API key is provided. Additionally, topN represents a count of documents and should be parsed as an integer using parseInt(topN, 10) rather than parseFloat(topN) to avoid potential decimal values.

        const qianfanApiKey =
            getCredentialParam('qianfanApiKey', credentialData, nodeData) || getCredentialParam('qianfanAccessKey', credentialData, nodeData)
        if (!qianfanApiKey) {
            throw new Error('Baidu Qianfan API Key is missing. Please configure it in the credential.')
        }
        const k = topN ? parseInt(topN, 10) : (baseRetriever as VectorStoreRetriever).k ?? 4
References
  1. When a feature requires a specific configuration (e.g., an API key for a sandboxed environment), it is preferable to throw an error if the configuration is missing rather than silently falling back to a different implementation.
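
To make the parsing difference concrete: `parseFloat('2.7')` yields `2.7`, whereas `parseInt('2.7', 10)` yields `2`. The sketch below isolates that logic in a hypothetical `resolveTopN` helper. The `NaN` guard is an extra assumption for illustration and is not part of the suggestion above.

```typescript
// Hypothetical helper: decide how many documents to keep after reranking.
// `topN` is the raw string from the node input; `retrieverK` is the base
// retriever's own k, if it has one.
function resolveTopN(topN: string | undefined, retrieverK: number | undefined): number {
    const fallback = retrieverK ?? 4
    if (!topN) return fallback
    // parseInt truncates decimal input, so '2.7' becomes 2 rather than 2.7.
    const parsed = parseInt(topN, 10)
    // Assumption: non-numeric input falls back instead of propagating NaN.
    return Number.isNaN(parsed) ? fallback : parsed
}
```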

Comment on lines +73 to +75
} catch (error) {
return documents
}
medium

Swallowing all errors silently makes debugging extremely difficult if the API key is invalid, rate limits are hit, or the model name is incorrect. Consider logging the error to console.error before returning the fallback documents so that users and administrators can troubleshoot issues.

        } catch (error) {
            console.error('Baidu Qianfan Rerank Error:', error)
            return documents
        }

Comment on lines +122 to +129
if (output === 'retriever') return retriever
if (output === 'document') return await retriever.invoke(query ? query : input)
if (output === 'text') {
const docs = await retriever.invoke(query ? query : input)
let finaltext = ''
for (const doc of docs) finaltext += `${doc.pageContent}\n`
return handleEscapeCharacters(finaltext, false)
}
medium

To reduce code duplication and leverage existing utility functions, we can use handleDocumentLoaderOutput to format the retriever's output. Note: You will also need to import handleDocumentLoaderOutput from ../../../src/utils instead of handleEscapeCharacters.

        if (output === 'retriever') return retriever
        const docs = await retriever.invoke(query ? query : input)
        return handleDocumentLoaderOutput(docs, output)

@jimmyzhuu (Contributor, Author) commented

Resolved the actionable review items in 84891fc:

  • Added an explicit credential validation error when the Qianfan API key is missing.
  • Changed topN parsing to use integer parsing and added regression coverage for decimal input.
  • Added tests for both cases; the targeted rerank test suite now has 13 passing tests.

For the other two suggestions:

  • The compressor still falls back to the original documents on rerank API errors without logging, matching the existing Jina/Cohere/Voyage rerank retriever behavior in this package.
  • I did not switch to handleDocumentLoaderOutput because this is a retriever node, not a document loader. Existing rerank retrievers format document and text outputs directly, so this keeps the node consistent with the local retriever patterns.
