Update Google Gemini embedding model from text-embedding-004 to gemin… #433

Closed
mariadb-CalebTerry wants to merge 11 commits into mariadb-corporation:main from mariadb-CalebTerry:docs/text-embedding-004-is-deprecated

Conversation

@mariadb-CalebTerry
Contributor

mariadb-CalebTerry and others added 6 commits March 30, 2026 12:53
…o configurable env var exists anywhere in the codebase. Gemini 2.5 Flash's thinking/reasoning mode consumes ~478 of those 500 tokens internally, leaving only ~19 tokens for actual text output → truncated answers.

The fix is simple: change LLM_MODEL from gemini-2.5-flash to gemini-2.5-flash-lite.

gemini-2.5-flash-lite does not spend tokens on thinking/reasoning by default, so the full 500-token budget goes to text output
Same API key, same LiteLLM path, no other changes required
Faster and cheaper than 2.5-flash
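
The token arithmetic described in this commit message can be sketched as follows. This is only an illustration, not code from the PR: the function name `visible_token_budget` is hypothetical, and the figures are the approximate ones quoted above (a 500-token cap and roughly 478 tokens of reasoning). The exact remainder differs slightly from the "~19" in the message because both numbers are approximations.

```python
# Hypothetical helper illustrating the output-token budget problem.
# The numbers are the approximate figures quoted in the commit message.

MAX_OUTPUT_TOKENS = 500          # cap on generated tokens, per the message
OBSERVED_THINKING_TOKENS = 478   # approximate reasoning usage with gemini-2.5-flash


def visible_token_budget(max_tokens: int, thinking_tokens: int) -> int:
    """Return the tokens left for visible text after reasoning tokens are deducted."""
    return max(max_tokens - thinking_tokens, 0)


if __name__ == "__main__":
    # With reasoning enabled, almost the whole budget is used before any text is emitted.
    print(visible_token_budget(MAX_OUTPUT_TOKENS, OBSERVED_THINKING_TOKENS))  # 22
    # With no reasoning tokens, the full budget is available for the answer.
    print(visible_token_budget(MAX_OUTPUT_TOKENS, 0))  # 500
```

Under these assumptions, a model that spends no tokens on reasoning leaves all 500 tokens for the answer, which is the rationale given for switching LLM_MODEL.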
@mariadb-tauseefkhan
Contributor

In config.env.template, we need to have Database configuration details @mariadb-CalebTerry

@mariadb-CalebTerry
Contributor Author

> In config.env.template, we need to have Database configuration details @mariadb-CalebTerry

@mariadb-tauseefkhan Neither of my commits removed those. Looks like this commit to rebase main onto this branch brought those changes in:
c1f97fa

@mariadb-CalebTerry
Contributor Author

I'll abandon this PR and create a new one for the proposed changes. I think some commits were made directly against main, which I didn't have in my branch yesterday.

@mariadb-tauseefkhan
Contributor

That's a good idea because the launch is around the corner for AI RAG. I am adding too many changes to the section.

4 participants