State-of-the-art text embedding via the Gemini API (via) Gemini just released their new text embedding model, with the snappy name gemini-embedding-exp-03-07
. It supports 8,000 input tokens - up from 3,000 - and outputs vectors that are a lot larger than their previous text-embedding-004
model - that one output size 768 vectors, the new model outputs 3072.
Storing that many floating point numbers for each embedded record can use a lot of space. thankfully, the new model supports Matryoshka Representation Learning - this means you can simply truncate the vectors to trade accuracy for storage.
I added support for the new model in llm-gemini 0.14. LLM doesn't yet have direct support for Matryoshka truncation so I instead registered different truncated sizes of the model under different IDs: gemini-embedding-exp-03-07-2048
, gemini-embedding-exp-03-07-1024
, gemini-embedding-exp-03-07-512
, gemini-embedding-exp-03-07-256
, gemini-embedding-exp-03-07-128
.
The model is currently free while it is in preview, but comes with a strict rate limit - 5 requests per minute and just 100 requests a day. I quickly tripped those limits while testing out the new model - I hope they can bump those up soon.
Recent articles
- What's new in the world of LLMs, for NICAR 2025 - 8th March 2025
- I built an automaton called Squadron - 4th March 2025
- Notes from my Accessibility and Gen AI podcast appearance - 2nd March 2025