Chroma Context-1: The 20B Retrieval Model That Matches GPT-5 at a Fraction of the Cost
The most expensive part of your AI agent stack might not be what you think. While developers obsess over model selection and prompt engineering, retrieval is quietly eating your latency budget and your inference bill, and most production RAG pipelines use general-purpose LLMs for a specialized task they weren’t built for. Chroma’s new Context-1 model is a direct challenge to that pattern. It’s a 20-billion-parameter open-source retrieval model that outperforms GPT-5 on the HotpotQA and FRAMES benchmarks while running 10 times faster and costing 25 times less per query. Released on Hugging Face under an open license, it’s purpose-built for one thing: getting the right information out of large corpora for RAG pipelines and agent memory workflows. ...