Introducing Astra DB Hybrid Search: Improve Relevance by 45%

Astra DB Hybrid Search boosts relevance with NVIDIA NeMo Retriever Reranking microservices
Updated: March 18, 2025 · 3 min read

Retrieval is the part of retrieval-augmented generation (RAG) that matters most for accuracy in production AI. Whether you're building AI-powered search, recommendation engines, or personalization for your end users, your RAG system’s ability to retrieve the right information directly affects user experience and accuracy. Poorly ranked results lead to irrelevant answers that frustrate your end users and can ultimately kill your generative AI app. That’s why we’re excited to introduce Astra DB Hybrid Search, now enhanced with server-side reranking using the NVIDIA NeMo Retriever reranking microservices, built with NVIDIA NIM, part of the NVIDIA AI Enterprise software.

Why hybrid search?

Hybrid search combines vector search (for semantic understanding) and lexical search (for exact keyword matching) so that results reflect both meaning and exact terms. While vector search helps understand context and meaning, lexical search ensures critical keyword matches aren’t overlooked. Blending these signals effectively requires intelligent ranking, and that’s where the NVIDIA NeMo Retriever reranking microservice comes in.
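
To make the difference between the two signals concrete, here is a minimal sketch in plain Python. It is not how Astra DB implements either search: the corpus, the three-dimensional "embeddings," and the query vector are all made up for illustration, and the BM25 function is a textbook version with common default parameters.

```python
import math
from collections import Counter

# Toy corpus; texts and 3-d "embeddings" are invented for this illustration.
DOCS = {
    "a": "error code e1234 when connecting to the cluster",
    "b": "troubleshooting connection failures in a database cluster",
    "c": "how to bake sourdough bread",
}
TOY_VECTORS = {
    "a": [0.8, 0.2, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 0.1, 0.9],
}


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a textbook BM25 formula."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(t for tokens in tokenized.values() for t in set(tokens))
    scores = {}
    for doc_id, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent: contributes nothing to the lexical score
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores[doc_id] = score
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


# Lexical signal: only the document containing the exact token scores above zero.
lexical = bm25_scores("E1234", DOCS)

# Vector signal: a hypothetical query embedding that sits near both
# "connectivity" documents, whether or not they mention the error code.
query_vector = [0.95, 0.05, 0.0]
vector = {doc_id: cosine(query_vector, vec) for doc_id, vec in TOY_VECTORS.items()}

best_lexical = max(lexical, key=lexical.get)
best_vector = max(vector, key=vector.get)
```

In this toy setup the lexical scorer singles out the one document that contains the exact error code, while the vector scorer treats both connectivity documents as close matches. Hybrid search keeps both kinds of candidates and lets a ranking step decide the final order.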

How it works

With this release:

  • Astra DB performs hybrid retrieval, combining lexical search using BM25 and vector search powered by Apache Cassandra®.

  • The top results are passed through the NVIDIA NeMo Retriever reranking microservice, which reorders them using fine-tuned LLMs, significantly improving relevance.

  • You can build hybrid search-based applications with the Astra DB Python client, supported by the schema-less, document-based Data API, or with Langflow's Astra DB component.

  • You can achieve up to 45% improvement in search relevance, ensuring GenAI applications return the most accurate responses.
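
The retrieve-then-rerank flow in the list above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Astra DB's implementation: the ranked ID lists stand in for what the lexical and vector retrievers would return, the way candidates are merged is assumed, and toy_relevance is a hypothetical word-overlap scorer standing in for the reranking model. The names limit and hybrid_limits mirror the parameters in the client example later in this post.

```python
# Toy documents, invented for this illustration.
DOCS = {
    "a": "reset your database password from the console",
    "b": "recover access to your account credentials",
    "c": "database pricing and billing overview",
    "d": "holiday schedule for the support team",
}


def merge_candidates(lexical_ranked, vector_ranked, hybrid_limits):
    """Take the top hits from each retriever and de-duplicate, keeping order."""
    merged, seen = [], set()
    for doc_id in lexical_ranked[:hybrid_limits] + vector_ranked[:hybrid_limits]:
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged


def toy_relevance(query, text):
    """Hypothetical stand-in for a reranking model: share of query words in text."""
    query_words = set(query.lower().split())
    return len(query_words & set(text.lower().split())) / len(query_words)


def rerank(query, candidates, docs, score_fn, limit):
    """Score every candidate against the query and return the best `limit` IDs."""
    scored = sorted(
        candidates,
        key=lambda doc_id: score_fn(query, docs[doc_id]),
        reverse=True,
    )
    return scored[:limit]


query = "reset database password"
lexical_ranked = ["a", "c", "d"]  # assumed output of the lexical retriever
vector_ranked = ["b", "a", "d"]   # assumed output of the vector retriever

candidates = merge_candidates(lexical_ranked, vector_ranked, hybrid_limits=2)
top_results = rerank(query, candidates, DOCS, toy_relevance, limit=2)
```

The point of the sketch is the shape of the pipeline: each retriever contributes a bounded set of candidates, and a single scoring pass over that combined pool decides what is returned. A real reranking model judges query-document relevance far more richly than word overlap does.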

Key benefits

Seamlessly integrated and hosted by Astra DB

Astra DB hosts the NVIDIA Reranking service for you. Your data stays within Astra DB and reranking is automatically enabled when you need it. You don’t have to configure your reranker; you can use it out of the box on Astra DB.

Significantly improved relevance

By incorporating NeMo Reranker, we deliver state-of-the-art relevance ranking, ensuring that the most meaningful results surface first. This improves relevance by 18.5% on average and by up to 45.07%.

Relevance comparison between vector only and Astra DB Hybrid Search with NVIDIA NeMo Retriever Reranking

Use hybrid search with developer-centric Python client and Data API

Here’s an example of how you can access hybrid search from your code via the Astra DB Python client, supported by our schemaless, document-based Data API:

# Assumes my_collection is an existing collection with hybrid search enabled.
my_collection.insert_one({
    "_id": "1",
    "$hybrid": "this text is ready for hybrid search retrieval",
})
results = my_collection.find_and_rerank(
    {},
    sort={"$hybrid": "<hybrid search query>"},  # replace with your query text
    limit=10,  # number of results returned
    hybrid_limits=30,  # number of results for the reranker to work with
)

Easily build GenAI apps with Hybrid Search on Langflow

Developers can quickly experiment with Astra DB Hybrid Search using Langflow, a visual builder for GenAI apps and agents. You’ll improve the quality of your search results by using Astra DB Vector with hybrid search enabled.

Using Hybrid Search in Langflow

Astra DB can support other rerankers as well. You can choose the server-side hosted NVIDIA NeMo Retriever reranking microservice, any other reranker component in Langflow, or you can bring your own reranker through the custom component. We plan to expand support to other state-of-the-art rerankers server-side soon. Stay tuned.

Get started

Ready to take your GenAI apps to the next level? 🚀 Astra DB Hybrid Search will be available on Langflow and in the Data API in April. Sign up to be one of the first to try it, and register for the April 16 webinar for a deep dive into how Astra DB Hybrid Search works.
