NVIDIA NeMo Hosted in Astra DB Performance Study
This report compares two approaches to creating embeddings that meet the performance requirements of production-level Generative AI and retrieval-augmented generation (RAG) applications: Astra DB with the NVIDIA NeMo Retriever embedding microservice, and an alternative embedding API. The comparison covers latency, throughput, predictability, and cost.