DataStax and Google Cloud Collaborate to Evolve Open Source Apache Cassandra for Generative AI
Companies everywhere are looking for ways to integrate AI into their business as the popularity of generative AI continues to surge. Beyond deciding how to adopt AI, organizations also need to consider what it takes to support their AI infrastructure. Large language models (LLMs), AI assistants, and real-time generative AI require a database that can handle the scale, performance, security, and unique requirements of these new generative AI applications.
In the age of AI, Apache Cassandra® has emerged as a powerful and scalable distributed database solution optimized for the unique demands of generative AI. With its ability to handle massive amounts of data and provide high availability, this popular open source database has become a go-to choice for companies running some of the most widely adopted and demanding AI applications, including Uber, Netflix, and Priceline.
We are working closely with the Google Cloud AI/ML Center of Excellence as part of the Built with Google AI program to bring the best of Google Cloud’s generative AI offerings to our customers and enhance their capabilities and experience. And, in collaboration with Google Cloud, we are working to further evolve Cassandra to be an even more effective database for AI-powered applications.
Together, we’ve co-developed several significant new capabilities to advance Cassandra and Astra DB as the database of choice for AI applications:
CassIO
The CassIO open source library makes it easy to integrate Cassandra with popular generative AI SDKs such as LangChain. The new integration includes several key features: support for building sophisticated AI assistants, semantic caching for generative AI, LLM chat history, Cassandra prompt templates, and a new Google Cloud Gen AI integration.
Google Cloud BigQuery Integration
This new integration enables Google Cloud users to seamlessly move data between Cassandra and BigQuery straight from their Google Cloud Console to create and serve ML features in real time.
Google Cloud Dataflow Integration
This new integration pipes real-time data to and from Cassandra for serving real-time features to ML models, integrating with other analytics systems like BigQuery, and monitoring generative AI model performance in real time.
To further extend DataStax support for generative AI, DataStax has just launched a new vector search tool in DataStax Astra DB, our popular database-as-a-service (DBaaS) built on Cassandra, available first exclusively on Google Cloud. Other new features arrive via a NoSQL copilot, a Google Cloud Gen AI-powered chatbot that helps DataStax customers develop AI applications on Astra DB.
“By integrating Google Cloud’s generative AI capabilities into Astra DB, DataStax is adding natural language capabilities into a suite of already powerful database capabilities, and giving customers a complete and unified data and AI solutions approach,” said Stephen Orban, VP of Migrations and GenAI Ecosystem at Google Cloud.
“Vector search is a key part of the new AI stack; every developer building for AI needs to make their data easily queryable by AI agents,” said Ed Anuff, chief product officer, DataStax. “Unlike many other vector databases, Astra DB is not only built for global scale and availability, but supports the most stringent enterprise-level requirements for managing sensitive data including HIPAA, PCI, and PII regulations. It’s therefore an ideal option for both startups and enterprises that manage sensitive user information and want to build impactful generative AI applications.”
The new vector search feature in DataStax’s Astra DB is available starting today as a public preview exclusively on Google Cloud. Developers can get started immediately by signing up for Astra DB.
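For readers new to the idea, the following is a minimal conceptual sketch of what a vector search does: it ranks stored embedding vectors by their similarity to a query vector. It is plain Python and does not use the Astra DB or CassIO APIs; the function names, the document IDs, and the three-dimensional toy vectors are all hypothetical, chosen only for illustration. A production system would typically use an approximate index rather than this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Brute-force search: score every stored vector, return the k best IDs."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in rows.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings keyed by document ID (toy 3-dimensional vectors).
rows = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

nearest = top_k([1.0, 0.0, 0.0], rows, k=2)
print(nearest)
```

In an application, the query vector would come from embedding a user's question with the same model used to embed the stored documents, so that nearby vectors correspond to semantically related text.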
Want to go deeper with vector search, LLMs and GenAI? Join us on July 11, 2023, for a free, two-hour virtual GenAI summit for architects and practitioners: Agent X: Architecture for GenAI. We'll unpack and demonstrate how you can craft inspiring AI agents and GenAI experiences with your unique datasets.