Technology | April 9, 2024

DataStax Unveils Vertex AI Extension: Empowering LLMs with Direct Astra DB Access

By enabling direct access to your Astra DB data, the DataStax Vertex AI extension empowers you to unlock the true potential of LLMs and propel your AI applications to new heights.

Imagine this: you simply ask your LLM (Large Language Model) to "generate a regional breakdown of user behavior for the past quarter." The LLM, leveraging the DataStax Vertex AI Extension, could independently query your Astra DB, extract the relevant data, and present it in a clear and actionable format. 

This is what developers get with our new Vertex AI Extension, a powerful tool designed to enhance your interactions with Google Gemini and other Google Cloud-hosted LLMs. It bridges the gap between your data and your LLM, enabling a more intelligent, data-driven approach to AI applications in which the LLM can decide for itself which data sources or APIs to query.

Optimizing API calls for LLM success

Retrieval-Augmented Generation (RAG) has emerged as a top technique for reducing hallucinations in LLM-enabled applications. Today, the application developer needs to know what data is available, find it, and merge it with prompts that are engineered to produce useful generated outputs.  
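As a concrete sketch of that deterministic flow, the snippet below uses the astrapy Data API client to search a fixed Astra DB collection and merge the results into a Gemini prompt via the Vertex AI Python SDK; the project, credentials, collection name, and prompt shape are all illustrative placeholders.

import vertexai
from astrapy import DataAPIClient
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project

# Hypothetical credentials and collection -- substitute your own.
db = DataAPIClient("ASTRA_DB_TOKEN").get_database("ASTRA_DB_API_ENDPOINT")
collection = db.get_collection("user_events")

def answer(question: str, query_vector: list[float]) -> str:
    # Deterministic choice: the code, not the model, decides what to search.
    docs = collection.find(sort={"$vector": query_vector}, limit=5)
    context = "\n".join(str(doc) for doc in docs)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return GenerativeModel("gemini-1.0-pro").generate_content(prompt).text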

When writing RAG applications, you’re faced with a big decision: should you employ deterministic API calls, where the choice about which APIs to call is embedded in your code, or delegate this responsibility to the LLM itself? 

Deterministic calls excel when the requirements to fulfill a user's request are well understood and can be readily coded into your application. However, when you want your applications to cover a wider range of requirements, you may also need to accept more ambiguous inputs, such as natural language, images, or other unstructured data. The less clear the user's request, the more you need probabilistic techniques to decide how to handle it. This is where LLMs can shine: their ability to analyze user requests and context allows them to select the most relevant APIs, correctly format inputs to those APIs, and further process the outputs, possibly by calling more APIs. The sketch below shows what delegating that choice to the model can look like.
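Here is a minimal sketch of that delegation using Vertex AI function calling; the function name and parameter schema are hypothetical stand-ins for whatever APIs your application exposes.

from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Tool

# Describe an available API to the model (name and schema are illustrative).
query_users = FunctionDeclaration(
    name="query_user_behavior",
    description="Query user behavior metrics from Astra DB by region and quarter.",
    parameters={
        "type": "object",
        "properties": {
            "region": {"type": "string"},
            "quarter": {"type": "string"},
        },
    },
)

model = GenerativeModel(
    "gemini-1.0-pro",
    tools=[Tool(function_declarations=[query_users])],
)
response = model.generate_content(
    "Generate a regional breakdown of user behavior for the past quarter."
)

# The model, not your code, decides whether and how to call the API.
call = response.candidates[0].content.parts[0].function_call
print(call.name, dict(call.args))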

The DataStax Vertex AI Extension unlocks a new level of LLM empowerment. It integrates LLMs such as Google Gemini with your Astra databases, enabling Gemini to directly retrieve and manipulate data stored in Astra. Your LLM becomes a more active participant both in the logical flow of your application and in acting on the responses it retrieves.
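Because the extension is in private preview, the exact interface may change, but invoking it from the Vertex AI preview SDK might look roughly like this; the resource name, operation id, and parameters below are purely illustrative, not the extension's actual contract.

from vertexai.preview import extensions

# Load a previously registered extension by its (placeholder) resource name.
astra_extension = extensions.Extension(
    "projects/PROJECT_ID/locations/us-central1/extensions/EXTENSION_ID"
)

# Hypothetical operation: ask the extension to read matching rows from Astra DB.
result = astra_extension.execute(
    operation_id="readData",
    operation_params={"collection": "user_events", "filter": {"region": "EMEA"}},
)
print(result)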

Beyond data retrieval: Response handling

For the API calls you delegate to the LLM, your next decision is how to handle the response. You can instruct the LLM to act on it directly, leveraging the richness of your data to personalize experiences, tailor creative content, and even automate intricate workflows. Alternatively, you can ask the LLM to format the data in whatever shape is most convenient for the next step in your response flow and return it to the deterministic code in your app, as in the sketch below.
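A simple version of that second pattern, assuming the model honors a JSON-only instruction (the prompt and schema here are illustrative):

import json

from vertexai.generative_models import GenerativeModel

model = GenerativeModel("gemini-1.0-pro")
response = model.generate_content(
    "Summarize Q1 signups by region. Respond with JSON only: "
    'a list of objects shaped like {"region": string, "signups": number}.'
)

# Hand the structured result back to ordinary, deterministic application code.
# (Assumes the model returned bare JSON; production code should validate.)
rows = json.loads(response.text)
for row in rows:
    print(row["region"], row["signups"])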

This is a new way of thinking about coding your applications, but it opens the door to easily and correctly handling significantly more ambiguous inputs (and API outputs) than in the past.

Ready to experience the future of LLMs?

Vertex AI Extensions are currently in private preview, but you can delve deeper into their potential on Google Cloud. We've also built a demo showcasing the basics of the extension's capabilities, available on GitHub.

The DataStax Vertex AI Extension marks a significant leap forward in LLM functionality. By enabling direct access to your Astra DB data, it empowers you to unlock the true potential of LLMs and propel your AI applications to new heights. Join us on this exciting journey as we redefine the future of data-driven AI!
