Building a Generative AI Quiz with Astra DB and Langflow
While we love a bit of trivia, we might be better versed in some topics than others: perhaps we like cars and engines more than we do baking methods and cooking times. What if we could visit any website and test our knowledge of all the content there?
In this post, we’ll look behind the curtain of an application we recently built that can turn any website about any topic into a fun multiple-choice quiz with generative AI using Astra DB and Langflow.
Before we proceed, you might want to play with the demo yourself to understand what we’ll be building.
The process
Most GenAI applications employ a technique called retrieval-augmented generation (RAG) to provide real-time, high-accuracy context to generative large language models. RAG works by retrieving accurate, up-to-date information that is semantically similar to a user's query and including that information in the model's prompt.
For our quiz application, we will follow these steps:
- Visit the target website
- Read its text content and break it into chunks
- Store those chunks of text in an Astra DB collection
- Query the collection with a vector query for “interesting fact information”
- Retrieve chunks sorted by similarity to our “interesting fact information” query
- Augment our LLM’s generation with this information by including it in the prompt
- Generate JSON to be consumed in a frontend user interface
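To make the retrieval and augmentation steps above concrete, here is a minimal sketch in plain Python. It is not what Langflow or Astra DB run internally: the `embed` function is a toy bag-of-words stand-in for a real embeddings model, the in-memory list stands in for an Astra DB collection, and the prompt wording and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(chunks: list[str], query: str, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best match first."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), query_vector), reverse=True)
    return ranked[:k]


def build_prompt(context_chunks: list[str]) -> str:
    """Augment the generation request with the retrieved context (hypothetical wording)."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Using only the context below, write a multiple-choice quiz as JSON.\n\n"
        f"Context:\n{context}"
    )


if __name__ == "__main__":
    page_chunks = [
        "The V8 engine has eight cylinders.",
        "Contact us for pricing information.",
        "An interesting fact: the first car engine ran on gas.",
    ]
    top = retrieve(page_chunks, "interesting fact information", k=2)
    print(build_prompt(top))
```

In the real application, a vector database performs the similarity ranking over stored embeddings, so the page only needs to be embedded once, however many queries run against it.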
This entire flow can be modeled with Langflow, and then also served via API from Langflow. Let’s explore how we might do this.
Langflow
To get started, visit DataStax Langflow and create an account if you haven’t already. From there, create a new project with a blank flow.
Great, now we’re ready. For your convenience, we’ve included the flow we’ll be working with here; you can download it as a JSON file. Let’s import it into Langflow by clicking the name of our project, then Import, and then choosing this file. You can watch how we do this below. Once we’re done, we’ll explore each step in detail.
Once imported, we can see each step represented in our flow. Let’s walk through each of them:
1. Text input - This is the URL that we will crawl.
2. URL - This component visits the provided URL and gets its text content.
3. Collection name sanitizer - Astra DB accepts only a restricted set of characters in collection names (letters, numbers, and underscores). This component derives a valid collection name from the URL.
4. Split text - This component receives the page contents from the URL component (2) and breaks them down into tweet-sized chunks.
5. OpenAI embeddings - This component uses an embeddings model to convert the chunks of page content (4) into vectors. It is reused when querying for interesting facts to construct the quiz. It’s important to add your OpenAI API key here.
6. Astra DB - This component represents our database that stores a page’s text content. In this component, it’s important to choose or create a database for this project. This component is used both for storing the vectorized chunks of data from the page and for searching against those vectors to return a search result.
7. Parse data - Converts the array of records retrieved from our database query into text.
8. Prompt - Converts the text into a RAG prompt, preparing to deliver it to OpenAI (9).
9. OpenAI - Uses an LLM and our RAG prompt to generate JSON output for our quiz. It’s worth noting that this component uses “JSON mode”, visible in advanced settings, to ensure JSON output. It’s important to add your OpenAI API key here as well.
10. Chat output - Serves the response as JSON.
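As a rough illustration of what the collection name sanitizer and split text steps do, here is a minimal Python sketch. It is not the code inside the Langflow components: the function names are hypothetical, the naming rules (start with a letter; letters, digits, and underscores only; at most 48 characters) are an assumption about Astra DB's limits, and the sentence-packing splitter is a simplified stand-in for Langflow's own splitting logic.

```python
import re


def sanitize_collection_name(url: str, max_length: int = 48) -> str:
    """Derive a collection name from a URL (assumed rules: [a-z][a-z0-9_]*, <= 48 chars)."""
    name = re.sub(r"^https?://", "", url.strip().lower())
    # Collapse every run of disallowed characters into a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name).strip("_")
    # Assumed rule: names must start with a letter, so prefix one if needed.
    if not name or not name[0].isalpha():
        name = f"c_{name}"
    return name[:max_length].rstrip("_")


def chunk_text(text: str, max_chars: int = 280) -> list[str]:
    """Greedily pack whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    print(sanitize_collection_name("https://www.example.com/blog/post-1"))
    print(chunk_text("One two three. Four five six. Seven eight nine.", max_chars=30))
```

Deriving the collection name from the URL means each crawled site gets its own collection, so chunks from different sites never mix in a single similarity search.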
This represents the logic flow across the entire application. Let’s take it for a spin by clicking Playground.
Perfect! It’s giving us the JSON we need. Now, let’s make this available over the internet via API. We can do this by clicking the API button.
From here, we can generate an API token and then copy a ready-made request snippet in the language of our choice, which we can use to call the flow from any frontend application.
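For reference, a call to the served flow might look roughly like the Python sketch below. Treat every specific as an assumption: the base URL, flow ID, and token are placeholders, and the `/api/v1/run/<flow_id>` path, the `input_value` field, and the bearer-token header reflect the general shape of the snippets Langflow generates rather than a guaranteed contract. The snippet shown in your own API pane is the authoritative version.

```python
import json
import urllib.request


def build_run_request(base_url: str, flow_id: str, token: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the flow for one page URL."""
    payload = {"input_value": page_url}  # assumed field name; check your API pane
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/run/{flow_id}",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_quiz(message_text: str) -> dict:
    """Parse the text returned by the flow and require a JSON object."""
    quiz = json.loads(message_text)  # raises ValueError on invalid JSON
    if not isinstance(quiz, dict):
        raise ValueError("expected the quiz to be a JSON object")
    return quiz


if __name__ == "__main__":
    request = build_run_request(
        "https://langflow.example",  # placeholder host
        "my-flow-id",                # placeholder flow ID
        "MY_TOKEN",                  # placeholder API token
        "https://example.com",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Even with JSON mode enabled, it is worth parsing and checking the response before handing it to the user interface, so that a malformed reply surfaces as a clear error rather than a broken quiz.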
Putting it all together
We did it! Using a single tool, Langflow, we created a full end-to-end pipeline that takes any website and turns it into a quiz using generative AI. We used Astra DB too, but entirely through Langflow, which served as a one-stop shop for our entire GenAI workflow.
We hope this post has made it clear why we’re so bullish on these technologies as we continue to democratize AI for all developers. To explore this example in full, we’ve created a GitHub repo that contains both the Langflow JSON file and a user interface connected to it.
We’ll let you take it from here and can’t wait to see what you build. Happy coding!