Technology | December 13, 2023

It’s Gemini Pro API Day, So the Astra Assistants API Is Getting Vertex AI Support!

TL;DR – The Astra Assistants API, a service that’s API-compatible with OpenAI’s Assistants API, now supports Google’s new Gemini Pro model. If you're just here for the Astra Assistants API Vertex AI sample code, here it is.

Last week, Google announced the launch of its Gemini model. But for developers, today’s the day the rubber hits the road: Google has made Gemini Pro available via API as part of Vertex AI, so you can start building with this powerful new set of tooling and infrastructure.

As soon as Google announced Gemini, we knew we needed to get Vertex AI added to the Astra Assistants API, a service that is API-compatible with OpenAI's recently introduced Assistants API (here’s a post to help you get started with the Astra Assistants API).

Currently, the Astra Assistants API supports many LLMs and providers in addition to OpenAI, including Cohere, AWS Bedrock, Perplexity, Anthropic Claude, Llama, and Mistral. However, support for Google's Vertex AI had eluded us because Google Cloud bars users from logging in via API keys; it requires an application credentials file, which makes this integration tricky.

We came up with a creative solution to this problem by leveraging the existing file storage capability in the Assistants API and adding a new "auth" purpose:

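# Assumes `client` is an OpenAI client already pointed at the Astra Assistants API
# and GOOGLE_JSON_PATH is the path to your Google application credentials JSON file.
with open(GOOGLE_JSON_PATH, "rb") as credentials:
    file = client.files.create(
        file=credentials,
        purpose="auth",
    )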

Note: the file only needs to be uploaded once. Its content is stored in plain text in your Astra database, so make sure you secure your database appropriately and keep your Astra API keys safe.
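Because the credentials file ends up stored in your database, it can be worth a quick sanity check before you upload it. The sketch below is only an illustration and assumes the file is a Google service-account key; the helper name check_credentials_file and the list of required fields are assumptions for this example, not part of the Astra Assistants API.

```python
import json
from pathlib import Path

# Fields a Google service-account key file normally contains
# (assumption: you are uploading a service-account key, not another credential type).
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}


def check_credentials_file(path):
    """Return the project_id if the file looks like a service-account key.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("credentials file must contain a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"credentials file is missing fields: {sorted(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {data['type']!r}")
    return data["project_id"]
```

The returned project ID is the value you would typically pass as your Google project ID, so a check like this also catches the case where the key belongs to a different project than the one you intended.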

Having uploaded the file, users can now reference it by ID in the google-application-credentials-file-id HTTP header when they create their OpenAI client as follows:

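from openai import OpenAI

# base_url is the Astra Assistants API endpoint; file.id is the ID of the
# credentials file uploaded with purpose="auth".
client = OpenAI(
    base_url=base_url,
    api_key=OPENAI_API_KEY,
    default_headers={
        "astra-api-token": ASTRA_DB_TOKEN,
        "embedding-model": "textembedding-gecko@002",
        "vertexai-project": GOOGLE_PROJECT_ID,
        "google-application-credentials-file-id": file.id,
    },
)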

Note: to use Vertex AI, you must also pass a vertexai-project header with your project ID and, as with other LLM providers, you should also pass an embedding-model header for file retrieval / RAG.
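Since a missing header is an easy mistake to make, one option is to assemble the headers in a small function that fails early. This is a minimal sketch: the header names come from the example above, while the function name build_vertex_headers and its validation logic are assumptions made for illustration.

```python
def build_vertex_headers(astra_token, project_id, credentials_file_id,
                         embedding_model="textembedding-gecko@002"):
    """Build the default headers for a Vertex AI-backed client.

    Hypothetical helper; raises ValueError if any header value is empty.
    """
    headers = {
        "astra-api-token": astra_token,
        "embedding-model": embedding_model,
        "vertexai-project": project_id,
        "google-application-credentials-file-id": credentials_file_id,
    }
    # Collect the names of headers whose value is None or an empty string.
    missing = [name for name, value in headers.items() if not value]
    if missing:
        raise ValueError(f"missing header values: {missing}")
    return headers
```

The resulting dictionary can be passed as default_headers when constructing the OpenAI client.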

At this point, you can go ahead and run chat completions using Vertex AI models such as chat-bison or Gemini Pro.

model = "chat-bison"
prompt = "Draw an ASCII art kitten eating icecream"
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": "You are an amazing ascii art generator bot, no text just art."},
        {"role": "user", "content": prompt}
    ]
)

print(f'prompt> {prompt}')
print(f'artist-{model}>\n{response.choices[0].message.content}')
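To compare models, the same request can be wrapped in a function that takes the client and model name as parameters. This is a sketch under assumptions: ascii_art is a hypothetical helper, and the "gemini-pro" model name in the usage comment is an assumed identifier that you should confirm against the full example in the repo.

```python
def ascii_art(client, model, prompt):
    """Send the ASCII-art prompt to `model` and return the reply text.

    `client` is expected to be an OpenAI-compatible client configured
    with the Vertex AI headers shown earlier.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You are an amazing ascii art generator bot, no text just art."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content


# Example usage (model names other than "chat-bison" are assumptions):
# for model in ("chat-bison", "gemini-pro"):
#     print(f"artist-{model}>\n{ascii_art(client, model, prompt)}")
```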

To get started building with Gemini Pro and Astra DB, check out the full example at the astra-assistant GitHub repo. You can also learn more about Astra DB vector search.
