April 4, 2022

Let’s Get Started with Terraform for Astra DB

DataStax has created an open-source Terraform provider for DataStax Astra DB. The provider makes it easy to script the creation of an Astra DB database, along with the roles, security tokens, and access lists for that database.

A provider is the translation layer delivered by a vendor that allows the vendor's platform to execute Terraform engine commands. For example, Amazon AWS has a Terraform provider that translates Terraform scripts into real actions (e.g., creating compute instances or any other AWS asset). Similarly, there are providers for Google, Azure, Oracle, and over a thousand other service providers. You can find all current providers on the registry.terraform.io page.

It's a little misleading to say the provider translates the Terraform commands. It's not as if the good folks at HashiCorp (the creators of Terraform) defined a single way of expressing every idea in cloud computing. That's impossible. Instead, they provide a framework through which Terraform gives you provider-specific commands for the provider of your choice. You’ll see precisely how this works when we create the Astra database.

Terraform is very easy to use once you get your head wrapped around it. It’s not a procedural language or an object-oriented one. Instead, you create files that describe the desired end-state of your database, after which Terraform will either create or destroy it, depending on your specific commands.

Getting started with Terraform for Astra

Getting started is as simple as installing Terraform and then instructing it to implement the Astra "provider."

To install Terraform:

  1. Click on this link and select the 'Download CLI' button.
  2. Follow the installation instructions from Terraform.
  3. Add the Terraform directory to your system's PATH environment variable so you can start it from anywhere.

Once Terraform is fully installed, create a folder on your local file system for your project, then change into that new folder.

mkdir hello_astra_tf
cd hello_astra_tf

In your new project folder, create a file with the name main.tf. Technically you could name it anything you want, but main.tf is the traditional name for the file that contains the terraform{} declaration.

terraform {
 required_version = ">= 1.0.0"
 required_providers {
   astra = {
     source = "datastax/astra"
     version = ">=1.0.0"
   }
 }
}

At this point, you technically have a working Terraform project – but it doesn't do much except install the Astra provider. To run it, use the following command:

terraform init

You’ll see the following output:

jeffdavies@jdavies-rmbp16 hello_astra % terraform init

Initializing the backend...

Initializing provider plugins...
- Finding datastax/astra versions matching "1.0.10"...
- Installing datastax/astra v1.0.10...
- Installed datastax/astra v1.0.10 (signed by a HashiCorp partner, key ID …)

(remaining output omitted for brevity)

So far, our main.tf file has provided two instructions:

  1. Ensure we're running Terraform version 1.0.0 or greater
  2. Load version 1.0.0 or higher of the Astra provider for Terraform

If you examine your project directory, you’ll see a new hidden directory, .terraform.

Pro Tip

When specifying versions, it's best to use the ">=" notation to ensure that you are getting the latest version (with the latest security patches, features, etc.).

Creating an Astra DB database

Let's modify our project to create an Astra DB database. But before you do, you'll need the following:

  1. An Astra DB Database account. You can register for free at astra.datastax.com/register.
  2. An Astra DB API security token for the Organization Administrator or Database Administrator role. The security token allows Terraform to connect to the Astra API layer to execute its commands. If you’re new to Astra and unsure how to create your security token, check out this page.

Pro Tip (Security)

I'm very sensitive to accidentally sharing my security tokens, and I hope you feel the same. When I created my organization security token, I put it in a setenv.sh file using an export statement, along with the other Astra information that will be needed:

export ASTRA_API_TOKEN=AstraCS:plusawholebunchofothercharacters
export ASTRA_ORGANIZATION_ID=<YOUR ORG ID>

Then, from the command line, I load those environment variables with the source setenv.sh command (Mac or Linux) and add a .gitignore rule for that file. That way, it’s highly unlikely that my security token will ever end up on GitHub!

Where can I find my Organization ID?

The easiest way to find your organization ID is to log in to your main dashboard. Your organization ID is the UUID that appears after astra.datastax.com/ in your browser's address bar.

Define variables

You can pass information into the Terraform process using variables. In our case, we need to pass in the ASTRA_API_TOKEN and ASTRA_ORGANIZATION_ID values from our environment variables. To do this, create a file called variables.tf and give it the following contents:

variable "token" {}
variable "organization_id" {}

That tells Terraform that the variables exist, but it doesn't define their contents. We’ll supply the values on the command line. Local variables are different; you’ll read about them in the next section.
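As a hedged sketch of where these declarations could go next: Terraform variable blocks also accept optional type, description, and sensitive arguments, and the token can be handed to the Astra provider through a provider block. The descriptions below are illustrative wording, not text from the provider. The provider's token argument is taken from its registry documentation, which also notes that the provider can read the ASTRA_API_TOKEN environment variable directly, so verify both against the provider version you install.

```hcl
# variables.tf: a fuller variant that would replace the two one-line declarations
variable "token" {
  type        = string
  description = "Astra DB API security token (illustrative wording)"
  sensitive   = true # keeps the value out of plan/apply console output
}

variable "organization_id" {
  type        = string
  description = "Astra organization UUID (illustrative wording)"
}

# main.tf: assumption, the astra provider accepts a "token" argument
provider "astra" {
  token = var.token
}
```

Marking the token as sensitive only affects what Terraform prints; the value is still stored in the state file.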

Define resources

Terraform describes a resource as: "...one or more infrastructure objects, such as virtual networks, compute instances, or higher-level components such as DNS records."

In simpler terms, a resource is something you create. In Astra, this will be things like databases, roles, tokens and access lists.

Get started by creating a resources.tf file in your project directory. Set the content of that file to the following:

locals {
 keyspace = "helloastra"
}
# Create the database
resource "astra_database" "hello_astra_db" {
 name           = "hello_astra"
 keyspace       = local.keyspace
 cloud_provider = "gcp"
 region         = "us-west1"
}

Save the file.

As you can probably guess, we’re instructing Astra to create a database named hello_astra with a keyspace of helloastra (the value of the local.keyspace variable) on the Google Cloud Platform (gcp) in the us-west1 region.

What isn't so easy to guess is that this creates a variable named astra_database.hello_astra_db in the Terraform process. We supplied only the fields required to create the database. Once it exists, the same variable also exposes additional fields such as .data_endpoint_url, the base URL for REST APIs, and .grafana_url, the URL for viewing the Grafana health charts for the database.

Now run the following command:

terraform plan -var="token=$ASTRA_API_TOKEN" \
-var="organization_id=$ASTRA_ORGANIZATION_ID" -out helloastra 

When the command finishes successfully, you can then execute the command:

terraform apply helloastra

And you’ll see the database being created:

jeffdavies@jdavies-rmbp16 hello_astra_tf % terraform apply helloastra
astra_database.hello_astra_db: Creating...
astra_database.hello_astra_db: Still creating... [10s elapsed]
astra_database.hello_astra_db: Still creating... [20s elapsed]
astra_database.hello_astra_db: Still creating... [30s elapsed]
astra_database.hello_astra_db: Still creating... [40s elapsed]
astra_database.hello_astra_db: Still creating... [50s elapsed]
astra_database.hello_astra_db: Still creating... [1m0s elapsed]
astra_database.hello_astra_db: Still creating... [1m10s elapsed]
astra_database.hello_astra_db: Still creating... [1m20s elapsed]
astra_database.hello_astra_db: Still creating... [1m30s elapsed]
astra_database.hello_astra_db: Creation complete after 1m39s [id=e57617d3-46cb-4243-a3e4-8b6e7e33c082]

Open the Astra DB web console and you'll see your database has been created:

Figure 1. The different database names

Pretty cool but minimally useful in practice. There’s little reason to use Terraform only to create databases with keyspaces. However, the database is just the top-level "container" for most projects. Real-world projects also contain roles, each of which needs a security token. You can also script the IP safe-listing using Terraform, which is where Terraform starts to shine for Astra DB.

Define a role

An Astra DB role is simply another resource type. Open up your resources.tf file and append the following text:

resource "astra_role" "hello_admin" {
 role_name   = "hello_admin"
 description = "Database administrator for the hello_astra database"
 effect      = "allow"
 # Select the resources for which we will create policies
 resources   = [
   # Identify our organization
   "drn:astra:org:${var.organization_id}",
   # Select the database we want to use

   "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}",
   # Specify the keyspace to which we need access

   "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}:keyspace:${local.keyspace}",
   # Select all of the tables in the database/keyspace

   "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}:keyspace:${local.keyspace}:table:*"
   ]
 policy      = [
   # Organization level policies
   # "org-audits-read", "org-billing-read", "org-billing-write",
   # "org-external-auth-read", "org-external-auth-write",
   # "org-notification-write", "org-read", "org-role-delete",
   # "org-role-read", "org-role-write", "org-token-read",
   # "org-token-write", "org-user-read", "org-user-write",
   # "org-write", "accesslist-read", "accesslist-write",

   # Database level policies
   "db-cql", "db-graphql", "db-rest",
   # "org-db-addpeering", "db-manage-privateendpoint",
   # "org-db-create", "org-db-expand", "org-db-managemigratorproxy",
   # "org-db-passwordreset", "org-db-suspend", "org-db-terminate",
   # "org-db-view", "db-manage-region",

   # Keyspace
   "db-keyspace-alter", "db-keyspace-authorize", "db-keyspace-create",
   "db-keyspace-describe", "db-keyspace-drop", "db-keyspace-grant",
   "db-keyspace-modify", "db-all-keyspace-create",
   "db-all-keyspace-describe",

   # Table Access
   "db-table-alter", "db-table-authorize", "db-table-create",
   "db-table-describe", "db-table-drop", "db-table-grant",
   "db-table-modify", "db-table-select", 
   ]
}

Clearly, there's a lot of thought required for creating a role. A solid understanding of the policies and the resources is critical. That's too much information to cover in this article, but you can find out more on the main Astra DevOps page and the Manage Roles with DevOps API page. 

Note

I commented out many of the policies because the role we're creating doesn't need them. However, I left them in the file as a quick and easy reference in case the role you define needs different policies. Comment and uncomment at your whim!

There are two main sections in the role – the resources and the policy arrays – that are closely related. The resources section is where you select the resources you want to set policies for. So, in our resources section:

resources   = [
   # Identify our organization
   "drn:astra:org:${var.organization_id}",
   # Select the database we want to use
   "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}",
   # Specify the keyspace to which we need access
   "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}:keyspace:${local.keyspace}",
   # Select all of the tables in the database/keyspace
   "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}:keyspace:${local.keyspace}:table:*"
   ]

Here, we're telling Terraform we want to create a role with privileges that are specific to: 

  • our organization
  • the database we are creating
  • the keyspace in that database
  • the tables in that keyspace

If you look at the Manage Roles with DevOps API page and scroll down, you’ll see a table that maps policies to specific resource types. Each policy applies only to certain resource types, and mismatching the two is an easy mistake to make. Suppose you add policies to access the tables but don’t include the last resource for all tables (:table:*). You end up with table policies that have no resource to which they can attach.
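To make the pairing concrete, here is a minimal sketch of a narrower, read-only role. The role name hello_reader and its description are hypothetical. The policy names and resource strings are all taken from the hello_admin role above, but which policy pairs with which resource type is my reading of that example, so confirm it against the table on the Manage Roles with DevOps API page.

```hcl
# Hypothetical read-only role; "hello_reader" is not part of the original project.
resource "astra_role" "hello_reader" {
  role_name   = "hello_reader"
  description = "Read-only access to tables in the hello_astra keyspace"
  effect      = "allow"

  resources = [
    # Database resource: assumed to pair with the db-cql policy
    "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}",
    # Keyspace resource: pairs with the db-keyspace-* policy
    "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}:keyspace:${local.keyspace}",
    # Table resource: without it, the db-table-* policies have nothing to attach to
    "drn:astra:org:${var.organization_id}:db:${astra_database.hello_astra_db.id}:keyspace:${local.keyspace}:table:*",
  ]

  policy = [
    "db-cql",
    "db-keyspace-describe",
    "db-table-describe",
    "db-table-select",
  ]
}
```

Every policy in the list has a matching resource line, which is the property the paragraph above describes.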

Define a security token

Now that we have a role, we need to define a security token users can employ to authenticate the right to use that role (and access the database and its data). While still editing the resources.tf file, append the following lines of code:

# Create a security token for our hello_admin role
resource "astra_token" "api_token" {
 roles = [astra_role.hello_admin.role_id]
}

There can be any number of roles in the roles array. Think about tokens aggregating roles, not working with a single role (as we are doing in this simple example).
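One way to sketch a token that aggregates several roles is to combine the hello_admin role with a list of other role IDs passed in as a variable. The variable extra_role_ids and the resource name combined_token are hypothetical; concat is a built-in Terraform function.

```hcl
# Hypothetical input: IDs of roles that already exist in the organization.
variable "extra_role_ids" {
  type        = list(string)
  description = "Additional Astra role IDs to attach to the token"
  default     = []
}

# One token that carries hello_admin plus any extra roles passed in.
resource "astra_token" "combined_token" {
  roles = concat([astra_role.hello_admin.role_id], var.extra_role_ids)
}
```

With the default empty list, this behaves like the single-role token above.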

Define access lists

An access list defines the IP addresses allowed to connect to the database. If you don't define an access list, then ALL IP addresses are allowed to connect (assuming they have the correct security token). If you know of specific IP addresses or classless inter-domain routing (CIDR) blocks you want to allow to connect (this will still require the security token), you can define your rules using the template code below.

# Allow any IP to access the database. In practice, you should
# lock this down so only the Google Functions IP address can hit
# the database. However, this is not always practical as the 
# Google functions IP addresses are ephemeral (as are many types
# of IP addresses by many different sources).
# Even with the IP whitelisting you still need your Astra security
# token. This is just an extra layer of security.
resource "astra_access_list" "website" {
  database_id = astra_database.hello_astra_db.id
  addresses {
    # Allow any IP to connect
    request {
      address = "0.0.0.0/0"
      enabled = true
    }
  }
}

The addresses{} object can contain multiple request{} objects, so your IP safe-listing can become very fine-grained. The address value within each request{} is formatted as a CIDR block.
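As an illustration of multiple request{} objects, the sketch below is a variant of the "website" access list that replaces the open 0.0.0.0/0 rule with two narrower ones. It follows the block shape used in this article, which may differ in newer provider versions. The CIDR blocks are placeholders from the ranges reserved for documentation (RFC 5737), not real hosts.

```hcl
# Variant of the "website" access list with two specific rules.
resource "astra_access_list" "website" {
  database_id = astra_database.hello_astra_db.id
  addresses {
    # A whole /24 network, for example an office range (placeholder)
    request {
      address = "203.0.113.0/24"
      enabled = true
    }
    # A single host expressed as a /32 block (placeholder)
    request {
      address = "198.51.100.7/32"
      enabled = true
    }
  }
}
```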

Pro Tip (Security)

The terraform.tfstate file, written when you run the terraform apply command, contains sensitive information and should never be stored in a public repository.

Providing interesting output

Our script generates a lot of information we’ll want to use elsewhere, namely the Astra DB security token for accessing our database using our role. To have Terraform show us the values we're interested in, open up the main.tf file and append the following lines at the bottom (outside of the terraform{} declaration):

# Output the security information
output "organization_id" {
 value = var.organization_id
 description = "The organization ID"
}

output "database_id" {
 value = astra_database.hello_astra_db.id
 description = "The database ID"
}

output "token" {
 value = astra_token.api_token.token
 description = "Token information - DO NOT LOSE"
}

output "client_secret" {
 value = astra_token.api_token.secret
 description = "Token information - DO NOT LOSE"
}

output "client_id" {
 value = astra_token.api_token.client_id
 description = "Token information - DO NOT LOSE"
}

output "cqlsh_url" {
 value = astra_database.hello_astra_db.cqlsh_url
 description = "CQL Shell URL"
}

output "graphql_url" {
 value = astra_database.hello_astra_db.graphql_url
 description = "GraphQL URL"
}

output "data_endpoint_url" {
 value = astra_database.hello_astra_db.data_endpoint_url
 description = "Data Endpoint URL (REST API)"
}

output "grafana_url" {
 value = astra_database.hello_astra_db.grafana_url
 description = "Grafana URL"
}

### These commands are not displayed after the "apply"
output "replication_factor" {
 value = astra_database.hello_astra_db.replication_factor
 description = "Replication Factor"
}

output "node_count" {
 value = astra_database.hello_astra_db.node_count
 description = "Node Count"
}

output "total_storage" {
 value = astra_database.hello_astra_db.total_storage
 description = "Total Storage (GB?)"
}

output "hello_admin_api_token" {
 value = astra_token.api_token.token
 description = "Generated API token for the hello_admin role"
}

output "hello_admin_client_id" {
 value = astra_token.api_token.client_id
 description = "Client ID (aka username) for the hello_admin role"
}

output "hello_admin_client_secret" {
 value = astra_token.api_token.secret
 description = "Secret (aka user password) for the hello_admin role"
}
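Because several of these outputs carry credentials, it may be worth masking them in the console. The sketch below is a variant of the hello_admin_api_token output that uses Terraform's sensitive argument for output blocks. If the provider itself marks an attribute as sensitive, Terraform requires this argument on any output that exposes it.

```hcl
# Variant of the hello_admin_api_token output with masking enabled.
output "hello_admin_api_token" {
  value       = astra_token.api_token.token
  description = "Generated API token for the hello_admin role"
  sensitive   = true # shown as <sensitive> in the apply summary
}
```

A masked value can still be read on demand with terraform output -raw hello_admin_api_token, and it is still stored in plain text in the state file.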

Watching it all work together

We've done a lot of editing. Now let's run our project to see the fruits of our labor. Execute the commands:

terraform plan -var="token=$ASTRA_API_TOKEN" -var="organization_id=$ASTRA_ORGANIZATION_ID" -out helloastra

terraform apply helloastra

Destroying what you have wrought

Terraform also lets you delete/destroy what you create. After successfully performing an apply action in Terraform, you can run the destroy action to remove everything the apply action created.

Destroying everything you created is handy when you're first developing your scripts and learning Terraform. The command to destroy/delete your Astra database with all of its roles and security tokens takes the following form:

terraform destroy -var="token=$ASTRA_API_TOKEN" \
-var="organization_id=$ASTRA_ORGANIZATION_ID"

How do I upgrade my providers? I want the latest version!

The file .terraform.lock.hcl "locks" in the provider versions last used. If you want to update all your providers to the current version, delete the .terraform.lock.hcl file and run the terraform init command again.
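An alternative to deleting the lock file is the -upgrade flag of terraform init, which asks Terraform to select the newest provider versions your constraints allow and to update the lock file accordingly. The constraint itself then decides how far an upgrade can go. The sketch below is a variant of the terraform{} block that uses the pessimistic operator (~>) in place of >=. The bound of 1.0 is only an example.

```hcl
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    astra = {
      source = "datastax/astra"
      # "~> 1.0" accepts any 1.x release but not 2.0 or later,
      # so an upgrade cannot cross a major version unnoticed.
      version = "~> 1.0"
    }
  }
}

# Then, from the shell: terraform init -upgrade
```

Whether to prefer >= or ~> is a trade-off between always getting the newest release and avoiding surprise breaking changes.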

What the Astra DB provider will not do

The Astra provider is great at what it does: creating databases, roles, security tokens, and access lists. In its current form, that's all it does. It does not create tables or populate data. To do this, you still need to use traditional methods.

Hopefully, the veil of mystery that surrounds Terraform – and the Astra provider more specifically – is now removed. The Astra provider is under active development, so be sure to check for updates often. You can find the main Astra DB provider page in the Terraform Registry, which has complete instructions to guide you through all of its capabilities.

Ready to go deeper or contribute to the open-source project? Visit the Astra Provider GitHub page. You can also join the DataStax Developer's Discord channel to learn more. 

Resources

  1. GitHub Code
  2. Manage roles with the DevOps API
  3. Terraform by HashiCorp
  4. DataStax Astra DB
  5. Register for an Astra DB account
  6. DataStax Community Platform
  7. DataStax Academy
  8. DataStax Certifications
  9. DataStax Workshops
