Technology | October 26, 2020

[Webcast] Astra on GCP: Spring into Action and Get Up and Running In Minutes with Cassandra

Matt Kennedy, Product Strategy, DataStax
Christopher Bradford, Product Management, DataStax

We've recently provided you with an intro to Astra and a guide to migrating open-source Cassandra to DataStax Astra. What's next? It's time to spring into action with hands-on technical learning and the fundamentals to get you up and running in minutes with Cassandra.

Join us in this recorded webcast where we discuss:

  • Using REST & GraphQL for no-code middle tiers
  • Spring Boot
  • Quarkus

Roll up your sleeves and give Astra a try. Sign up for the free 5GB Astra tier and start building today!

Introduction

Matt Kennedy (00:00): Good morning, and thank you for joining us for today's webinar, Spring into Action Using Astra on Google Cloud. My name is Matt Kennedy, I'm a product manager here at DataStax, and I'm joined today by Chris Bradford, another one of our PMs, who works on our Kubernetes operators. So, today we are going to cover the Google Kubernetes Engine, better known as GKE, and we are going to talk about that in the context of the Cassandra operator. Then we're going to do a little bit of exploration of our new DevOps API and how we can use it in concert with our other data APIs to use Astra almost completely outside of the Astra control panel. And finally, we'll take a look at a sample app that helps people get started with Spring and Astra, which you can launch into a web-based IDE directly from the Astra console. So all of that said, let's dive right into Kubernetes. Chris, I will hand it over to you.

Chris Bradford (01:01): Yeah, thanks, Matt. So like Matt said, I'm Chris Bradford, and I'm the PM for the Kubernetes operator from DataStax. And so, we really want to talk about the history of this and also kind of what it does. So first and foremost, we start with the Astra interface. And, you can see how it's a dashboard and it's a platform as a service, or Cassandra as a service as we like to call it. But what is behind that curtain? What is underneath this chrome, this veneer? And, really the core is GKE along with the cass-operator, or the DataStax Kubernetes Operator for Apache Cassandra. But it's all woven together. It's kind of neat though, where this project started. We knew that we needed to be agile and we needed to be cloud native, and an operator logically made sense. We were very keen to keep the role of cass-operator solely focused on managing Cassandra clusters. And the reason we did that is we said, "You know what, other people would probably want to see this too." So we had an eye on making it available as we started going down this path.

Astra & GKE - The Machinery Powering it All

Chris Bradford (02:11): So, let's talk about what an operator is and what it does. In the case of cass-operator, it handles a number of automations. It can handle the provisioning of a cluster, it handles replacing nodes when they go down, actually replacing node instances, and it takes care of the day-to-day operations as well as simply spinning up a cluster. So like I said: provisioning, replacement, scale up, scale down, automatic security, deploying configuration changes, restarting nodes across the cluster, and handling version upgrades.

Chris Bradford (02:46): Today we're going to talk about a cluster that spans three availability zones, or failure domains, and has everything placed within a single namespace. Here you can see six worker nodes across three failure zones, and at the cluster level is our Kubernetes cluster. So, first things first, we have to install the operator. It looks like there are a number of steps involved, but really it's two commands. We use Helm. You add the Helm repository and you say, "helm install cass-operator." That's the name of this installation as well as the chart to use, which in this case is datastax/cass-operator.
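For reference, here is a minimal sketch of those two commands. The chart repository URL is the one documented for cass-operator at the time, so verify it against the current docs:

```bash
# Add the DataStax Helm chart repository (URL as documented for cass-operator
# at the time; verify against current docs) and refresh the local index.
helm repo add datastax https://datastax.github.io/charts
helm repo update

# Install the operator. "cass-operator" is the release name, and
# datastax/cass-operator is the chart, as described above.
helm install cass-operator datastax/cass-operator
```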

Chris Bradford (03:33): And, what this will do is set up a number of resources. Well, it doesn't set up a namespace with Helm; if you are doing this by hand, you can specify the namespace as part of the helm command. It sets up a secret to hold the webhook TLS certificates. It creates a custom resource definition that defines a logical Cassandra Datacenter. We are going to talk about that more here in a few minutes. A ClusterRole, a ClusterRoleBinding, and if we drop down a couple more, there's a Role and a RoleBinding. These are permissions that the operator needs to perform things automatically within the Kubernetes cluster. Finally, there's a service that sits in front of the operator's webhook endpoint, a deployment which controls how many instances of the operator are running, and then a ValidatingWebhookConfiguration. And, this is a really interesting feature. If you try to submit a change that the operator thinks shouldn't be made, it can actually reject that change instead of putting your cluster into a state that is either unknown or less than ideal.

Install Cass-Operator

Chris Bradford (04:38): First things first, when we say, "Install the operator," a number of these resources are created. We just went over those. And, we use a Kubernetes deployment and we say, "Hey, Kubernetes, we want to have at least one instance of cass-operator running at all times." Kubernetes sees that that instance is not running. It will then spin up an instance of the cass-operator on one of the worker nodes and it's good to go. It's ready to start receiving requests.

Cassandra Datacenter Custom Resource

Chris Bradford (05:03): Now, let's talk about that custom resource that I mentioned just a minute ago. So, a custom resource is a way to describe something interesting inside of Kubernetes. You're already used to working with regular Kubernetes resources like pods, services, deployments, those kinds of things. But there are ways to extend that API. And so, we've extended that API with the concept of a logical Cassandra Datacenter.

Chris Bradford (05:27): A logical Cassandra Datacenter is made up of a number of nodes; racks, which usually equate to AZs in cloud terminology; the type of nodes that you want to deploy, whether that's open source Cassandra or DataStax Enterprise; the Cassandra version; storage information such as the amount of disk available per node; and any configuration tweaks you want to make. So out of the box, we ship a default Cassandra configuration, and it's tuned a little bit to operate inside of Kubernetes. But if you want to override any of those parameters, you can. And, that happens here inside of this custom resource. Finally, you take this YAML that you see here on the right-hand side of the screen, push that into a YAML file on disk, and you submit it to the cluster with kubectl.
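A minimal sketch of what that manifest and kubectl submission can look like. Field names follow cass-operator's v1beta1 CRD; the storage class, sizes, and config overrides are illustrative placeholders:

```bash
# Sketch of a CassandraDatacenter custom resource submitted with kubectl.
# Field names follow cass-operator's v1beta1 CRD; values are placeholders.
cat <<'EOF' | kubectl apply -f -
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster1
  serverType: cassandra          # or "dse" for DataStax Enterprise
  serverVersion: "3.11.6"
  size: 6                        # total Cassandra nodes across all racks
  racks:                         # racks usually map to availability zones
    - name: rack1
    - name: rack2
    - name: rack3
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard # placeholder storage class
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 64Gi
  config:                        # optional overrides of the shipped defaults
    cassandra-yaml:
      num_tokens: 16
EOF
```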

Chris Bradford (06:14): Now, when you do that, you've basically described the end state of the cluster. You aren't choosing a particular pod or setting up StatefulSets, anything like that. The operator handles setting that up for you. So let's talk about some of the terms here, because you might be familiar with a Cassandra node and never have heard of pods, or you might know what a pod is and not a Cassandra node. In this case, a Kubernetes pod is a collection of containers that need to run together to perform some sort of service. Here, the service is a Cassandra node. In our environment, a Cassandra pod is made up of three containers. The first container is called the config-builder. You'll notice that that's in a dashed box up at the top. The config-builder's job is to take those configuration overrides that you specify inside of the custom resource and render them out to disk as the configuration files that Cassandra already understands. So, that is run before any of the other containers are started.

K8s StatefulSet == C* Rack

Chris Bradford (07:09): Next we have the Cassandra container. And, you'll note that it says Cassandra with management API. We'll go into that more in a second, but it's two processes inside of a single container. And then finally, we have this third container, which is just a simple busybox container that runs the tail command on the log file for Cassandra. So, if you want to see the log output with kubectl logs, that works. And, we have a persistent volume claim and a persistent volume as well, which we'll discuss as we move forward. Now, inside of Cassandra there's also the concept of racks. And, racks are used to define failure zones. So if an entire rack goes down, we want to have replicas of your data in other racks. In this case we leverage something called a StatefulSet. A StatefulSet inside of Kubernetes allows a collection of pods to be provisioned together, to have persistent identity and storage, and to be scaled in a specific way. So in this case, a single StatefulSet represents a Cassandra rack.
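As a quick aside, that means plain kubectl is enough to tail a node's logs. A minimal sketch, assuming cass-operator's usual labels and pod naming, and its "server-system-logger" container name (all worth verifying for your operator version):

```bash
# List the Cassandra pods the StatefulSets created (label and pod names
# follow cass-operator's conventions; verify them for your version).
kubectl get pods -l cassandra.datastax.com/cluster=cluster1

# Stream the Cassandra log through the tailing busybox sidecar; the container
# name "server-system-logger" is an assumption based on cass-operator's manifests.
kubectl logs -f cluster1-dc1-rack1-sts-0 -c server-system-logger
```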

C* Container

Chris Bradford (08:13): So back to that Cassandra container: I mentioned that it was the Cassandra process and the management API. One of the things that we did when we started working on this is we said, "We need a little bit more out of Cassandra to make it cloud native, to make it play nice inside of a container space." So, we created this thing called the Management API for Apache Cassandra. This is a simple Java process that gets started when the container comes up. It's baked into every DataStax container image, but usually it's not enabled unless you set a specific environment variable that says, "Enable the management API." Then it will start that instead of just starting Cassandra. What this does is, instead of exposing a bunch of JMX endpoints, or a single JMX endpoint with a bunch of commands that can be run across it, we now have a REST API that can be used to get the status of the cluster and of the node, do configuration and lifecycle management such as starting, stopping, and joining the node, manage keyspaces, and manage users, all over a RESTful interface that's secured with TLS.
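To make that concrete, here's an illustrative sketch against the open-source Management API for Apache Cassandra. The paths follow that project's v0 REST API but should be treated as assumptions to verify, and in a real cass-operator deployment this endpoint is secured with TLS rather than plain HTTP:

```bash
# Probe the node's liveness over the management API (path per the
# management-api-for-apache-cassandra project; an assumption to verify).
curl http://localhost:8080/api/v0/probes/liveness

# Ask the management API to start the Cassandra daemon in this container.
curl -X POST http://localhost:8080/api/v0/lifecycle/start
```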

Chris Bradford (09:15): Additionally, when you start Cassandra, we then start the Cassandra daemon, and all of that other functionality happens across a Unix socket local to the pod. Now, this gives us a lot of flexibility and power, because we can say, "Hey, I want you to, for instance, provision all of my machines, all of my containers, but don't start the processes yet. I want to start them in a particular order." So here we've said, "Hey, Cassandra operator, I want a six node cluster and I have three racks for the availability zones. Let's go ahead and deploy across those racks." cass-operator says, "All right, cool. I can do that." And, you'll note here we show that being on the first worker node. The pink boxes are the logical racks, and they're signified by a StatefulSet. There's also a seed list and a list of all nodes exposed as services.

Create StatefulSets

Chris Bradford (10:18): So the first thing we do is we say, "Hey, Kubernetes, make my StatefulSets." Kubernetes creates the StatefulSets, but then it runs its own reconciliation loop and says, "Oh, these StatefulSets don't have any pods and I need two pods per StatefulSet. Let me go ahead and schedule those." So, here the pods start up, and they're showing in red because they're not fully online yet. The container is running, the pod is running, but it's just running the management API. Cassandra hasn't actually started yet. And when the operator sees this state it can say, "Oh, hey, all of the nodes that I'm looking for are there, they're just not running yet. Let me go ahead and take the next steps." And now, cass-operator reaches out over that management API through that REST interface, secured with TLS, and starts telling the nodes to start one at a time.

Chris Bradford (11:04): So, here we see this node. We send the command that says, "Hey, go ahead and start up." And, it starts its process. And, it has to bootstrap. It determines the state of the ring, which in this case is empty, and over time it'll eventually start up. And when it finishes starting, cass-operator says, "Hey, now I can move on to the next node." So, this node is green. And, if you look at the ready state inside of kubectl, it says "2/2." It moves on to the next pod, and that process continues around the cluster until all nodes are up and available.

Configuration Change

Chris Bradford (11:42): Now, what if we want to make a change to our cluster? So here I'm asking the operator to change the version of my Cassandra cluster from 3.11.6 to 3.11.7. We go back to that YAML file that we started with and change the six to a seven. We run kubectl apply against the cluster. And cass-operator says, "Wait a second. All of these pods that already exist are not running the same version as what's been requested," because we described the end state. So it says, "Okay. What I'm going to do is I'm going to terminate the first pod." And when it terminates that pod, Kubernetes sees, "Hey, I have a pod that should be running here and it's missing. Let me go ahead and bring it back up." And, it brings it up with the new version. The operator sees over the management API that the new pod is running and tells it to go ahead and start, and then it's back online. From there, it rolls around the cluster and continues with this process.
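In practice, the whole upgrade is an edit plus a re-apply. A minimal sketch, with an illustrative manifest file name:

```bash
# Bump the version in the manifest we applied earlier (file name is
# illustrative), then re-apply; the operator rolls the change pod by pod.
sed -i 's/serverVersion: "3.11.6"/serverVersion: "3.11.7"/' cassdc.yaml
kubectl apply -f cassdc.yaml

# Watch pods get terminated and recreated one at a time during the rollout.
kubectl get pods -w -l cassandra.datastax.com/cluster=cluster1
```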

Chris Bradford (12:47): The same process applies to changing a configuration value. If I make a change to cassandra.yaml, or if I enable additional security options, it follows that same process. And, that wraps up a quick overview of the management API and cass-operator. If you have any questions, please feel free to type them into the Q&A widget, and I'll be here to answer them when my colleague Matt has finished with his demos. Thank you.

Using Astra APIs

Matt Kennedy (13:19): Awesome, Chris. Thank you. And, we did get a few in, so we'll cover those answers at the end. Now, moving on to using Astra APIs. In a prior webinar, we went over how to use the GraphQL and REST endpoints that get auto-generated from the data in your tables. What we're going to show this time is what we call the DevOps API. That is an API that, rather than addressing the data in your database, allows you to control Astra itself. In other words, the things that you would do by pointing and clicking in the user interface, the DevOps API allows you to script. And in order to do that, you're going to have to activate a service account for your account. You will find that in your organization menu.

Activating the Service Account

Matt Kennedy (14:13): So, a lot of the times when you log into Astra, you'll be taken to the dashboard or you'll be taken to a create database screen. In this case, what you want to do to find this service account is go to the organization menu at the top, click the organization that you want to activate the service account on, and then scroll all the way to the bottom, and you'll click the button that says, "Add service account." It's as simple as that, really, really straightforward. So that said, let's jump right into the demo and see how this all gets used together along with the data APIs.

API Demo

Matt Kennedy (14:48): This demo is going to show how to create an Astra database and add a table to it, and then fetch data from it. But we're going to do it all without really using any of the user interface that we have in Astra. So, the first thing we're going to do is get to an organization. So I've just logged in, and you can see I have some organizations here. I am going to choose this personal account that I have, and it says I don't have any databases yet. Ordinarily, I would go to the add database dialogue and create a database there. Instead, what I'm going to do is use our DevOps API to create that database. And in order to do that, I need a service account. So, the first step is we're going to come down here to the bottom of this page and activate a service account. I'll click the add service account button. I have a slide out here that says, "Create a service account to manage your database using the Astra DevOps API."

Matt Kennedy (15:49): I'm going to take this opportunity to open that in a window and let our docs load up, and I'll switch back to the other tab, and I will click add. Okay. So, now I have a service account here. You'll see that I've got a client ID and my client name, and it's an active account. So, what I'm going to do now is copy the credentials for that account. Okay. That's going to put a JSON object into the clipboard, basically. And, I'm going to use that in a minute. But what I want to show you right now is our documentation for this API. There are really two steps to using this. First, I have to create a token, which I get by authenticating into the service using that service-account JSON object I just talked about. And then, I have operations I can do, which really run the gamut: almost everything you can do by pointing and clicking in the user interface, you can now do with an API call.

Matt Kennedy (16:53): So, we're going to go into detail on how to use those API calls here in just a minute. Okay. Before we start making calls, I'm going to set a few environment variables in my terminal here. So, all I've done is I've set an Astra username, an Astra password which is setecastronomy, and a string for that service account. What I did here was paste in the service-account JSON object, and then, just so the quotes don't disappear when I assign it to the variable, I escape the quotes. Okay? So now if I do echo $SVCACCT, you will see that it is correctly quoted the way a JSON object needs to be if we're going to pass it in an API call. And, now I can use that variable in the curl commands that I'm going to run. Make sense? Okay, cool.
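A sketch of that setup, with placeholder values. The service-account JSON fields shown (clientId, clientName, clientSecret) match what Astra's console handed out at the time, but treat the exact names as assumptions:

```bash
# Placeholder demo credentials.
export ASTRA_USER="MKADM"
export ASTRA_PASS="setecastronomy"

# The service-account JSON copied from the Astra console, with the inner
# quotes escaped so the shell preserves them. Field names are assumptions
# based on Astra service accounts of this period; values here are fake.
export SVCACCT="{\"clientId\":\"fake-client-id\",\"clientName\":\"user@example.com\",\"clientSecret\":\"fake-secret\"}"

# Sanity check: should print valid JSON with the quotes intact.
echo $SVCACCT
```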

Matt Kennedy (17:51): I have pasted in the curl command that we use to do the authentication of the service account. This is literally copied and pasted straight from the doc site, with the exception of this -i, which I include because I like to see the HTTP headers that come back from curl. Sometimes there's some extra information in there, especially if you hit an error. That can be really handy to have. I'm going to add a data section that has that service account in it. So, in order to do that, we will include our service account object here. And now, I hit return, and I should get a bearer token back. So, indeed you see that I have an HTTP 200 OK. And, in fact I have this token here, which means that our authentication attempt was successful.
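A sketch of that call. The URL is the v2 DevOps API authentication route from the docs of this period, so verify it before relying on it:

```bash
# Exchange the service-account credentials for a bearer token. The -i flag
# prints response headers, which helps when debugging errors.
curl -i -X POST https://api.astra.datastax.com/v2/authenticateServiceAccount \
  -H "Content-Type: application/json" \
  -d "$SVCACCT"

# The response body contains a token; stash it for the calls that follow.
export DOATOKEN="paste-the-returned-token-here"
```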

Matt Kennedy (18:45): Now, I am going to grab this and copy that to the clipboard, and I am going to go ahead and put that into a variable which we will call our DevOps API token, DOA token. Why not? Okay. So, we'll paste that in and now we've got that. So, I'm going to clear my screen just to make things a little bit more legible, and then let's pop back to the documentation page.

Matt Kennedy (19:15): So, here we are on the documentation page for the create a new database call. It's a pretty simple little three lines of curl, but we need more than that. And, this is where the doc site helps us with the commands that we need to execute. So, I'm going to give the database a name in this field here. So, we'll call it DevOps for the win. And you'll note that when I did that, we added the JSON that is required for the call to be successful into the curl that we're then going to execute. But let's fill out the rest. So, we need a keyspace name which we will make ks1. We're going to use GCP for this demo. We're going to be using the developer tier. It has a single capacity unit. We're going to use us-east1. And then, remember we set that username to MKADM, and we used setecastronomy for our password. Let's make sure I spelled that okay. Yup.

Matt Kennedy (20:18): Let's look at the command. So, we have all of our parameters in here. What we don't have is our bearer token, but that's okay because I can add that by just pasting that in here. So once that's in there, I can go ahead and say, "Try it." And in fact, I got a 201 created back, and I can quickly go and check in the Astra console that this is actually a database that now exists or is being created. So now that we're back at the Astra console, you can see that I do in fact have a DevOps database. Status is pending. That just means it is in the process of creation, which shouldn't surprise us because we just issued the create command. So, we'll give that a minute to finish creating and then we will go back to our terminal and do some more actions with the API.
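Pulled together, the create call looks roughly like this. The path and field names mirror the doc page Matt fills out, but treat the exact shapes as assumptions to check against the current docs:

```bash
# Create a database via the DevOps API, using the values from the demo.
curl -i -X POST https://api.astra.datastax.com/v2/databases \
  -H "Authorization: Bearer $DOATOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "devopsftw",
    "keyspace": "ks1",
    "cloudProvider": "GCP",
    "tier": "developer",
    "capacityUnits": 1,
    "region": "us-east1",
    "user": "MKADM",
    "password": "setecastronomy"
  }'
# A 201 Created response means provisioning has started; the console shows
# the database as "pending" until it finishes.
```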

Matt Kennedy (21:15): Here we are back at the docs page. Now that I have a database up and running, I can get information about that database. So, I have a configured command here. I see that I get a JSON object back, and this has information about my database. Now, I'm going to point out this data endpoint URL. This is the next thing that we're going to be using, because this is the REST endpoint. If I authenticate to that REST data endpoint, I can connect to my database, create a table, and put some data in it. And, I can do all of that without using any CQL, which is a little unusual for a Cassandra database, but it's a pretty cool capability of Astra. So, let's see how we do that.
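A sketch of that lookup. The database ID placeholder has to come from the create response or a list-databases call, and the exact path is an assumption from the docs of the period:

```bash
# Fetch the database record; the response JSON includes the data endpoint
# URL used for the REST (data) API in the next steps.
curl -s https://api.astra.datastax.com/v2/databases/<database-id> \
  -H "Authorization: Bearer $DOATOKEN"

# Stash the data endpoint URL (copied from the response) for later calls.
export DATA_URL="paste-the-dataEndpointUrl-here"
```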

Matt Kennedy (22:06): Okay. So I am back to my trusty terminal, and I took that URL that was there for that data endpoint and I just pasted it in. I'm assigning it to another variable. We're going to use that in a curl command. So, let's see what that command is going to look like. So, I'm not going to go through the docs for this particular call because that's been done in a prior webinar. Happy to refer you to that if you go ahead and reach out. What we're going to do here is authenticate to that data endpoint. And to do that, because it's to a specific database, we're going to use that Astra user and the Astra password. The other interesting thing in here is we're using this uuidgen command to make sure that we have a unique ID to identify this request.
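A sketch of that authentication call, assuming the data endpoint URL already includes the REST API base path and that the /v1/auth route and headers from the docs of the period still apply:

```bash
# Authenticate to the database's REST data endpoint with the database
# credentials; uuidgen supplies the unique request ID Matt mentions.
curl -i -X POST "$DATA_URL/v1/auth" \
  -H "x-cassandra-request-id: $(uuidgen)" \
  -H "Content-Type: application/json" \
  -d "{\"username\":\"$ASTRA_USER\",\"password\":\"$ASTRA_PASS\"}"

# Stash the returned auth token for the table and row calls.
export TOKEN="paste-the-returned-auth-token-here"
```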

Matt Kennedy (22:58): So let's go ahead and hit return, and we should get a token back. And, we do. So I'm going to again copy this token, and I will say... This time I'm going to say, "Export token equals that." And now I've got my token assigned to a variable, and I can go ahead and interact with the REST service via the convenience of curl. So, the first thing that we're going to do is create a table. And in order to create that table, we have a little bit of a lengthy command. So, I'm going to paste that in from a text document. We're going to go through how you would create this if you were doing it from scratch. So, obviously you can look at our samples. This top set of headers looks pretty familiar, right? We've got our uuidgen in here, and here we have another header that just passes in that token that we were using a minute ago. But now we've got this big data section which again is a big blob of JSON. And, we need to know what we could put in there.

Matt Kennedy (24:05): So I'm going to go ahead and run this, but I'll show you how we get this by using the documentation site. Okay. Great. So, that succeeded. I'm going to jump back over to the doc site and show you how you can figure out what goes into the JSON object that we need in order to create a table. Right. So, note that on the left-hand nav, I have switched which API I am looking at. I am now looking at the DataStax Astra data API, and I am going to go under tables and say, "Add a table." Right? Because that's the command I just executed. You'll note that there's a little bit of that JSON object created, but not the whole thing. So, what I need to do is fill out these values in here. And if I do that, I don't need the path params, because I'm using that URL that we copied from the data API URL in the JSON we got back. But what I do need is to add some column definitions here.

Matt Kennedy (25:14): So to add a column definition, I need a name for it. For this, I'm just going to use name, and we'll make it text. This is not static. And then, we will add another column. This will also be text. We're going to have a person's name, and then a fact about the person, and then we'll say, "Hey, when did we learn that fact?" And, that'll be a date. Okay? In order to make a real table we need a primary key. And so, that primary key will be name. We'll just pretend that that's a unique identifier for this particular use case. And when we go back up here, we see that we now have a JSON object that includes all of those column definitions that we need. Now, we've already run this, so that table does exist. I'm going to cheat a little bit and go back to that database here, and show you that it does in fact have that table created.
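Assembled, the add-table call looks roughly like this; the path and body shape follow the "Add a table" doc page Matt walks through, so treat them as assumptions to verify:

```bash
# Create the nocql table over REST, with the three columns and primary key
# from the demo.
curl -i -X POST "$DATA_URL/v1/keyspaces/ks1/tables" \
  -H "x-cassandra-request-id: $(uuidgen)" \
  -H "x-cassandra-token: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "nocql",
    "columnDefinitions": [
      {"name": "name", "typeDefinition": "text", "static": false},
      {"name": "fact", "typeDefinition": "text", "static": false},
      {"name": "when", "typeDefinition": "date", "static": false}
    ],
    "primaryKey": {"partitionKey": ["name"]}
  }'
```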

Matt Kennedy (26:18): So we'll go into the CQL console, and I'm going to log in. I'll enter my password, and then I'm going to use ks1. And, let's look at what's in there. We should see the table we just created. Describe keyspace ks1. And in fact, we do see table ks1.nocql. So, that's awesome. We created a table through the REST API. We have only used the Astra console to inspect what we have done; we have not executed anything through it. So now let's go and add some data to that table, and then we will have completed the entire process, through the DevOps API and the REST data API, of creating a database, adding data to it, and getting that data out via the API. And really, once you have figured that out, you can use Astra to automate all kinds of processes in your build pipeline. This is effectively what you need to accomplish CI/CD using Astra: an automatable way to create Astra databases and interact with them. And that's what we've shown in this recording, the steps to make all of that a reality.

Matt Kennedy (27:50): So, let's pop back to our terminal and do that final step of inserting some data. To do that, we're back at our documentation for the Astra data API, and we need to look at how we add rows. So, this is our documentation for adding rows. We again have the fields that help us fill out the JSON that we're going to need. In this case, we are filling out the path params here at the top. Remember, we still have the path parts in our variable. But what we don't have is what we want to put into that table. So, here I am going to say... Remember we had a name column, and that value is going to be Matt, because I'm going to add a fact about me. Then I had a fact column, and the fact here is "needs a haircut." And then, we'll add one more to say when this happened, and the when is 2020-10-02. Then we can go ahead and basically take this string. Okay?

Matt Kennedy (29:04): I'm just going to steal that string part and use that with our other API call because we need to add the token. That's not in here, and we'll do that a little bit more easily via the terminal. So, let me just switch over to that window and we will finish this up. Okay. So, we have our URL. Now, remember we didn't use the path components from the documentation page because we grabbed the full URL from the JSON object that came back from the list databases DevOps API command. Okay?

Matt Kennedy (29:44): Now, what we had to add to that were basically the path components to the data that we're talking about. So we have, under keyspaces, ks1; under tables, nocql; and now we're going to be looking at the rows component to post a new row. And, we have the couple of headers that we need in there, we have the uuidgen again, and we have the token that we got earlier from authenticating. And now what we need to do is say, "Hey, our data payload has this JSON object in it, and it is going to set this one row to be a fact about Matt: that he needs a haircut. And, this was true as of the 2nd of October."
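Put together, the add-row call looks roughly like this; the "columns" payload shape follows the "Add rows" doc page and is an assumption to verify:

```bash
# Insert one row over REST using the values from the demo.
curl -i -X POST "$DATA_URL/v1/keyspaces/ks1/tables/nocql/rows" \
  -H "x-cassandra-request-id: $(uuidgen)" \
  -H "x-cassandra-token: $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "columns": [
      {"name": "name", "value": "Matt"},
      {"name": "fact", "value": "needs a haircut"},
      {"name": "when", "value": "2020-10-02"}
    ]
  }'
# Expect a 201 Created with success true and one row modified.
```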

Matt Kennedy (30:26): So, let's go ahead and run that command. Right? So, we got 201 created, success true, one row modified. That's great. Let's go in to our CQL shell and make sure that we have data there, and then we've essentially completed the process. So, let's switch screens. Okay. We're back at the Astra console and we're already logged into the CQL console. So, what I want to do now is select from our table. And remember we called that nocql, and there in fact is the row that we added via the REST API. So just to recap really quickly, we logged into Astra to create a service account. Once we created that service account, we then checked out the documentation and learned how to use the API which we then used to authenticate to the DevOps API. Once we had that token, we were able to create a database. Once the database was created, we could get a reference to its data API URL, then we authenticate to the data API URL using the username and password that we used when we created the database. That gives us a token.

Matt Kennedy (31:48): From there we take that token and we use it to create a table, and put data into that table. So, now we've come full circle. We have shown how to use a database barely using the Astra UI at all, and that was really only to get us through a one-time setup that we needed to do in order to activate this capability at all. So, now you can use Astra either from the command line or from the new and improved user interface. It's up to you. Depends on the use case and what you're trying to accomplish in the moment. Either should be capable of doing everything that you need to do with an Astra database.

Matt Kennedy (32:32): All right. Moving on to our final topic, we are going to talk about our sample app gallery. You can find that in the left-hand nav, for those who are seeing Astra for the first time. And when you get to the sample app gallery, we have a set of tiles that describe the various apps that you can deploy. And, there are different options for the different tiles. Some basically just navigate you to the GitHub repo, some will deploy directly into Gitpod, which is what we're going to see in this demo, and others you can actually deploy via Netlify. So, now let's take a look at launching the Spring demo via the app gallery.

Spring Demo

Matt Kennedy (33:21): For this demo, we are back at our Astra dashboard. We're going to be looking at the sample app gallery and how to use it to get up and running really quickly with a Spring demo application. So, we'll go in there. This demo will be significantly shorter, as it is much easier to follow along with than the prior demo using the REST API calls and all that good stuff. We have several examples here that you can click and get started with by accessing GitHub or Gitpod. So, we can open this in GitHub and go straight to the code if that's what we want to do. But we can also look at this code in a running IDE with a simple click. So, let's go ahead and open this in Gitpod.

Matt Kennedy (34:12): Now, Gitpod is a web-based development environment. It can take a minute to set up. So, we're going to pause our recording here while it pulls down these Docker images and we will be right back. So as you can see, we have a blue progress bar across the top that is about to complete so Gitpod will be with us in a minute here. Now, you can see that everything from that repo is loaded into Gitpod and we have a build going on here. There is actually another one. So, we have two separate executables that are going to be a part of this demonstration. So, we have the spring-petclinic-reactive which is the backend that actually does the connection to Astra, and then we have our Angular front end which speaks to the reactive backend. So, these are going to take another minute to build. I'm going to pause the recording again and we'll come back when these are ready for inputs.

Matt Kennedy (35:16): So, we are back and our builds are all finished. We have the reactive backend and the Angular front end. Now, you may find that you have a prompt sitting here asking you to confirm some information sharing. You do have to say yes or no to that, and then things will proceed. So, if you're running this demo yourself and you find that you're not getting to the step where the UI is running, that's probably why. Just check the output and make sure it's not waiting for a prompt. Similarly, for the reactive backend, you're going to get to a step where it is asking you for your service account credentials. And, I'll show you what that looks like in this setup.sh command. It's essentially saying, "Hey, I want you to paste in the service account credentials here, and I'm going to read those in and use them to configure everything." And, it's actually a pretty good script to look at if you want to see another example of how to use the DevOps API from a different language to get everything set up and configured.

Matt Kennedy (36:25): So, remember we find those service account credentials under our organization. So we will go back down here, and this is where we would copy those credentials to our clipboard. So you'll be prompted for those, just paste those in. It may also ask you for a database password. So, just be ready to enter those when the time comes. But for now everything's built and we are ready to see what our Angular user interface looks like. So, let's open this local host link. So, I'm going to command click that as it tells me to do. And, this is going to launch the Angular user interface for the PetClinic. And, all I'm going to do right now is kind of prove to ourselves that this is in fact backed by Astra, and then I will leave it to you to explore the source code, and the configuration, and everything on your own time so you can learn how we use Spring with Astra.

Matt Kennedy (37:25): So, let's look at veterinarians. We have some veterinarians preloaded. You can see we have this list: one, two, three, four, five, six, seven vets. Let's see if we can find them in our database. So, we're going to go. We only have one database, so that narrows it down. Let's go ahead and connect via the CQL console. And, I'm going to enter my username here. Those of you who have used Astra may be wondering why there is a significant pause between entering your username and the password prompt. The reason for that is we're doing something fairly clever here, where we are starting up the CQLSH container on demand. So, that container is not the thing that outputs the user prompt, but we pass that username to the container once it's up and running, and then it prompts for the password. It's a pretty cool little trick we've done to make the CQL console immediately interactive when you activate it, without keeping unneeded containers around wasting electricity.

Matt Kennedy (38:32): So, let's look for those vets. We know they're going to be in our keyspace. So the first thing we're going to do is say, "Use ks1." I use tab completion there just to have that spit out for me. And now I will say, "Describe keyspace ks1," and I'm going to look for a table that might have those vets listed. A good sign so far: I see that I have tables in here relating to the PetClinic. So here's ks1.petclinic_vet. Let's do a quick select * from ks1.petclinic_vet, and we'll see our set of veterinarians here. So, if we want to add one more veterinarian, we will have Dr. Vet. Dr. Vet, let's say, is a radiological dentist. Now, let's make him a surgeon as well. So, let's save that vet. And, we see that Dr. Vet is in here. And, if I run that same query I should see an additional vet in there. And, in fact I do. Here is Dr. Vet.

Matt Kennedy (39:56): So, we've just shown that we are running a Spring PetClinic app that has been customized for Astra. And, we've shown how to get started with that from our sample app gallery with just a click to load it into Gitpod, which gives you an environment completely based in the browser, you don't have to install anything, where you can explore this project and learn how you go about using Spring with Astra. All right. That concludes our demos for this webinar.

Matt Kennedy (40:30): So, just one quick note on that last demo. If you do try that a couple of times, something that can trip people up is having an existing environment in Gitpod. So go directly to gitpod.io and be sure to delete any environments that you aren't using anymore. That can cause a hiccup if you have a prior environment that was set up for a different database, for example. It will assume that you want to connect to the same database, and that's not always the case.

Q&A

Matt Kennedy (41:03): So anyway, that said, we are now going to dive into some of the questions you've submitted. If you haven't had a chance to submit questions yet, please feel free to do so now via the Q&A widget on the left side of your screen, and we'll try to get to as many of them as we can. So, we have a few that have come in already. So, let's go back to the topic of Kubernetes first. Chris, is there a way to test a configuration change or version upgrade without targeting the entire cluster?

Chris Bradford (41:39): Yeah. So, that's actually a really, really neat feature. And, it just had an update in the most recent version of cass-operator that we've released. It's called a canary upgrade. What you do is enable this parameter, and it will only push that configuration change, or a version change, to a single rack. So in the multi-rack scenario we showed previously, where there were three racks, this would just target two instances in the entire cluster, both of which are on the same rack. Then you can look at your monitoring and decide: this looks like a good change, or, you know what, maybe I want to go back and test this in a lower environment a little bit more, and choose whether you want to continue rolling it out or roll back. The most recent change allows you to set the number of pods or nodes that you would like to make that change on. So instead of targeting a whole rack, you could target a single node inside of a single rack if you so desire.
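A sketch of turning that on, assuming the canaryUpgrade and canaryUpgradeCount field names from cass-operator's CRD of this period:

```bash
# Flag the next version change as a canary: only canaryUpgradeCount nodes
# (here, one) receive the new serverVersion before you decide to continue
# or roll back. Field names are assumptions to verify for your operator version.
kubectl patch cassandradatacenter dc1 --type merge \
  -p '{"spec": {"canaryUpgrade": true, "canaryUpgradeCount": 1, "serverVersion": "3.11.7"}}'
```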

Matt Kennedy (42:36): Cool. Another question is, is there a way to scale cluster compute down to zero with the operator?

Chris Bradford (42:48): Yeah. Yeah. So, in the CassandraDatacenter CRD, the custom resource, there is a parked, or stopped, parameter, and you can set that to true or false. What it does is scale the number of running containers all the way down to zero, but it keeps those volumes around. So, if you're using GKE, let's say you have a PD-SSD volume for each pod; it will not delete those volumes. But you can scale your cluster size down when it's not in use. And, this is great for dev and test environments. To undo this behavior, you just set the value back to false, and it brings all those resources back up with the disks attached.
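A sketch of parking and resuming a cluster, assuming the spec.stopped field name from the CassandraDatacenter CRD:

```bash
# Park the cluster: running containers scale to zero, but the persistent
# volumes (and your data) are kept.
kubectl patch cassandradatacenter dc1 --type merge -p '{"spec": {"stopped": true}}'

# Resume: pods come back up with the original disks attached.
kubectl patch cassandradatacenter dc1 --type merge -p '{"spec": {"stopped": false}}'
```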

Matt Kennedy (43:31): Awesome. So, very much how we are implementing the park functionality in Astra under the hood.

Chris Bradford (43:40): Yeah. It directly relates to that. Exactly.

Matt Kennedy (43:46): Cool. What configuration files are supported and how do additional files get supported?

Chris Bradford (43:53): So, this is a really interesting piece of technology. DataStax has a product called OpsCenter, which has a component called Lifecycle Manager, which would handle deploying DataStax Enterprise onto your instances over SSH. And we said, "Hmm, this takes a JSON config and turns it into configuration files on disk. We need this for the operator. We need to take JSON, or in our case YAML, and turn it into rendered files on disk." So, we actually extracted that piece of code, open sourced it, and it operates off a collection of definition files. There's a GitHub repo, datastax/cass-config-definitions. And inside of that repo, you'll see a collection of version numbers, configuration files, and templates for the configuration files, as well as definitions for the parameters that are allowed to be overridden.

Chris Bradford (44:48): We are very verbose in that and if we think it's something that will get used, there's already a parameter created for it. So, you can use that as a reference. We're also building out the documentation so you can reference that in a little bit cleaner format as well.

Matt Kennedy (45:05): Gotcha. So, we do have a question: any recommendations on tuning or optimization for a three-node cluster? Do you have anything to say there?

Chris Bradford (45:21): Yeah. So, that's an interesting question. And, man, I would love to have a very specific answer for you. Unfortunately, when you go to tune your cluster, you're either tuning to your data model or to your hardware. If you're in Astra, that's probably the data model. But there are a number of ways that you can do your tuning, and whether you're tuning for minimal latency or maximum operations per second, those answers might be different. I don't have a one-size-fits-all answer, but I do recommend you reach out to us, either through the help system inside of Astra or through the DataStax community site, where you can ask these kinds of questions and get much more specific answers than I can give you right now, this second.

Matt Kennedy (46:08): I would say, "Don't necessarily jump to tuning." The first thing to do is make sure everything is actually running correctly. I'd leave the defaults where they are and just make sure that, especially things like NTP time synchronization. Make sure that's running, right? Not having that correct can cause some unexpected behavior, and that's one of the first things I'd check. Yeah. So, kind of my recommendation there is live with the defaults for as much as you can and have a really specific reason to tune if you do go down that road.

Matt Kennedy (46:51): All right. We had another question. I believe this question was in regards to using the DevOps API. The question is, how is this going to work if we are under VPC Service Controls? Remember that the Astra database is going to be running in its own VPC, so you are essentially passing commands to a command processor that lives in an independent VPC. So yes, we do allow you to peer your application VPCs to the Astra VPCs, but that Astra VPC is still going to remain independent of your application VPC.

Matt Kennedy (47:38): And then we have a question. Is there a Terraform provider or similar available to enable infrastructure as code or GitOps for the Astra resources managed through the DevOps API? If not, is there something planned or being worked on? And, the answer there is yes. There is something planned and being worked on. Chris would be the guy to answer that question, but I don't know if that work is something that is ready to be talked about yet. Chris?

Chris Bradford (48:18): So for what it's worth, Terraform is extremely popular, right? And, you can look at the number of plugins that are available. It would be foolish to ignore it as a platform that we want to support or provide tooling with. So look for something in the future, but I can't give you the timelines right now.

Conclusion

Matt Kennedy (48:39): Yeah, fair enough. But great question. And then we have: is there any caching layer in Cassandra with auto invalidation upon updates? The answer there is yes, there are a couple of cache options that are configurable. That is a fairly complex topic in its own right, so we will make sure to reach out directly to the asker of that question in a follow-up over the next couple of days. With that said, we are out of time for questions at the moment, and I want to thank you all for joining us today. We'd like to invite you to future DataStax events at datastax.com/webinars, and to sign up at datastax.com for more information on Astra. Lastly, don't forget to check out our upcoming webinar, Everything You Need to Know About Storage-Attached Indexing in Apache Cassandra. Thank you very much.

 

Chris Bradford (49:47): Thank you.
