Company · March 31, 2020

Leading with Code for a Better Apache Cassandra and Kubernetes

As part of our ongoing commitment to open source and the Apache Cassandra® community, DataStax is open-sourcing its version of a Kubernetes Operator for Cassandra. The code is freely available under an Apache License for anyone to use or modify. DataStax isn’t the first organization to create an open-source Kubernetes operator for Cassandra, and that is precisely the point of what we are trying to accomplish by releasing this code.

As infrastructure management standardizes around Kubernetes, many organizations are looking at the data plane as something that should be managed under the same umbrella. DataStax came to the same conclusion while building Astra, a database-as-a-service (DBaaS) built on Cassandra. Earlier this year, we started talking to the community about combining efforts and rallying around using Kubernetes to manage Cassandra. The response was immediate and overwhelmingly positive. Organizations like Sky, Orange, and Instaclustr all agreed that it was time for the community to get behind this effort and help make Cassandra the default database for Kubernetes. The first step was to release our code for all to see. We want to lead with code and not just opinions. Those of you who work with open-source projects know there will be plenty of opinions as we go along.

Just as every hero has a sidekick who seems to do most of the work, we are also releasing a Management API sidecar for Cassandra that brings Cassandra closer to how Kubernetes wants to work. If you have managed Cassandra clusters in the past, you know that management is done either from the command line or through Java Management Extensions (JMX). Neither is how we manage cloud-native infrastructure, and both can be a serious impediment to operators.
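To make the contrast with the command line and JMX concrete, here is a minimal Python sketch of the kind of HTTP health check that Kubernetes-style tooling expects a sidecar to answer. The probe path, the probe names, and the helper functions are assumptions made for illustration; check the Management API documentation for the actual routes before relying on them.

```python
import urllib.error
import urllib.request

# Assumed route layout for health probes; verify against the Management API docs.
PROBE_PATH = "/api/v0/probes/{name}"


def probe_url(base_url: str, name: str) -> str:
    """Build the URL for a named health probe (e.g. "liveness" or "readiness")."""
    return base_url.rstrip("/") + PROBE_PATH.format(name=name)


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500 or 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def is_healthy(base_url: str, name: str, get_status=http_status) -> bool:
    """Treat a probe as passing only when the sidecar answers with HTTP 200."""
    return get_status(probe_url(base_url, name)) == 200
```

Calling `is_healthy("http://localhost:8080", "readiness")` would then report whether a sidecar listening on that (assumed) address considers its node ready. The point of the sketch is the shape of the interaction: a plain HTTP request with a status code, which is what Kubernetes probes and controllers already know how to consume.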

In the same spirit of creating a path to a single community choice, we have not only opened the repository and code under an Apache License, but we have also started the conversation about donating components to the Apache Cassandra project. A sidecar effort is already underway, and we are offering feature code to help it ship as soon as possible. The contributors from Apple and Netflix already lend incredible expertise on how to run Cassandra at scale. DataStax is happy to contribute our knowledge-as-code to better enable Cassandra operators everywhere.

 

Where to find the code and get started? 

All code is available on GitHub. The README.md files have quickstarts and usage walkthroughs.

Cassandra Operator: https://github.com/datastax/cass-operator

Management API: https://github.com/datastax/management-api-for-apache-cassandra

 

More in-depth documentation for the operator can be found here. You will find details on all the options available in the YAML files and how to create custom configurations. The Management API is documented here, where you can find a complete list of all API calls and their parameters.

We have a really cool demo if you want to see it in action. Even if you aren’t familiar with Kubernetes, you’ll see how the future of Cassandra management could be a lot better.

 

What’s next

Now that we have reached the milestone of releasing our version of the code, the next phase belongs to the community. Expect to see some activity on the Apache Cassandra developer mailing list soon, organizing participation in a community-driven operator. Even if you haven’t built an operator or management sidecar for Cassandra, we could use as much diverse experience as possible.

If your job is deploying and running Cassandra, your input will be incredibly valuable. Consider joining one of the many channels where we discuss the future of the Apache Cassandra project, and help us build an amazing future with Cassandra and Kubernetes.
