Can I use Apache Cassandra with AWS?
Yes, you can use Apache Cassandra on AWS. Cassandra is available on AWS as a fully managed service through Astra DB, or self-managed via AWS Quick Start.
Apache Cassandra® is a leading NoSQL database, enabling developers to build massively scalable, geo-distributed data applications with zero downtime. Cassandra is the database of choice for the most demanding applications on the internet, used by Netflix, Uber, Pinterest, and thousands of the world’s leading engineering teams.
This guide will help you understand the best managed and self-managed ways to run Cassandra on Amazon Web Services (AWS).
The fastest way to use Cassandra on AWS is with Astra DB, a database-as-a-service built on Cassandra, Kubernetes, Prometheus, Envoy, and other cutting-edge open source technologies. Astra DB simplifies cloud-native application development and requires no operations or self-management. It reduces deployment time from weeks to minutes, delivering a combination of serverless autoscaling, pay-as-you-go pricing, and an open source skillset you can take with you to any cloud provider.
How does Astra DB make running on AWS easy?
Some IT organizations require complete control over their systems, or are already set up for self-managed software. With self-managed virtual machines you have that control. This control comes with all the associated effort and expense, and is a tradeoff that should be considered carefully.
K8ssandra is a cloud-native distribution of Apache Cassandra® that runs on Kubernetes, including AWS EKS. K8ssandra bundles an ecosystem of tools that provide richer data APIs and automated operations alongside Cassandra. This includes metrics monitoring to promote observability, data anti-entropy services to support reliability, and backup/restore tools to support high availability and disaster recovery. As part of K8ssandra’s installation process, all of these components are installed and wired together, freeing you from the tedious plumbing of components.
DataStax Astra DB, which is a cloud-native DBaaS (database as a service) powered by Apache Cassandra, and managed by DataStax.
Deploy K8ssandra.io to Amazon Elastic Kubernetes Service (EKS) via one of our convenient Helm charts.
Download, install, configure, and operate your own open source Cassandra cluster on AWS EC2 virtual machines, or use an AMI you trust and understand.
Consolidated billing that counts toward existing AWS Enterprise Discount Program (EDP) commitments.
Astra DB works smoothly with your serverless functions. Your Astra database automatically comes with data access APIs, making integration with your AWS Lambda functions straightforward. Get the full value of autoscaling by pairing autoscaling functions with an autoscaling database.
Works with your applications deployed on EC2 in any language, via traditional language drivers or the APIs mentioned above.
Connect apps in your VPC to Astra DB via PrivateLink, in the console or via API. No more databases exposed on public networks.
Deploy to any of our available AWS regions in the United States, Europe, or Asia.
Failures handled gracefully by a Kubernetes (K8s) operator to keep databases healthy.
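As an illustration of the Lambda integration mentioned above, here is a minimal Python sketch that reads one row over HTTP using only the standard library. The URL shape, the `X-Cassandra-Token` header, the environment variable names, and the `users` table are assumptions made for this example, so check your own database's connection details before relying on them.

```python
import os
import urllib.parse
import urllib.request


def build_row_request(db_id, region, keyspace, table, primary_key, token):
    """Build (but do not send) a GET request for one row by primary key."""
    # Assumed URL shape for the Astra DB REST API; confirm the exact
    # endpoint in your database's connection details.
    key = urllib.parse.quote(str(primary_key), safe="")
    url = (
        f"https://{db_id}-{region}.apps.astra.datastax.com"
        f"/api/rest/v2/keyspaces/{keyspace}/{table}/{key}"
    )
    return urllib.request.Request(
        url,
        headers={"X-Cassandra-Token": token, "Accept": "application/json"},
        method="GET",
    )


def handler(event, context):
    """AWS Lambda entry point: look up the row whose key is event['id']."""
    request = build_row_request(
        os.environ["ASTRA_DB_ID"],           # assumed variable names
        os.environ["ASTRA_DB_REGION"],
        os.environ["ASTRA_DB_KEYSPACE"],
        "users",                             # hypothetical table name
        event["id"],
        os.environ["ASTRA_DB_APPLICATION_TOKEN"],
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
        return {"statusCode": response.status, "body": body}
```

Keeping the request construction in its own function means it can be checked locally without network access or credentials.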
Simply register with GitHub, a Google ID, or email and get 80 GB of storage and up to 20M read/write operations free every month. No credit card is required for the free plan.
Create an account, log in to Astra DB, create a database, choose AWS as the cloud provider, pick a region, and you are done!
Astra DB is easy to navigate and has a CQL Console where you can run CQL queries without having to install any extra software on your computer.
Check out our videos and documentation if you're just getting started. The playlist has a wide range of short tutorials.
DataStax has created a wide variety of sample apps to help you get things done faster.
For development, you can use an AMI (Amazon Machine Image) with Cassandra already installed. Prebuilt AMIs are available from a variety of providers including AWS, Bitnami and others.
For most test, staging and production environments, consider an AMI you trust and understand. You may need to create an AMI from scratch for both security and runtime performance reasons.
Consider the staff and skills you may need to acquire or build. In your planning you should also account for time the team will require for ongoing configuration and maintenance of the database.
Download, install, and configure the Apache Cassandra® open source database.
Build a virtual machine that satisfies only the dependencies you need to run Cassandra and nothing more.
Configure AWS VPCs and other networking so that the cluster nodes can communicate and the cluster remains secure.
After installing Cassandra on AWS, the VM, the operating system inside the VM, and the database itself all need to be managed and kept up to date with security patches and software updates.
Database management includes, but is not limited to, scaling the database according to the traffic, backup/restore, DR planning, capacity planning, and repair (anti-entropy).
Continuously monitor for failed operations and work on optimizing the configuration.
Keep pace with changes to AWS EC2 and related AWS services, updating your configuration accordingly to maintain effective and efficient performance.
Administer the security policy of AWS infrastructure.
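For the networking step above, it can help to write down which ports a self-managed cluster needs before creating security group rules. The sketch below lists Cassandra's default ports and expands them into ingress rules. The `PortRule` type, the source labels, and the CIDR ranges in the usage note are hypothetical names chosen for this example, and your ports may differ if you changed the defaults in `cassandra.yaml`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    port: int
    purpose: str
    source: str  # who should reach this port: "cluster", "clients", or "admin"


# Cassandra's default ports.
CASSANDRA_PORT_RULES = [
    PortRule(7000, "inter-node communication", "cluster"),
    PortRule(7001, "inter-node communication over TLS", "cluster"),
    PortRule(9042, "CQL native transport (client drivers, cqlsh)", "clients"),
    PortRule(7199, "JMX (nodetool, monitoring)", "admin"),
]


def ingress_rules(cidr_by_source):
    """Expand the port table into (port, cidr, purpose) tuples.

    Sources that are absent from cidr_by_source get no rule, so a port
    stays closed unless a CIDR range is listed for it explicitly.
    """
    rules = []
    for rule in CASSANDRA_PORT_RULES:
        for cidr in cidr_by_source.get(rule.source, []):
            rules.append((rule.port, cidr, rule.purpose))
    return rules
```

For example, `ingress_rules({"cluster": ["10.0.1.0/24"], "clients": ["10.0.2.0/24"]})` yields rules for ports 7000, 7001, and 9042 only, leaving JMX closed until an admin range is added.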
Install and Configure Terraform
Install and Configure AWS CLI v2
Install and Configure kubectl
Install and Configure Helm
Install and Configure Python v3
Configure environment variables
Provision Infrastructure
Validate Kubernetes cluster connectivity
Install K8ssandra
Deploy K8ssandra with Helm
Retrieve super user credentials
Learn more about setup for AWS EKS in the K8ssandra documentation.
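Before starting the EKS steps above, a quick preflight check can confirm that the required command-line tools are installed. This is a minimal sketch that assumes the tools are invoked as `terraform`, `aws`, `kubectl`, `helm`, and `python3`. It only checks that each executable is on your `PATH`; it does not verify versions or configuration.

```python
import shutil

# Executable names assumed for the tools listed in the steps above.
REQUIRED_TOOLS = ["terraform", "aws", "kubectl", "helm", "python3"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` returns an empty list when everything is installed. The `which` parameter exists so the lookup can be replaced in tests.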
The answer depends on a host of factors, including your requirements, your existing investments, and your staff and their skills.
In general, we recommend Astra DB for the vast majority of Cassandra use cases. You can be ready to go in minutes, freed from operational, security, and scalability concerns.
All but the most demanding, security-conscious applications will be served by environments like Astra DB that are already compliant with common security standards, saving months or even years of effort, to say nothing of expense.
Startups and enterprises alike who do not want to, or cannot, dive deep into database administration and configuration, should opt for Astra DB.
Self-managing databases on Kubernetes is less efficient than DBaaS, but may be driven by preexisting organizational proficiency with Kubernetes. Managed Kubernetes services like AWS EKS, combined with K8ssandra, not only make running system-of-engagement databases on Kubernetes possible, but can significantly ease the burden on SRE/Ops teams.
Self-managing on IaaS is the least efficient option relative to DBaaS, but may be driven by a need to self-manage for regulatory reasons or the need to interoperate with proprietary or custom systems. Alternatively, self-managed IaaS may suit the nature of an existing application that is being migrated to the cloud. Your application may simply not require, or be ready for, a cloud-native architecture.
To deploy Cassandra on AWS, you can either:
Once you have deployed your Cassandra cluster on AWS, either by using Astra DB or by creating a self-managed cluster, use the cluster’s connection details to access it from the command line or through a Cassandra driver in your language of choice.
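As a rough illustration of the driver route, the following sketch uses the open source Python driver (`pip install cassandra-driver`). It assumes an Astra DB database is reached through a downloaded secure connect bundle with an application token, and a self-managed cluster through node addresses on the default CQL port. The `cluster_options` and `connect` helper names, the mode strings, and the file and variable names in the usage note are hypothetical.

```python
def cluster_options(mode, *, bundle_path=None, contact_points=None, port=9042):
    """Return keyword arguments for cassandra.cluster.Cluster (no network access)."""
    if mode == "astra":
        if not bundle_path:
            raise ValueError("Astra DB needs the path to a secure connect bundle")
        return {"cloud": {"secure_connect_bundle": bundle_path}}
    if mode == "self_managed":
        if not contact_points:
            raise ValueError("a self-managed cluster needs at least one contact point")
        return {"contact_points": list(contact_points), "port": port}
    raise ValueError(f"unknown mode: {mode!r}")


def connect(mode, username, password, **kwargs):
    """Open a session; requires the cassandra-driver package and a reachable cluster."""
    # Imported here so cluster_options() works without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    auth = PlainTextAuthProvider(username=username, password=password)
    cluster = Cluster(auth_provider=auth, **cluster_options(mode, **kwargs))
    return cluster.connect()
```

With Astra DB, a typical call would be `connect("astra", "token", os.environ["ASTRA_DB_APPLICATION_TOKEN"], bundle_path="secure-connect-mydb.zip")`, where the literal username `token` is paired with an application token. For a self-managed cluster, pass `contact_points=["10.0.1.10"]` along with your Cassandra username and password. A query such as `session.execute("SELECT release_version FROM system.local").one()` is a simple way to confirm the connection works.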
Astra DB has a free tier of $25 in credits each month, giving developers up to 80 gigabytes of free storage or up to 20 million reads/writes each month. Astra DB is serverless, so you are only billed for what you use. If you’re managing your own cluster, AWS pricing applies to the resources it uses.
Astra DB is a fully managed, serverless, multi-cloud database as a service powered by Apache Cassandra®.
Yes, Astra DB is available on AWS Marketplace. There are no minimums and no upfront commitment required; your Astra DB cost will be billed to your AWS account.
Scale database resources in and out on demand to match application requirements and traffic so that you pay only for what you use. Put the power of Cassandra in the hands of every developer without ever worrying about managing the infrastructure.
Data replication across multiple data centers, availability zones, and regions. Scale up to petabytes of data without impacting performance. The Astra service is resilient and highly available to minimize both downtime and the need for site-reliability engineering.
All data is encrypted at rest and in motion. Sophisticated authentication and authorization with role-based access. Client connections use two-way certificate validation for VPN-level security from client to database. Private connectivity options like VPC peering upon request. JSON Web Token (JWT) based authentication to ensure secure connectivity to your Astra DB database.
Fully managed database and OS updates and upgrades. IaaS (Infrastructure-as-a-Service) failures handled gracefully by a K8s operator to keep databases healthy. Eliminate anti-entropy repair procedures. Autoscaling eliminates manual configuration changes and guesswork on database sizing.