Migration and Upgrade Options for Cassandra Environments

With the dramatic increase and change in data usage over the past year, it is crucial to make sure your Cassandra environments are poised for the future. The good news: there has never been a better time to take advantage of new developments in the Cassandra universe. While other upgrade options exist, this article covers three that many enterprises are evaluating and starting to use.

Option #1: Migrate to DataStax Astra, a leading Cassandra DBaaS

Astra DB is a fully managed, multi-cloud, serverless DBaaS that scales up and down dynamically with pay-as-you-go pricing. Astra DB eliminates operational overhead, delivering total cost of ownership (TCO) savings of up to 3-5 times compared with self-managed Cassandra clusters, according to a recent study by GigaOm. Migrating to Astra is also the last time you will have to upgrade your Cassandra cluster yourself.

When migrating to DataStax Astra, you can start by opening a free Astra account. No credit card is required. When you are ready to begin the migration, experienced Cassandra professionals are available to help guide you through the process.

Path A: Maintenance window migration

If the application can be taken offline for a time, transferring data to Astra is a simple process using either a DSBulk-based utility or a Spark-based migrator. The DSBulk-based utility orchestrates DSBulk to migrate all desired tables efficiently, with a configurable level of concurrency. The Spark-based migrator benefits from Spark's parallelization capabilities and is ideal when the data being migrated needs to be filtered through custom logic based on your requirements.
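As a rough sketch of what a DSBulk-based approach can look like, the Python below builds one `dsbulk unload` and one `dsbulk load` command per table and processes tables with a configurable number of workers. The keyspace, table names, host, paths, and the helpers `migrate_table` and `migrate_all` are hypothetical, and the actual DataStax utility may work differently. With `dry_run=True` (the default), nothing is executed and only the planned commands are returned.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def unload_cmd(keyspace, table, origin_host, export_dir):
    # Export one table from the origin cluster to CSV files.
    return ["dsbulk", "unload", "-h", origin_host, "-k", keyspace, "-t", table,
            "-url", f"{export_dir}/{table}"]


def load_cmd(keyspace, table, bundle, token, export_dir):
    # Load the exported CSV files into Astra DB through the secure connect bundle.
    return ["dsbulk", "load", "-b", bundle, "-u", "token", "-p", token,
            "-k", keyspace, "-t", table, "-url", f"{export_dir}/{table}"]


def migrate_table(keyspace, table, origin_host, bundle, token, export_dir,
                  dry_run=True):
    # Hypothetical helper: unload first, then load, for a single table.
    cmds = [unload_cmd(keyspace, table, origin_host, export_dir),
            load_cmd(keyspace, table, bundle, token, export_dir)]
    if not dry_run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)
    return cmds


def migrate_all(tables, concurrency=2, **kwargs):
    # Process several tables at once; `concurrency` caps the parallel workers.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {t: pool.submit(migrate_table, table=t, **kwargs) for t in tables}
        return {t: f.result() for t, f in futures.items()}
```

Calling `migrate_all(["owners", "pets"], concurrency=2, keyspace="clinic", ...)` in dry-run mode is a cheap way to review the plan before a maintenance window.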

Path B: Zero-downtime live migration

The Zero-downtime Live Migration tool changes the migration process radically. It helps enterprises migrate their Cassandra applications to Astra with zero application downtime and only minimal, non-invasive configuration changes to client applications.

The heart of this zero-downtime migration is a cloud Proxy, which eliminates the need for application code changes and enables applications to continue working without interruption while historical data is migrated separately. While the Proxy is in action, the existing data in the origin cluster is migrated to Astra by using either a DSBulk-based utility or a Spark-based migrator as mentioned above.
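To make the idea concrete, here is a toy in-memory model of such a proxy. It assumes the proxy forwards every write to both clusters and serves reads from the origin until cutover. That behavior, and the names `MigrationProxy`, `backfill`, and `cut_over`, are illustrative assumptions, not a description of how the actual tool is implemented. Plain dictionaries stand in for the two clusters.

```python
class MigrationProxy:
    """Toy stand-in for a migration proxy; dicts play the role of clusters."""

    def __init__(self, origin, target):
        self.origin = origin
        self.target = target
        self.read_from_target = False

    def write(self, key, value):
        # Assumption: live writes go to both clusters so they stay in sync.
        self.origin[key] = value
        self.target[key] = value

    def read(self, key):
        # Reads come from the origin until the operator cuts over.
        store = self.target if self.read_from_target else self.origin
        return store.get(key)

    def cut_over(self):
        self.read_from_target = True


def backfill(origin, target):
    # Copy historical rows the target does not have yet. Rows already written
    # through the proxy are left alone so newer live data is not overwritten.
    for key, value in list(origin.items()):
        target.setdefault(key, value)
```

In this model the application only ever talks to the proxy, which is why the historical copy can run separately in the background.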

For more details, see: Four Steps to Migrate Live Data from Apache Cassandra to Astra with Zero Downtime

Option #2: Upgrade to Cassandra 4.0

Apache Cassandra 4.0 scales operations faster, introduces new auditing capabilities, strengthens security, improves incremental repair, gives additional visibility through virtual tables, adds Java 11 support, and more. For details, read the article “It’s time to upgrade to Cassandra 4.0” or watch the video “Cassandra 4.0 is here: Why you should upgrade today” by Patrick McFadin.

The upgrade to Cassandra 4.0, while well worth it, can follow different paths depending on your current Cassandra environment.

Option #3: Migrate from SQL to NoSQL (Astra DBaaS or Cassandra)

It’s easy to see why many companies that conduct comprehensive comparisons of SQL and NoSQL decide to move away from their long-standing relational database approach. NoSQL is a good fit for cloud environments, big data, and real-time analytics, as well as modern use cases and applications that require a variety of data types, always-on availability, cost-effective horizontal scalability, and distributed low latency.

On the surface, making the move from RDBMS to NoSQL can seem like a headache. After all, there’s a lot to consider, such as the very different data models and the need to denormalize the data when moving to NoSQL. But it’s really not as difficult as it can seem.

More good news. We have two ready-made paths for SQL to NoSQL migration. One will help you move to Astra DBaaS and the other will lead you to Cassandra.

Path A: Migrating from SQL to Astra DBaaS

When describing Option #1 above, we covered many of Astra DB’s advantages. What if you decide to cut operational burdens out of your life with this fully managed DBaaS, but your starting point is a SQL database? To help, we put together a simple four-step process.

We also created an easy-to-follow tutorial to give you hands-on experience with the process, as you move a fictional pet clinic’s data from a SQL database to Astra DB. You can set up a free Astra account (no payment or credit card required) to try it out. It’s actually better than free: we’ll give you a $25 (USD) monthly credit.

We’ll quickly review the four steps here, but make sure to try out the tutorial for a real-world test drive and much more detail.

Step #1: Create your Astra DB instance

Just like with Option #1, you’ll need an Astra account. As we’ve discussed, it’s fast, easy, and free to get started.

Your next step is to sign in to your new account and select the pay-as-you-go plan. That way, you are only charged for what you use (and you never have to go over the $25/month free credit). From there, you can easily set up your first database in Astra.

Step #2: Create your data model

With your database in place, it’s time to build your data model. With data normalization no longer a concern, you can build tables with all the information you’ll need for your expected queries in one place—making access efficient and fast.

Navigate to the Cassandra Query Language (CQL) Console and log in to the database. Before you create tables, you need to establish and describe keyspaces, the containers for your application data. Keyspaces group tables (historically called column families) together. Once all the keyspaces in the database are described, you can create tables by executing CQL statements. See the tutorial for CQL statements you can copy and paste to try this out.
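As an illustration of the kind of statement involved, the small Python helper below assembles a `CREATE TABLE` statement that you could paste into the CQL Console. The keyspace `clinic`, the table `pets_by_owner`, and its columns are hypothetical examples and are not taken from the tutorial.

```python
def create_table_cql(keyspace, table, columns, partition_key, clustering=()):
    # columns maps column name -> CQL type, in the order they should appear.
    cols = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    # The partition key is wrapped in its own parentheses, followed by any
    # clustering columns that order rows inside a partition.
    key = ", ".join([f"({', '.join(partition_key)})", *clustering])
    return (f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} "
            f"({cols}, PRIMARY KEY ({key}));")


# Hypothetical table designed for one query: "all pets for a given owner".
statement = create_table_cql(
    keyspace="clinic",
    table="pets_by_owner",
    columns={"owner_id": "uuid", "pet_id": "uuid",
             "owner_name": "text", "pet_name": "text"},
    partition_key=("owner_id",),
    clustering=("pet_id",),
)
print(statement)
```

Note that the owner's name is stored alongside each pet, so the expected query is answered from a single table without a join.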

Step #3: Generate your Astra application token

Generate an application token so that applications and tools can connect securely to your Astra DB database. After completing this step, you can use the application token with any application or tool that talks to your database. You’ll also need it in the next step to authenticate with DSBulk.
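For example, an application using the DataStax Python driver might connect with the token roughly as follows. This is a minimal sketch: it assumes the `cassandra-driver` package is installed and that you have downloaded the secure connect bundle discussed in Step #4. The helper names `astra_credentials` and `connect` are made up for illustration.

```python
def astra_credentials(token):
    # With token authentication, the user name is the literal string "token"
    # and the password is the application token itself.
    return {"username": "token", "password": token}


def connect(bundle_path, token, keyspace=None):
    # Imported here so this module can be loaded without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    auth = PlainTextAuthProvider(**astra_credentials(token))
    cluster = Cluster(cloud={"secure_connect_bundle": bundle_path},
                      auth_provider=auth)
    return cluster.connect(keyspace)
```

Treat the token like a password: load it from an environment variable or a secrets manager instead of hard-coding it.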

Step #4: Load data into Astra DB with DSBulk

Now, we can start thinking about loading SQL data into Astra DB. First, you’ll need a secure connect bundle so that an external app can talk to your Astra DB instance. The bundle contains information about where your cluster is located in the cloud and how to connect to it securely.

Make sure you have DSBulk installed. It uses the secure connect bundle to connect to your Astra DB database and loads the data from your relational database. The source data takes the form of CSV files exported from the SQL database, and a DSBulk command loads those files into a table in Astra DB. You can then use the CQL Console in the Astra DB UI to view the data. See this step in the tutorial for much more detail.
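The sketch below shows one way the two halves of this step could fit together: exporting a SQL query to a CSV file, then assembling the `dsbulk load` command that reads it. SQLite stands in for your relational database, and the query, table names, and file paths are hypothetical. The command is only built here, not executed.

```python
import csv
import sqlite3

# Hypothetical denormalizing query: one output row per pet, with owner details.
PETS_BY_OWNER_QUERY = """
    SELECT o.id AS owner_id, p.id AS pet_id,
           o.name AS owner_name, p.name AS pet_name
    FROM pets p JOIN owners o ON o.id = p.owner_id
    ORDER BY o.id, p.id
"""


def export_query_to_csv(conn, query, csv_path):
    # Write the query result to CSV with a header row naming each column.
    cursor = conn.execute(query)
    header = [col[0] for col in cursor.description]
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(cursor)
    return header


def dsbulk_load_cmd(csv_path, keyspace, table, bundle, token):
    # -b points at the secure connect bundle; the user name is "token".
    return ["dsbulk", "load", "-url", csv_path, "-k", keyspace, "-t", table,
            "-b", bundle, "-u", "token", "-p", token, "-header", "true"]
```

Because the CSV header matches the column names of the target table, DSBulk can map fields to columns by name without an explicit mapping.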

Path B: Migrating from SQL to Cassandra 4.0

If you decide to migrate your data from SQL to Cassandra 4.0, you’ll want to prioritize the use cases that the move can help most. What are your biggest performance and scalability issues? Starting with the most impactful use cases and migrating piece by piece, instead of taking an all-at-once approach, will reduce the risk of something going horribly wrong. Thinking through your top use cases will also help you better understand the scope of the migration.

Similar to our guidance for SQL to Astra DBaaS migration above, we’ve outlined a straightforward four-step process to demystify moving from SQL to Cassandra.

Step #1: Adapting your data model

Setting up solid Cassandra data models will provide the foundation for a successful migration. With tables, rows, columns, and CQL, Cassandra has some similarities to SQL databases. But make no mistake, Cassandra is NoSQL through and through and has significant differences that should affect how you set up your data models. As mentioned above, unlike relational databases, Cassandra uses denormalized data, and you can design tables that will quickly yield results to future queries. A denormalized system often calls for one table to support one query, and that's okay.
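To illustrate the "one table per query" idea, the sketch below takes normalized owner and pet records and produces the rows for two query-specific tables. Plain dictionaries model the tables, keyed by what would be the partition key. The record shapes and the table names `pets_by_owner` and `pets_by_species` are hypothetical.

```python
from collections import defaultdict


def build_query_tables(owners, pets):
    # owners: {owner_id: owner_name}; pets: list of normalized pet records.
    pets_by_owner = defaultdict(list)    # answers "all pets for an owner"
    pets_by_species = defaultdict(list)  # answers "all pets of a species"
    for pet in pets:
        # The owner's name is copied into every row, so no join is needed.
        row = {"pet_name": pet["name"],
               "species": pet["species"],
               "owner_name": owners[pet["owner_id"]]}
        pets_by_owner[pet["owner_id"]].append(row)
        pets_by_species[pet["species"]].append(row)
    return dict(pets_by_owner), dict(pets_by_species)
```

The same data is deliberately written twice. That costs storage and extra writes, but each query is then served from a single partition.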

Before going live with your new data models, make sure to fully test them. DataStax has developed and open-sourced a simple and powerful performance-testing tool called NoSQLBench for just that purpose. With it, you can put a load on a target cluster to test write and read performance, sizing, data model design, and deployment.

Learn more about NoSQLBench, or download it and get started.

Step #2: Adapting your application

Next, you’ll want to update your application code so it can write to and read from your newly designed Cassandra tables. We’ve developed drivers in the most popular languages to make it easy for you to connect to your Cassandra clusters.

Step #3: Planning your deployment

You’re getting close now, but next take the time to develop a thorough plan for your Cassandra cluster. This is time well spent. Careful and thoughtful planning now will save you time in the future. Consider both the initial deployment and growth scenarios in your planning.
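As a back-of-the-envelope starting point for the growth part of that plan, the sketch below estimates a node count from storage needs alone. Every input here (replication factor, yearly growth rate, usable capacity per node) is an assumption you would replace with your own figures, and a real plan also has to account for throughput, latency targets, and compaction headroom.

```python
import math


def estimate_nodes(data_gb, replication_factor, usable_gb_per_node,
                   yearly_growth=0.0, years=0):
    # Project the raw data size forward, then multiply by the number of replicas.
    projected_gb = data_gb * (1 + yearly_growth) ** years
    total_gb = projected_gb * replication_factor
    nodes = math.ceil(total_gb / usable_gb_per_node)
    # A cluster needs at least as many nodes as replicas.
    return max(nodes, replication_factor)


# Hypothetical figures: 2 TB today, RF 3, 1 TB usable per node.
initial = estimate_nodes(2000, 3, 1000)
in_two_years = estimate_nodes(2000, 3, 1000, yearly_growth=0.5, years=2)
print(initial, in_two_years)
```

Comparing the initial and projected numbers gives a first sense of how much expansion to budget for.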

Step #4: Moving your data

It’s time to deploy your updated application to production. You just need to make sure any data still in your SQL database is moved into Cassandra. The DataStax Bulk Loader for Apache Cassandra® is a fast, easy-to-use command line utility that’s a great choice for this step. If your migration requires zero downtime, consider using Apache Kafka®, the Kafka Connect framework, or the DataStax Kafka Connector. Finally, take a look at Apache Spark to validate that the source SQL records were correctly moved to your Cassandra cluster.
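The validation logic itself is simple, whatever engine runs it. The sketch below uses plain Python rather than Spark to show the comparison on small in-memory row sets. At production scale you would express the same comparison as a distributed job. The row format and the `diff_rows` helper are hypothetical.

```python
def diff_rows(source_rows, target_rows, key):
    # Index both sides by primary key, then compare.
    src = {row[key]: row for row in source_rows}
    tgt = {row[key]: row for row in target_rows}
    return {
        # Present in SQL but never arrived in Cassandra.
        "missing": sorted(k for k in src if k not in tgt),
        # Present on both sides but with different column values.
        "mismatched": sorted(k for k in src if k in tgt and src[k] != tgt[k]),
        # Present in Cassandra only, for example rows written after cutover.
        "extra": sorted(k for k in tgt if k not in src),
    }
```

An empty `missing` and `mismatched` list is the signal that the source records made it across intact.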

Get more detail for each of these four steps and learn best practices for migrating from a SQL database to Cassandra.

Get started on your Cassandra migration

There are straightforward migration and upgrade options to help move your data into Astra DB or Cassandra 4.0. From there, your organization will be well positioned to handle even explosive data growth and wide-ranging data types, while taking advantage of a strong foundation for innovation.

Take a closer look at the benefits of moving your data to Astra DB.