Databases Should Be the Most Boring Thing in Your Data Center
There are many ways to test software, but distributed systems pose a different challenge by their very nature. It can be deceptively hard to predict how small failures in a single system lead to pathological errors across large clusters. Testing requires a lot of automation and scale, because developers want quick feedback on newly committed code. Distributed databases like Cassandra need functional testing with production workloads under a variety of conditions. Making that work reliably and easily is critical to balancing agile code development with confidence in what’s being built.
DataStax engineers love solving big distributed system problems and, in this specific case, large-scale distributed testing problems. We have been working on this for quite a while, and the code is now available as Fallout for download and use under the Apache License.
We realize that this isn’t a general issue for every end user of Apache Cassandra™; however, having quality code in a production environment is something we can all agree is pretty important. Code quality is one of the dominant themes for the upcoming Cassandra 4.0 release, and as part of our ongoing commitment to the Apache Cassandra project, we are opening Fallout to speed up testing on builds. Used in conjunction with the Harry fuzz testing tool and dtest, it offers developers automated peace of mind as they continue to add features to Apache Cassandra. Most importantly, it offers peace of mind to end users as you deploy Cassandra in your applications. This fits nicely with the philosophy of “Databases should be the most boring thing in your data center.”
I’ll outline some of the key features here; another blog post will follow with more details, and if you need all the details, just follow the doc link below. As described above, Fallout is a large-scale distributed testing tool. It allows engineers to define and run experiments on a running cluster in an automated and repeatable fashion. Experiments can be run in series or in parallel depending on the type of test, but most importantly, as engineers add new experiments, the library of available tests grows over time. In addition, Fallout integrates with NoSQLBench, which can simulate production workloads while the cluster is abused with whatever evil tests are designed. Because it was built from the ground up to work with existing testing tools and infrastructure, Fallout can be used to solve this kind of distributed testing problem for any database or piece of infrastructure. We hope that by open sourcing this project, we will see more variety in the types of tests and workloads common in the Cassandra community, as well as in other commonly used databases.
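To make the "series or parallel" idea concrete, here is a minimal Python sketch of an experiment made of phases: phases run one after another, and the steps inside a phase run at the same time. This is not Fallout's API or its test definition format (see the docs for that). Every name here (`run_phase`, `run_experiment`, and the placeholder steps) is hypothetical and exists only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical model (not Fallout's): a step is a named callable that
# returns True on success; a phase is a group of steps run concurrently.
Step = Callable[[], bool]
Phase = Dict[str, Step]


def run_phase(phase: Phase) -> Dict[str, bool]:
    """Run every step of one phase in parallel and collect results by name."""
    if not phase:
        return {}
    with ThreadPoolExecutor(max_workers=len(phase)) as pool:
        futures = {name: pool.submit(step) for name, step in phase.items()}
        return {name: future.result() for name, future in futures.items()}


def run_experiment(phases: List[Phase]) -> List[Dict[str, bool]]:
    """Run phases in series, stopping after the first phase with a failure."""
    results: List[Dict[str, bool]] = []
    for phase in phases:
        outcome = run_phase(phase)
        results.append(outcome)
        if not all(outcome.values()):
            break
    return results


# Placeholder steps; a real test would drive a workload or inject a fault.
def load_data() -> bool:
    return True


def read_workload() -> bool:
    return True


def kill_node() -> bool:
    return True


def verify_data() -> bool:
    return True


experiment: List[Phase] = [
    {"load_data": load_data},                                  # phase 1
    {"read_workload": read_workload, "kill_node": kill_node},  # phase 2, parallel
    {"verify_data": verify_data},                              # phase 3
]

if __name__ == "__main__":
    for number, outcome in enumerate(run_experiment(experiment), start=1):
        print(f"phase {number}: {outcome}")
```

In this sketch, a workload and a fault injection share a phase so that they overlap in time, while loading and verification stay in their own phases so they run strictly before and after.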
I’m sure you are wondering about the scale part of Fallout, and that’s the part I’m pretty excited about. An important part of operator sanity in scale scenarios is proper orchestration. Fallout uses Kubernetes to manage the Cassandra cluster lifecycle by way of our recently open sourced Kubernetes operator. This allows for hands-off deployment and teardown of clusters under test, so developers can concentrate on writing Cassandra code and experiments. If you want to go Kubernetes all the way down, we even have a Docker image for you to run the Fallout service.
All the bits and docs can be found here:
GitHub Repo: https://github.com/datastax/fallout
Docker Image: https://hub.docker.com/r/datastax/fallout
Online Docs: https://github.com/datastax/fallout/tree/master/docs
If you have questions or problems, you can use GitHub issues for the project or, if you prefer, email oss-fallout@datastax.com. Feedback is definitely welcome, and stay tuned for more as we keep expanding and enhancing this project.