Company · July 2, 2018

Cassandra’s Journey — Via the Five Stages of Grief

New technologies usually need to fight their way into the hearts of the people who will end up using them. This fight is often long and hard, and Apache Cassandra didn’t have it any easier than any of the other technological developments of our time.

In 2008 the database world was a wild place. Large data infrastructures were testing the limits of relational databases, and companies like Google and Amazon were about to run out of options on how to handle their massive data volumes.

At the time I was working at an education company called Hobsons, and was one of those infrastructure engineers trying to get more scale out of my tired old databases. Cassandra caught my eye as something with a great foundation in computer science that also solved many of the issues I was having.

But not everyone was as convinced as I was.

If you’re not familiar with the Kübler-Ross model of grieving, also known as The Five Stages of Grief, it describes how most people end up dealing with loss and change. Looking back, I realize now that the en-masse abandonment of relational databases in favor of something more appropriate for the new world of big data, namely Cassandra, very much followed this same model.

Here’s how it happened from my POV in the trenches of data infrastructure.

Stage 1: Denial - The individual believes the prognosis is somehow mistaken and clings to a false, preferable reality.

In 2008, Apache Cassandra was the closing curtain on a 30-year era of database technology, so denial was an easy and obvious response to it in the early years. Of course, many of the new databases being released weren’t exactly of the highest quality. Coming from a database with years and years of production vetting, it was easy to throw some shade at the newcomers.

Cassandra was in that camp.

But it could do things relational databases couldn’t, like stay online when physical nodes fail or scale online by just adding more servers. Administrators called it a toy and developers called it a fad: just some kids trying to be cool. Cassandra kept growing, though, and solving real problems. The replication story was unmatched and was catching a lot of attention. There were ways to replicate a relational database, but they were hard to set up and didn’t work well. Data integrity required one primary database with all others being secondary or read-only, and failure modes contributed to a lot of offline pages displayed on websites. But generally speaking, people only want to make the effort to fix things when they absolutely have to, and for now, relational databases weren’t really broken.

Stage 2: Anger - The individual recognizes that denial cannot continue and becomes frustrated.

Slowly but surely, people started to move notable use cases with real production workloads over to Cassandra. There were happy users telling incredible stories of scale and resiliency! The company names attached to these stories became less cutting edge and more mainstream, and it was becoming clear to many that this wasn’t just a fad. It was starting to make a real impact and could be coming to a project meeting soon.

I remember one of my first consulting gigs at a big-name company. I was working with the development team on some data models and in the back of the room was a group of engineers, arms crossed, not looking happy. When I talked to them, they made it quite clear that this change was not welcome, and that “This is going to ruin the company.” They were the Oracle database administrators and they saw this at best as a bad idea and at worst as a threat to their livelihood. In the ensuing months I experienced similar tense moments with other groups of engineers.

Stage 3: Bargaining - The individual tries to postpone the inevitable and searches for an alternate route.

Despite roadblocks and delay tactics, the needs of businesses everywhere dictated a move to high-scaling technologies like Apache Cassandra. It was solving real problems in a way no other database could, no matter how much “tuning” you did on your other solutions.

This led to situations where teams started negotiating the terms of a Cassandra roll-out. One team I worked with wasn’t allowed to put Cassandra in any critical path close to customers. Ironically, when the systems in the critical path started failing, the only system that could withstand the conditions that led to their failure was the much-maligned Cassandra cluster.

Then, a new breed of database appeared that tried to capitalize on the fear of non-relational databases. It was called NewSQL and promised full ACID transactions along with Cassandra-like resiliency, but NewSQL never quite worked out when real-world failures presented themselves. That’s how infrastructure goes: It burns half-baked ideas to the ground and calls in a welcoming party for the good ideas.

Stage 4: Depression - "I'm so sad, why bother with anything?"

Cassandra started gaining traction in every corner of the tech world. As the solutions implemented to avoid this inevitability failed, fighting the future became less and less appealing. There was a massive growth period when adoption spread from the early adopters to the late adopters, and they were all talking about it. The relational database holdouts finally just stopped talking about it and did something else. Many decided to move to data warehousing, where they could put their amazing SQL skills to use via complex queries.

Stage 5: Acceptance - The individual embraces the inevitable future.

And then there was a moment, and nobody knows exactly when it was, when Cassandra became a mainstream database. It might have been when everywhere you looked there was yet another great use case being talked about. As the saying went, anyone doing something at scale on the Internet was probably using Cassandra. For me, the moment I realized Cassandra had finally been accepted was when I saw large numbers of database administrators signing up for training on DataStax Academy. It was like a big shift had occurred in the day-in, day-out world of databases. Application developers were always pushing the cutting edge, but administrators had to keep those applications running until they were replaced, and their new foundation of choice was Cassandra.

When you think about it, you really see the same reaction to every new paradigm-shifting technology. The early days of the computer, the Internet, and now blockchain all faced the same fear and doubt as the early days of Cassandra. Collectively, we deny the truth, rage at inevitability, scramble for an alternative, fall into despair, and finally accept and embrace our new reality. What comes after Cassandra is anyone’s guess, but as with people, usually the best kind of change comes little by little and goes almost completely unnoticed until it’s staring you in the face, and you say, “Wow, you’ve changed!” Here’s to the Cassandra of the past, the present, and the future.
