Company
January 31, 2018

Apache Cassandra® Community Gathering at Data Day Texas

Patrick McFadin, VP, Developer Relations & Cassandra Committer

Last weekend, Data Day Texas premiered a new Apache Cassandra® track--a great start to 2018 for our community--with presentations on data modeling, use cases, and operations, as well as the chance to see and collaborate with old friends and newcomers alike.

This was the biggest year yet for Data Day Texas, which sold out at 1,000 attendees. We are humbled that our years of community investment continue to pay dividends with Cassandra developers; when the Cassandra community meets, it’s never a small thing. In addition to the Cassandra talks, there were tracks on Apache Spark, Graph, streaming technology, and even AI. If you are a data nerd, this was a pretty amazing day of talks.

So what did you miss in the Cassandra track? There were a lot of familiar names and faces from the many years of Cassandra community. Below are some of the highlights.

Jeff Carpenter, a member of my DataStax evangelist team and author of Cassandra: The Definitive Guide, led off the track with “Cassandra Architecture FTW!” Since we still have a lot of people new to the technology, he gave the track the right start with a deep technical dive into the architecture of Cassandra and why it’s a great choice for modern workloads. After his talk, he hung out at the DataStax booth for a book signing, which quickly burned through the box of books he brought.

Ben Bromhead brought “Cassandra and Kubernetes,” which touched on a really hot topic for a lot of members of the community. Deploying Cassandra on many machines can be difficult and introduces a lot of potential for error. His talk introduced the concept of using Kubernetes to maintain a cluster and covered some of the important things you should know, like how to manage down nodes and add capacity to your cluster.

Jon Haddad, a consultant from The Last Pickle, showed off some of his chops on optimizing performance in a Cassandra cluster. Jon has been bringing performance talks since the early days of the Cassandra Summit, and he didn’t disappoint. He highlighted some key parameters you can adjust based on workload to make a big impact. For instance, in a read-heavy workload, adjusting your disk IO parameters can drastically increase throughput and reduce latency. You can read more in his posted slide deck.

Dikang Gu and Pengchao Wang brought an update on some of the experimental work they have been doing at Facebook with “Cassandra pluggable storage engine.” Using the RocksDB storage engine, they are looking for various ways to improve latency and throughput on specific workloads. You can read more about their proposal on Apache Jira. Dikang will also be on an upcoming Distributed Data Show talking about this topic, so keep an eye out for that soon.

Aaron Ploetz, longtime Stack Overflow point leader for Cassandra, condensed his knowledge into “Performance Data Modeling at Scale.” As anyone who has worked with Cassandra knows, the difference between a good experience and a great one is down to the data model. He outlined various use cases and how to model for them properly. He will be interviewed on the Distributed Data Show, too, so you won’t want to miss his insights there.

Jonathan Ellis, DataStax co-founder, CTO, and VP of product management, has been talking about Apache Cassandra longer than anyone. He brought a brand new and timely topic to Data Day: “Cassandra and the Cloud,” a modern comparison of Cassandra and the leading proprietary cloud vendor databases. If you have never seen a Jonathan talk, let me tell you that deep dive is an understatement. He covered topics such as data models, consistency models, and replication. This will not be the last time this talk is given, so keep an eye out for it when it comes to a city near you, or catch him on the Distributed Data Show in an upcoming episode.

It was exciting to see a large Apache Cassandra community event, but more importantly, it was good to see the tribe gathering again and welcoming new members to the Cassandra community. A lot of friends, old and new, were all focused on making applications work with the world’s best open source database.

We are looking forward to an even bigger Data Day event in the fall, this time in San Francisco. If you are interested in giving a talk, let us know (community@datastax.com) and we can put you in touch with the right people. The community would love to hear from you and learn what you can share about your use case. Hope to see you there!
