Technology | August 19, 2014

Multi-Datacenter Cassandra on 32 Raspberry Pi’s

Here at DataStax, my fellow intern Daniel Chin and I built a 32-node DataStax Enterprise cluster running on Raspberry Pis! We are showcasing the always-on, fault-tolerant nature of Cassandra by letting anybody take down an entire data center with the press of a Big Red Button in our lobby.

Being able to withstand a data center going down is not just an edge case; it is an absolute necessity for the highly available applications Cassandra powers. While the cloud is far more flexible for production use, nothing beats a big shiny hardware display for a demo.

Our main goal for this project was to take the abstract concept of fault tolerance and make it something you can see in action and interact with. We built upon the work of Travis Price and the DataStax Sales Engineering team, who pioneered the use of Raspberry Pis to demonstrate Cassandra.

The Build Process:

The Hardware:

We wanted the overall display to look clean and professional enough for the lobby at our headquarters, while exposing enough of the technology to make a compelling and unique demo.

As DataStax is a software company, fabricating hardware came with a unique set of challenges. ("Hey, do we have a shop vac?" "No.") I ended up drawing on my experience with SolidWorks (a popular CAD program) from high school FIRST Robotics to design all of the acrylic, and had it laser cut at a local machine shop. The assorted mounting hardware and the pedestal were sourced from McMaster-Carr.

The Electronics:

Each Pi is running at its factory clock settings and is completely unmodified. To avoid latency problems and to ensure our Pis stayed online, we transitioned off of WiFi and used Ethernet cables and switches instead.

To get power to each Pi, we use micro USB cables connected to five-port USB hubs, which are in turn plugged into two power strips, one for each data center. This makes the cluster easy to set up and doesn't require building any custom power distribution rails.

Our large red button is connected to an Arduino that actuates a power relay to cut AC power to the network switch for Datacenter RED. The Arduino provides timing control, and makes the button inoperable during the network outage.
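The button logic described above can be modeled in a few lines. This is only a Python sketch of the behavior, not the Arduino firmware we actually run: the class name, method names, and the 30-second outage length are all hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class OutageController:
    """Models a button that cuts power to a network switch for a fixed time.

    All names and the default duration are hypothetical; the real timing
    lives in the Arduino sketch, which is not shown here.
    """
    outage_seconds: float = 30.0              # assumed outage length
    relay_open_until: float = float("-inf")   # time at which power returns

    def switch_powered(self, now: float) -> bool:
        """Return True if the switch has AC power at time `now`."""
        return now >= self.relay_open_until

    def press(self, now: float) -> bool:
        """Handle a button press; return True if it started an outage."""
        if not self.switch_powered(now):
            return False  # button is inoperable while the outage is running
        self.relay_open_until = now + self.outage_seconds
        return True


ctl = OutageController(outage_seconds=30.0)
started = ctl.press(now=0.0)     # first press cuts power to the switch
ignored = ctl.press(now=10.0)    # pressed again mid-outage: nothing happens
restored = ctl.switch_powered(now=30.0)
```

Keeping the lockout in the controller rather than in the button wiring means a visitor hammering the button cannot extend the outage or flap the relay.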

The Software:

The cluster is set up as a two-datacenter DSE 4.5 cluster, with OpsCenter 5.0 running to show the status of all the nodes. As we expected, running a high-performance enterprise database on a computer with a single-core 700 MHz processor and 512 MB of RAM is not trivial.
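To make the two-datacenter idea concrete, here is a rough Python sketch of how a keyspace replicated to both data centers keeps serving requests when one of them loses its network. The keyspace name `demo`, the second datacenter name `BLUE`, and the replication factor of 3 per datacenter are assumptions for illustration only; this post does not state the demo's actual settings. The helper functions are hypothetical, not part of any driver.

```python
def create_keyspace_cql(name, replicas_per_dc):
    """Build a CQL statement that replicates a keyspace to several datacenters."""
    opts = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (f"CREATE KEYSPACE {name} WITH replication = "
            f"{{'class': 'NetworkTopologyStrategy', {opts}}};")


def quorum(rf):
    """A quorum is a strict majority of the replicas."""
    return rf // 2 + 1


def local_quorum_ok(replicas_per_dc, live_per_dc, local_dc):
    """LOCAL_QUORUM only needs a majority of replicas in the coordinator's DC."""
    return live_per_dc.get(local_dc, 0) >= quorum(replicas_per_dc[local_dc])


def each_quorum_ok(replicas_per_dc, live_per_dc):
    """EACH_QUORUM needs a majority of replicas in every datacenter."""
    return all(live_per_dc.get(dc, 0) >= quorum(rf)
               for dc, rf in replicas_per_dc.items())


replication = {"RED": 3, "BLUE": 3}     # assumed replication factors
print(create_keyspace_cql("demo", replication))

# Someone presses the button: every replica in RED becomes unreachable.
live = {"RED": 0, "BLUE": 3}
print(local_quorum_ok(replication, live, local_dc="BLUE"))  # True
print(each_quorum_ok(replication, live))                    # False
```

Under these assumptions, clients talking to the surviving datacenter at LOCAL_QUORUM never notice the outage, while a consistency level that insists on both datacenters would fail until RED comes back.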

We are using vnodes, and have throttled all of the cassandra.yaml values to the lowest settings we can to squeeze C* within our hardware constraints.

With Cassandra running, each Pi has 8 to 11 megabytes of free RAM.

For reference, our documentation currently recommends 16 cores and 24GB of RAM for a production system.
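To put those numbers side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted in this post and assumes 1 GB = 1024 MB; the variable names are just for illustration.

```python
PI_RAM_MB = 512
PI_CORES = 1
RECOMMENDED_RAM_MB = 24 * 1024   # 24 GB, assuming 1 GB = 1024 MB
RECOMMENDED_CORES = 16

ram_ratio = RECOMMENDED_RAM_MB / PI_RAM_MB    # 48.0
core_ratio = RECOMMENDED_CORES / PI_CORES     # 16.0

# Share of each Pi's RAM still free with Cassandra running (8 to 11 MB).
free_low, free_high = (mb / PI_RAM_MB for mb in (8, 11))

print(f"Recommended RAM is {ram_ratio:.0f}x what a Pi has")
print(f"Recommended core count is {core_ratio:.0f}x what a Pi has")
print(f"Free RAM per Pi: {free_low:.1%} to {free_high:.1%}")
```

In other words, each node runs with 1/48th of the recommended memory, and roughly 1.6% to 2.1% of that is left over once Cassandra is up.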

While this cluster won't be setting speed benchmarks any time soon, we hope that it gets people excited about Cassandra and its incredible always-on capabilities!
