Technology | March 28, 2023

How Mongoose Will Bring JSON-Oriented Developers to Apache Cassandra

A new partnership between the open source projects Stargate and Mongoose will create a fully idiomatic experience for JavaScript developers on Apache Cassandra.

Apache Cassandra® is becoming the best database for handling JSON documents. If you’re a Cassandra developer who finds that statement provocative, read on.

In a previous post, I discussed using data APIs and data modeling to mold Cassandra into a developer experience more idiomatic to the way developers think, thus improving developer productivity while preserving reasonable database performance and scale. It’s a great hypothesis, and one that needs to be tested in the context of a particular developer idiom and developer community.

Mongoose, an object data modeling (ODM) library usually used with the MongoDB driver, is an open source project with a significant community of JavaScript developers around it. At the Stargate project, the open source data API gateway designed to work with Cassandra, we’ve partnered with Mongoose. We’re working on an upcoming JSON API that will be released together with a version of Mongoose that connects to Cassandra through that API. This creates an end-to-end stack for Mongoose developers that is fully open source. It’s game-changing for Mongoose developers, and opens an important new chapter for Cassandra.

In this post, I’ll discuss how to provide a developer-friendly JSON idiom using Cassandra together with Stargate, and how we’re working to do just that for Mongoose developers.

The Goldilocks of JS communities

In October 2022, we released a new version of Stargate. With the new Version 2, individual APIs are no longer embedded in the core Stargate coordinator code, but instead separated out into individual services. This improves Stargate’s operational efficiency; individual API services can now be deployed and scaled independently. This also makes new API services easier to develop. As long as they abide by the service boundary, these services can be developed in parallel with and independent of core Stargate development work.

We then looked for a truly idiomatic experience we could deliver to developers. With 18 million developers, JavaScript is the world’s most popular programming language, and JSON is the standard way that JavaScript developers structure data. However, 18 million people is not a community; it is many communities. We needed the “Goldilocks” of JavaScript communities: big enough to be significant, but small enough to be focused. We found the right community built around Mongoose, an object data modeling library used with applications that connect to MongoDB. Mongoose has several important characteristics:

  • It’s JavaScript-centric
  • It enjoys broad adoption, with roughly 2 million GitHub repositories listing Mongoose as a dependency
  • Mongoose creator Valeri Karpov’s active leadership provides a clear focus
  • It’s an open source project that has lacked an open source database since MongoDB’s decision to move to a shared source model with its Server Side Public License

Developers interact less with a database than with its data model. Stargate’s original Document API handles JSON by making it look like a traditional Cassandra table. This puts a burden on JSON-oriented developers, who have to think in terms of Cassandra data structures, and on Cassandra’s row-oriented indexing logic, because a single JSON document gets spread across multiple rows.
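To make the multi-row layout concrete, here is a minimal TypeScript sketch of path-per-row shredding. The `ShreddedRow` shape and the `shredToRows` function are hypothetical names invented for illustration; they are a simplification and should not be read as the Document API’s actual schema or code.

```typescript
// Illustrative sketch only: ShreddedRow and shredToRows are hypothetical,
// not the Document API's real schema or implementation.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface ShreddedRow {
  docId: string;
  path: string[]; // e.g. ["address", "city"] or ["tags", "0"]
  value: string | number | boolean | null;
}

function shredToRows(docId: string, node: Json, path: string[] = []): ShreddedRow[] {
  // A scalar (or null) is a leaf: emit exactly one row for it.
  if (node === null || typeof node !== "object") {
    return [{ docId, path, value: node }];
  }
  // Arrays and objects recurse, extending the path with the index or key.
  const rows: ShreddedRow[] = [];
  if (Array.isArray(node)) {
    for (let i = 0; i < node.length; i++) {
      rows.push(...shredToRows(docId, node[i], path.concat(String(i))));
    }
  } else {
    for (const key of Object.keys(node)) {
      rows.push(...shredToRows(docId, node[key], path.concat(key)));
    }
  }
  return rows;
}
```

Under this sketch, a document such as `{ name: "Ada", address: { city: "Paris" }, tags: ["a", "b"] }` becomes four rows, one per leaf value, so reading the whole document back means gathering and reassembling all of them.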

Our new JSON API departs from this data model, and instead relies on a data model we call “super shredding.” You can learn more about super shredding at Aaron Morton’s talk at Cassandra Forward, a free digital event on March 14. In short, we take advantage of Cassandra’s wide-column nature to store one document per row, knowing that a Cassandra row can handle even very large documents. That row also has a set of columns dedicated to storing standard metadata characteristics of the JSON document. The result is easier to index, and it gives us a means of preserving and retrieving metadata.
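The one-document-per-row idea can be sketched in TypeScript as follows. The column names (`docJson`, `existingPaths`, `textValues` and so on) and the `superShred` function are assumptions made up for this illustration; the actual columns used by the JSON API may well differ.

```typescript
// Illustrative sketch only: the column names below are hypothetical,
// not the JSON API's real table definition.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

interface SuperShreddedRow {
  docId: string;
  docJson: string;                       // the whole document, stored once
  existingPaths: string[];               // every path present in the document
  textValues: Record<string, string>;    // path -> string value
  numberValues: Record<string, number>;  // path -> numeric value
  boolValues: Record<string, boolean>;   // path -> boolean value
  arraySizes: Record<string, number>;    // path -> array length
}

function superShred(docId: string, doc: { [key: string]: Json }): SuperShreddedRow {
  const row: SuperShreddedRow = {
    docId,
    docJson: JSON.stringify(doc),
    existingPaths: [],
    textValues: {},
    numberValues: {},
    boolValues: {},
    arraySizes: {},
  };

  // Walk the document once, recording metadata about each path in the same row.
  function visit(node: Json, path: string): void {
    if (path !== "") row.existingPaths.push(path);
    if (typeof node === "string") {
      row.textValues[path] = node;
    } else if (typeof node === "number") {
      row.numberValues[path] = node;
    } else if (typeof node === "boolean") {
      row.boolValues[path] = node;
    } else if (Array.isArray(node)) {
      row.arraySizes[path] = node.length;
      for (let i = 0; i < node.length; i++) visit(node[i], path + "." + i);
    } else if (node !== null) {
      for (const key of Object.keys(node)) {
        visit(node[key], path === "" ? key : path + "." + key);
      }
    }
  }

  visit(doc, "");
  return row;
}
```

The design point is that the full document comes back from a single row, while a query on one field only needs to consult the matching metadata column, which is the kind of structure a column index can serve.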

We will then front this data model with our new JSON API, using the same mQuery specification that Mongoose uses as our guiding requirement for which calls the API needs to support. When complete, this should enable any of the more than 2 million Mongoose-dependent applications to run against open source Cassandra or DataStax’s hosted Cassandra service, Astra DB, with just a configuration change.
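As a rough illustration of the kind of query semantics such an API has to honor, the TypeScript sketch below evaluates a small subset of MongoDB-style filter operators (`$eq`, `$gt`, `$lt`, `$in`) against flat documents. The `matches` and `matchesCondition` functions and the type names are hypothetical, and the semantics are deliberately simplified (for example, comparisons only handle numbers); this is not how Stargate or Mongoose implements filtering.

```typescript
// Illustrative sketch only: a simplified subset of MongoDB-style filters,
// not the JSON API's or Mongoose's implementation.
type Scalar = string | number | boolean | null;
type Condition =
  | Scalar
  | { $eq?: Scalar; $gt?: number; $lt?: number; $in?: Scalar[] };
type Filter = { [field: string]: Condition };
type FlatDoc = { [field: string]: Scalar };

function matchesCondition(value: Scalar | undefined, cond: Condition): boolean {
  // Shorthand form { field: value } means plain equality.
  if (cond === null || typeof cond !== "object") {
    return value === cond;
  }
  // Operator form: every operator present must hold.
  if (cond.$eq !== undefined && value !== cond.$eq) return false;
  if (cond.$in !== undefined && (value === undefined || cond.$in.indexOf(value) === -1)) return false;
  if (cond.$gt !== undefined && !(typeof value === "number" && value > cond.$gt)) return false;
  if (cond.$lt !== undefined && !(typeof value === "number" && value < cond.$lt)) return false;
  return true;
}

function matches(doc: FlatDoc, filter: Filter): boolean {
  // All fields named in the filter must match (implicit AND).
  for (const field of Object.keys(filter)) {
    if (!matchesCondition(doc[field], filter[field])) return false;
  }
  return true;
}
```

A filter like `{ category: "books", price: { $gt: 10 } }` is the same shape a Mongoose application would pass to a call such as `Model.find()`, which is why supporting these semantics is what lets existing application code stay unchanged.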

With Mongoose and the new JSON API, we’ll provide a fully idiomatic experience to JSON-oriented JavaScript developers, giving them the scale and performance of Cassandra underpinning an authentic JSON data model.

Mongoose creator Karpov will also speak at the Cassandra Forward event, demonstrating a simple e-commerce application that uses the Stargate version of Mongoose, open source Stargate and the DataStax Enterprise (DSE) version of Cassandra. You’ll be able to download the working code for this application and the supporting platform pieces from GitHub. While we have enough code to run this application, we are not yet code complete. For example, we run against DSE right now because we need storage-attached indexing (SAI), which works with DSE and is planned for release in Cassandra 5.0 later this year.

Contributing back to Cassandra

Cassandra isn’t a static piece of software; it’s a vibrant and evolving open source project. So we are also continuing a longstanding Cassandra tradition in which needs that emerge client-side, as with SAI, foster changes on the database side. Stargate’s Mongoose work has likewise prompted a set of proposals for Cassandra around global sort and advanced query filtering that will not only make Stargate’s JSON API and Mongoose client better, but will also add powerful new features to Cassandra Query Language. This is a great reminder that data engineers and application developers are not two different communities, but complementary cohorts of the extended Cassandra community.

And JSON is just the first step. Essentially, we are taking the building blocks of Cassandra, Stargate and a reasonably efficient Cassandra data model and building a document database that you interact with through a JSON API. In other words, we’ve used super shredding to create a purpose-built database that better serves the community of Mongoose developers.

With the modular architecture of Stargate v2, and the proof point of Mongoose for the idiomatic approach, we are ready to take on new developer communities that organize around a particular software development idiom. The process by which we’ve harnessed Cassandra for Mongoose is repeatable — and it’s one that we will repeat. In so doing, we dramatically expand the number of developers and use cases that Cassandra can address, which is the sort of goal worthy of an open source project.
