Company | May 16, 2017

DataStax Drivers Fluent APIs for DSE Graph are out!

Kevin Gallardo

Fire up your IDE, the time has come!

Following the DataStax Enterprise 5.1 release, DataStax released its first non-beta versions of the Fluent APIs for DSE Graph.

This new feature brings the DataStax Enterprise Drivers into full compatibility with the Apache TinkerPop GLVs, and we even included additional functionality to make developing graph applications faster and easier.

"You said 'GLV'"?

"Like... Great Lasagna Volume?"

No.

Most graph database enthusiasts should by now be aware of the Apache TinkerPop project and its main component, the Gremlin traversal language (and if not, no lasagna for you tonight). The Gremlin language is syntactically simple, although it can be semantically complex. It is composed exclusively of chained or nested function calls, which together define a graph traversal: the chain of steps that produces or returns the desired data from your graph database.

Because of this syntactic simplicity, Gremlin can be expressed in any programming language that supports chaining or nesting functions, which includes every major programming language. A version of Gremlin in a specific programming language is called a Gremlin Language Variant (referred to as "GLV" in the rest of this post). Therefore, any Gremlin expressed in a programming language is a GLV, even the original Gremlin-Java.

"If they're all variations, is there a common representation?"

One Gremlin to rule them all.

With the introduction of the GLV concept in TinkerPop came the need for a language-agnostic representation of Gremlin, so that any GLV would be an adaptation of it in the desired programming language. This language-agnostic Gremlin is the Gremlin Bytecode. The Gremlin Bytecode is not meant to be exposed directly to users. However, it is useful for a database driver (the reason is explained further down), and it is designed so that any GLV traversal can be translated into Bytecode (i.e. the generic representation), and vice versa.
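To make the idea concrete, here is a minimal Python sketch of how a chain of function calls can be captured as plain, language-neutral data. It is a toy and not TinkerPop's actual Bytecode format: the ToyTraversal class and the (step name, arguments) tuple layout are assumptions made purely for illustration.

```python
class ToyTraversal:
    """Toy stand-in for a traversal: records every chained call as data."""

    def __init__(self, steps=None):
        # Each recorded step is a (step_name, args) tuple (hypothetical layout).
        self.steps = list(steps or [])

    def __getattr__(self, name):
        # Only called for attributes that do not exist, i.e. step names.
        if name.startswith("_"):
            raise AttributeError(name)

        def step(*args):
            # Return a new traversal so the original is never mutated.
            return ToyTraversal(self.steps + [(name, args)])

        return step


g = ToyTraversal()
bytecode = g.V().values("name").range(0, 4).groupCount().steps
print(bytecode)
# [('V', ()), ('values', ('name',)), ('range', (0, 4)), ('groupCount', ())]
```

Nothing in the recorded list is specific to Python, which is the point: the same list could just as well have been produced by a chain written in Java or any other language.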

Examples of GLV with Gremlin-Java and Gremlin-Python

Gremlin is defined by a set of function calls that can be chained or nested. Every GLV can express Gremlin, but each one varies according to the syntactic capabilities of its host language, to make Gremlin more convenient and natural for that language's users. The two most common GLVs at the moment are Gremlin-Java and Gremlin-Python.

A Gremlin-Java traversal example:

g.V().values("name").range(0, 4).groupCount()

Gremlin-Python can make use of Python's collection syntax to rewrite this traversal as follows:

g.V().name[0:4].groupCount()

Notice how .name[0:4] replaces .values("name").range(0, 4) in the Gremlin-Python GLV. There are a few other examples like this that are worth knowing about when using a GLV.
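For the curious, this kind of sugar can be built from two hooks in Python's data model, __getattr__ and __getitem__. The sketch below shows one way to do it. SugaredTraversal is a hypothetical class written for this post, not the Gremlin-Python source, and it only handles plain start:stop slices.

```python
class SugaredTraversal:
    """Hypothetical traversal that records steps and adds Python sugar."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def _add(self, name, *args):
        return SugaredTraversal(self.steps + ((name, args),))

    # Explicit steps, spelled as in Gremlin-Java.
    def V(self):
        return self._add("V")

    def values(self, *keys):
        return self._add("values", *keys)

    def range(self, low, high):
        return self._add("range", low, high)

    def groupCount(self):
        return self._add("groupCount")

    # Sugar: an unknown attribute such as .name becomes values("name").
    def __getattr__(self, key):
        if key.startswith("_"):
            raise AttributeError(key)
        return self.values(key)

    # Sugar: a slice such as [0:4] becomes range(0, 4).
    def __getitem__(self, index):
        if isinstance(index, slice):
            return self.range(index.start, index.stop)
        raise TypeError("only slices are supported in this sketch")


g = SugaredTraversal()
plain = g.V().values("name").range(0, 4).groupCount()
sugared = g.V().name[0:4].groupCount()
print(plain.steps == sugared.steps)  # True
```

Both spellings record exactly the same steps, so whatever consumes the traversal cannot tell which one was used.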

"Enough of this nonsense, what do the DataStax Drivers have to do with this?"

Well, in the newest version of DataStax Enterprise Graph, we have added the ability to consume graph traversals in this language-agnostic Bytecode format. This means that all the official GLVs can now be used against DataStax Enterprise Graph.

To make this work, a language driver needs to be able to convert a traversal from the GLV into its generic Bytecode format and then send it to the database server. Previously, drivers that could do so existed only in the TinkerPop project. We have now adopted this feature, and extended the original GLV capabilities, in the latest release of the DataStax Drivers, to provide what we call the Fluent API.

Once DSE Graph receives a traversal in the Bytecode format, it translates it back into a traversal of the server's Gremlin Traversal Machine runtime language, and processes it.
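As a rough mental model of that server-side step, the sketch below walks a recorded list of steps and applies each one to an in-memory list of vertices. Everything here is hypothetical: the evaluate function, the (step name, arguments) layout, and the sample VERTICES data are invented for illustration, and the real Gremlin Traversal Machine is far more sophisticated.

```python
from collections import Counter

# Invented sample data standing in for the vertices of a graph.
VERTICES = [
    {"name": "marko"},
    {"name": "vadas"},
    {"name": "marko"},
    {"name": "josh"},
    {"name": "peter"},
]


def evaluate(bytecode, vertices):
    """Replay recorded steps, in order, against an in-memory vertex list."""
    stream = []
    for name, args in bytecode:
        if name == "V":
            stream = list(vertices)
        elif name == "values":
            stream = [v[key] for v in stream for key in args if key in v]
        elif name == "range":
            low, high = args
            stream = stream[low:high]
        elif name == "groupCount":
            # groupCount emits a single map of value -> occurrences.
            stream = [dict(Counter(stream))]
        else:
            raise ValueError("unsupported step in this sketch: " + name)
    return stream


bytecode = [("V", ()), ("values", ("name",)), ("range", (0, 4)), ("groupCount", ())]
result = evaluate(bytecode, VERTICES)
print(result)
# [{'marko': 2, 'vadas': 1, 'josh': 1}]
```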

DataStax now offers clean and concise utilities that will allow users to use GLVs against DSE Graph backed by the DataStax Enterprise Drivers.

The Fluent API lets users interact with DSE Graph via the Gremlin Traversal API. Compared with the existing String-based query interface, it provides a more familiar interface, compile-time checking, and easy client-side navigation of the Traversal API within an IDE.

It comes with all the DataStax Enterprise Drivers benefits

When using a GLV backed by a DataStax Enterprise Driver (also called the "Fluent API"), users directly benefit from all the advantages and advanced features offered by the DSE Drivers, paired with the ease of use of the GLV Traversal API. With the Fluent API, users can take advantage of the DSE Drivers':

  • automatic cluster discovery
  • built-in load-balancing features
  • datacenter awareness
  • failure recovery policies (RetryPolicy, ReconnectionPolicy)
  • speculative executions
  • enterprise-grade client authentication
  • client-server encryption
  • advanced logging capabilities

And many other features that are essential to any application in production, which all come with zero added effort.

It also comes with extended functionalities!

In addition to exposing the original GLVs, the Fluent API exposes additional traversal features that are made especially for DataStax Enterprise Graph. Therefore, you can leverage DSE Search specific features, integrated in DataStax Enterprise, directly through the Fluent API. The Fluent API exposes DSE Search predicates that will automatically leverage the server-side search engine without having to think about using another tool, interface or language. We now expose Geometric and Geographic-based search predicates as well as advanced full-text search predicates. Have a look at the new predicates here.

Code examples, because we're all here for that.

The DSE Drivers have released artifacts and packages that will provide you with the new almighty utility class DseGraph:

  • It will be the entry point to using the Fluent API.
  • The DseGraph class will provide a method to easily create a TraversalSource, which is the entry point for creating Graph Traversals.
  • Once the TraversalSource is obtained, users can easily create Traversals and, for example, make statements out of them to execute in a DseSession.

In Java, it is no more complicated than:

import com.datastax.dse.graph.api.DseGraph;

GraphTraversalSource g = DseGraph.traversal();
GraphTraversal traversal = g.V().values("name").range(0, 4).groupCount();
GraphStatement statement = DseGraph.statementFromTraversal(traversal);

GraphResultSet results = dseSession.executeGraph(statement);

and with the Python Fluent API:

from dse_graph import DseGraph

g = DseGraph.traversal_source()
traversal = g.V().name[0:4].groupCount()
statement = DseGraph.query_from_traversal(traversal)

results = dse_session.execute_graph(statement)

Note: a DseSession has to be initialized and connected to a DataStax Enterprise cluster for this to work. See the DataStax Drivers documentation for how to create a DseSession.

The DseGraph utility class also provides the option to get a TraversalSource that is directly connected to the remote DSE Graph server, via a DseSession:

import com.datastax.dse.graph.api.DseGraph;

// use an initialized and connected DseSession

GraphTraversalSource g = DseGraph.traversal(dseSession);
List results = g.V().values("name").range(0, 4).groupCount().toList();

and with Python:

from dse_graph import DseGraph

g = DseGraph.traversal_source(session=dse_session)
results = g.V().name[0:4].groupCount().toList()

As in the example above, with a connected TraversalSource, each iteration operation built into Gremlin itself (toList(), next()) will transparently translate the traversal into Bytecode, send it via the DSE Driver to the DSE Graph server, and gather the results back.
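The key behaviour is that building the traversal sends nothing; only the terminal operation triggers a round trip. Here is a toy Python sketch of that pattern. ConnectedTraversal and fake_submit are hypothetical names invented for this post, and the fake server simply returns a canned result instead of talking to DSE Graph.

```python
class ConnectedTraversal:
    """Toy traversal bound to a submit function (hypothetical, not the driver)."""

    def __init__(self, submit, steps=()):
        self._submit = submit
        self._steps = tuple(steps)

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)

        def step(*args):
            # Chaining only records the step; nothing is sent yet.
            return ConnectedTraversal(self._submit, self._steps + ((name, args),))

        return step

    def toList(self):
        # Terminal operation: ship the recorded steps and collect the results.
        return list(self._submit(self._steps))

    def next(self):
        # Terminal operation: return the first result only.
        return self.toList()[0]


sent = []  # everything the fake server has received so far


def fake_submit(bytecode):
    """Stand-in for the driver and server: logs the request, returns canned data."""
    sent.append(bytecode)
    return [{"marko": 2, "vadas": 1, "josh": 1}]


g = ConnectedTraversal(fake_submit)
pending = g.V().values("name").range(0, 4).groupCount()
sent_before = len(sent)      # still 0: building the chain sent nothing
results = pending.toList()   # the request goes out here
print(sent_before, len(sent), results)
```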

Wrap up

We look forward to adding support for even more GLVs in the future, continuing to extend their functionality with the Fluent APIs, and improving the overall process of developing graph applications.

A deep dive into the Gremlin Traversal Machine awaits you to learn more about the internals of Gremlin.

The documentation for the DataStax Java Driver Fluent API is located here, and the Python Fluent API here. Do not forget to check the other great features the DataStax Enterprise Drivers provide, and give us as much feedback as you can via the DataStax Academy Slack #datastax-drivers room!

Edit (07/03/2019): while this post and its principles still hold true, the APIs have changed in the latest DSE Java Driver release. For the new DSE Driver 2.x documentation, check this link. Also, DataStax has now contributed a C# Fluent API and a Node.js Fluent API to the Apache TinkerPop project.
