Company · December 16, 2018

New Major Versions of the DataStax Node.js Drivers

Jorge Bay Gondra

We've just released version 4.0 of the DataStax Node.js Driver for Apache Cassandra and version 2.0 of the DataStax Enterprise Node.js Driver.

Let's have a look at some of the noteworthy features and changes in these releases.

Request Logging and Tracking

The driver now exposes a RequestLogger that allows tracking requests which are considered slow and/or large based on your defined thresholds of size and time.

This feature is enabled by providing an instance of RequestLogger when creating the Client:

const cassandra = require('cassandra-driver');
const { Client } = cassandra;
const requestTracker = new cassandra.tracker.RequestLogger({ slowThreshold: 1000 });
const client = new Client({ contactPoints, localDataCenter, requestTracker });

You can subscribe to 'slow', 'large', 'normal' and 'failure' events using the emitter object instance:

requestTracker.emitter.on('slow', message => console.log(message));

An example message would be:

[10.1.1.1:9042] Slow request, took 1305 ms (request size 35 bytes / response size 1 KB): SELECT col1, col2 FROM table1 WHERE id = ? [1]

Additionally, you can provide your own tracker by implementing the RequestTracker interface. Check out the documentation for more information.
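
As a rough sketch of what a custom tracker could look like, the class below counts outcomes per host and remembers the slowest latency seen. It assumes the RequestTracker interface consists of onSuccess, onError and shutdown methods with the parameter order shown, and that latency arrives as a [seconds, nanoseconds] pair like the one returned by process.hrtime(); both assumptions should be checked against the driver documentation. The class name CountingTracker and the stand-in host object are hypothetical.

```javascript
// Hypothetical tracker: counts outcomes per host and keeps the slowest latency seen.
// Assumption: the driver calls onSuccess/onError with the parameter order below
// and passes latency as [seconds, nanoseconds], like process.hrtime().
class CountingTracker {
  constructor() {
    this.counts = new Map(); // host address -> { ok, failed }
    this.slowestMs = 0;
  }

  _record(host, latency, field) {
    const entry = this.counts.get(host.address) || { ok: 0, failed: 0 };
    entry[field]++;
    this.counts.set(host.address, entry);
    const ms = latency[0] * 1000 + latency[1] / 1e6;
    this.slowestMs = Math.max(this.slowestMs, ms);
  }

  onSuccess(host, query, parameters, executionOptions, requestLength, responseLength, latency) {
    this._record(host, latency, 'ok');
  }

  onError(host, query, parameters, executionOptions, requestLength, err, latency) {
    this._record(host, latency, 'failed');
  }

  shutdown() {
    this.counts.clear();
  }
}

// Simulated calls with stand-in values; in an application the driver would make them.
const tracker = new CountingTracker();
const fakeHost = { address: '10.1.1.1:9042' };
tracker.onSuccess(fakeHost, 'SELECT ...', [1], {}, 35, 1024, [1, 305000000]);
tracker.onError(fakeHost, 'SELECT ...', [1], {}, 35, new Error('timeout'), [0, 5000000]);
console.log(tracker.counts.get('10.1.1.1:9042'), tracker.slowestMs);
```

An instance of such a class would be passed as the requestTracker option when creating the Client, in place of the RequestLogger.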

Object Mapper

We introduced an Object Mapper in the driver package. It lets you interact with your data as you would with a set of documents. We've dedicated a separate blog post to the Mapper that goes over the features of this new driver component.

New Default Load-Balancing Policy for DataStax Enterprise

The DSE driver used a dedicated load-balancing policy that behaved very much like the TokenAwarePolicy, distributing the load between replicas in a random fashion, with additional logic to route graph queries.

A randomized scheme has proven to balance the load almost equally across the server nodes of a distributed system, without requiring any additional communication from components.

In the case of DataStax Enterprise, not all queries require the same effort from the server coordinator, and a coordinator might be undergoing a task that consumes more resources. As a result, we can expect incoming requests to take different times to complete and server-side request queues to be of different sizes.

We looked for ways to better distribute queries from the client side to minimize the completion time of operations. After experimenting with different algorithms under different workloads and scenarios, we found that selecting the coordinator from two random replicas based on an internal client-level signal, as defined in the paper The Power of Two Choices in Randomized Load Balancing, effectively improved overall latency behaviour and reduced the long latency tail.

As a result, the driver now selects the replica with fewer in-flight requests from two random replicas. Additionally, the load-balancing policy detects unresponsive replicas and de-prioritizes them in the query plan.
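
The selection step can be illustrated with a minimal sketch of the "power of two choices" idea. This is not the driver's actual implementation: the function name pickCoordinator, the node names and the in-flight map are hypothetical, and the detection of unresponsive replicas is left out.

```javascript
// Minimal sketch of "power of two choices": sample two distinct replicas at random
// and keep the one with fewer in-flight requests. Not the driver's actual code.
function pickCoordinator(replicas, inFlight, random = Math.random) {
  if (replicas.length === 0) return undefined;
  if (replicas.length === 1) return replicas[0];
  const i = Math.floor(random() * replicas.length);
  // Draw the second index from the remaining positions so it differs from the first.
  let j = Math.floor(random() * (replicas.length - 1));
  if (j >= i) j++;
  const a = replicas[i];
  const b = replicas[j];
  return (inFlight.get(a) || 0) <= (inFlight.get(b) || 0) ? a : b;
}

// Hypothetical in-flight counters for three replicas.
const inFlight = new Map([['node1', 12], ['node2', 3], ['node3', 7]]);
const chosen = pickCoordinator(['node1', 'node2', 'node3'], inFlight);
console.log(chosen);
```

Because two distinct replicas are always compared, the busiest replica in this example (node1) is never chosen, while the choice between the other two still depends on the random draw.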

Support for the JavaScript BigInt Primitive Type

The Node.js runtime added support for arbitrary-precision integers using the new ECMAScript BigInt type in version 10. On the driver side, you can now use the JavaScript BigInt type to represent the CQL bigint (64-bit signed long) and/or varint (arbitrary-precision integer) types.

To enable it, you must specify it in the ClientOptions:

const client = new Client({
   contactPoints,
   localDataCenter,
   encoding: {
      useBigIntAsLong: true,
      useBigIntAsVarint: true
   }
});
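
To see why a dedicated integer type matters here, the short example below compares the largest CQL bigint value as a BigInt and as a regular Number. It uses only standard JavaScript and does not touch the driver; a Number can represent integers exactly only up to 2^53 - 1, so a 64-bit value gets rounded.

```javascript
// The largest CQL bigint value, 2^63 - 1, as a BigInt and as a Number.
const maxLong = 2n ** 63n - 1n;
const asNumber = Number(maxLong);

console.log(maxLong.toString());             // 9223372036854775807
console.log(Number.isSafeInteger(asNumber)); // false
console.log(BigInt(asNumber) === maxLong);   // false, the Number was rounded
```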

Metrics API

We exposed several internal driver metrics in the form of counters in two different ways: 1) a default implementation that leverages the Node.js events API to expose counter increments, which you can push to your existing application metrics toolkit; and 2) a ClientMetrics interface that metrics libraries, service providers and the community can use to implement support for existing toolkits such as metrics, datadog, prometheus and measured.

To use the event-based implementation, you can subscribe to DefaultMetrics 'increment' events, for example:

client.metrics.responses.success.on('increment', () => driverResponsesCounter.inc());

client.metrics.errors.clientTimeout.on('increment', () => driverClientTimeoutCounter.inc());

client.metrics.speculativeExecutions.on('increment', () => driverSpecExecsCounter.inc());

You can check out the available metrics on the API docs.
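
If you subscribe to many counters, the wiring can be factored into a small helper. The sketch below is self-contained and uses plain Node.js EventEmitter instances as stand-ins for the driver's metric emitters; the Counter class and the bindCounters helper are hypothetical, and a real metrics toolkit would supply its own counter type.

```javascript
const { EventEmitter } = require('events');

// Hypothetical minimal counter; a metrics toolkit would supply its own.
class Counter {
  constructor() { this.value = 0; }
  inc() { this.value++; }
}

// Subscribes one counter per named emitter and returns the counters by name.
function bindCounters(emitters) {
  const counters = {};
  for (const [name, emitter] of Object.entries(emitters)) {
    const counter = new Counter();
    emitter.on('increment', () => counter.inc());
    counters[name] = counter;
  }
  return counters;
}

// Stand-ins for client.metrics.responses.success and client.metrics.errors.clientTimeout.
const success = new EventEmitter();
const clientTimeout = new EventEmitter();
const counters = bindCounters({ success, clientTimeout });

success.emit('increment');
success.emit('increment');
clientTimeout.emit('increment');
console.log(counters.success.value, counters.clientTimeout.value);
```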

Local Data Center Name Is Now a Required Setting

Previously, when a local data center (DC) was not provided to a DC-aware load-balancing policy, the driver inferred the local data center from the provided contact points. If the user provided contact points from multiple DCs, the driver defined one of them as local, depending on which contact point was attempted first. This behaviour could lead to unexpected connections and traffic to a remote data center.

In this new version of the driver, we took the opportunity to make the local data center an explicit setting. The driver no longer attempts to infer it when it is not defined and throws an error instead.

To specify the local data center, you set it at ClientOptions level alongside your contact points:

const client = new Client({ contactPoints, localDataCenter: 'datacenter1' })

If you are using a single-DC setup for a testing or staging environment, the default DC name is 'datacenter1' for Apache Cassandra deployments and 'Cassandra' for default single-DC DSE installs.

If you already specify the local DC at the load-balancing policy level, it will continue to work the same way.

Retry Policy and Query Idempotency

The RetryPolicy is no longer engaged when a query fails with a WriteTimeoutException or a request error and the query is not idempotent.

To allow retries when such a timeout or error is encountered, you must mark the query as idempotent. You can define it at the QueryOptions level when calling the execution methods:

client.execute(query, params, { prepare: true, isIdempotent: true })

Additionally, you can define the default idempotence for all executions when creating the Client instance:

const client = new Client({ contactPoints, localDataCenter, queryOptions: {
   isIdempotent: true
}})
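
The rule described in this section can be summarised as a small decision function. This is only a model of the behaviour, not driver code: the function name, the error kind strings and the treatment of other error kinds are assumptions made for illustration.

```javascript
// Hypothetical model of the rule above; names are illustrative, not driver API.
// Write timeouts and request errors reach the retry policy only for idempotent queries.
const IDEMPOTENCE_GATED = new Set(['writeTimeout', 'requestError']);

function retryPolicyEngaged(errorKind, isIdempotent) {
  if (IDEMPOTENCE_GATED.has(errorKind)) {
    return isIdempotent === true;
  }
  // Assumption: other error kinds (for example read timeouts) are unaffected by this change.
  return true;
}

console.log(retryPolicyEngaged('writeTimeout', false)); // false
console.log(retryPolicyEngaged('writeTimeout', true));  // true
console.log(retryPolicyEngaged('requestError', false)); // false
```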

Upgrade Information and Conclusion

Releasing a major version allowed us to improve the API and remove misfeatures. As a result we made some breaking changes.

You can visit the upgrade guide for the Apache Cassandra Driver and the upgrade guide for the DSE Driver for the full list of breaking changes. As a rule of thumb, if you are not using custom policies you only have to consider the two changes mentioned above:

  • Specifying the local data center name is now required.
  • You should mark your idempotent queries as such, if you want to enable query retrying.

We hope you enjoy the new features in this new major version of the driver. Thanks again to all who contributed code to the driver, wrote documentation, made feature requests and reported bugs. We encourage you to stay involved.
