How to Write a Dtest

What are Dtests?

Apache Cassandra’s functional test suite, cassandra-dtest (short for “distributed tests”), is an open-source Python project on GitHub where much of the Apache Cassandra test automation effort takes place. Unlike Cassandra’s unit tests, the dtests are end-to-end, black-box tests that run against Cassandra (C*) clusters via CCM. The Cassandra Cluster Manager, or CCM, is a Python library that runs local C* clusters by hosting multiple JVMs on the same machine. Each test takes anywhere from thirty seconds to several minutes to run. Many are general-purpose functional tests, while others are regression tests for specific tickets from the Apache Cassandra JIRA.

Where are Dtests used?

Continuous integration for dtests runs on a publicly accessible Jenkins server at cassci.datastax.com. As patches are written, contributors can use CassCI to run the C* unit tests and the dtest suite against their new code, as discussed here.

Writing a Dtest

Adding a new dtest is quite simple. You’ll want to choose the appropriate module and/or test suite for your new test, or add one if necessary. Add a new test method to the file you’ve chosen; make sure that “test” is in the method name, or nosetests won’t pick it up. Now is a good time to add your test’s docstring. The docstring should include a description of what your test is trying to verify and how, as well as some doxygen markup. See dtest’s contributing.md for more on the appropriate doxygen annotations to use.
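As a rough illustration of the naming rule and the docstring, here is a minimal sketch that runs without a cluster. It uses the standard library's unittest loader as a stand-in for the test runner; that loader only collects methods whose names start with "test", which may be stricter than the runner dtest uses, so starting the name with test_ is the safe choice. The class name, the method names, and the two annotations in the docstring are illustrative assumptions; contributing.md is the authority on which annotations to use.

```python
import unittest


class TestExample(unittest.TestCase):
    """Hypothetical test class used only to illustrate naming and docstrings."""

    def test_simple_insert(self):
        """
        Verify that a single inserted row can be read back unchanged.

        @jira_ticket CASSANDRA-0000 (placeholder ticket number)
        @expected_result The row read back matches the row written.
        """
        self.assertEqual(1 + 1, 2)  # placeholder body

    def simple_insert_helper(self):
        # The name does not start with "test", so the loader skips it.
        return None


# Ask the loader which methods it would run for this class.
collected = unittest.TestLoader().getTestCaseNames(TestExample)
print(collected)
```

Only test_simple_insert is collected; the helper is ignored because of its name.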

Now that the boilerplate is taken care of, you’re ready to begin writing your test. The first step is to launch a C* cluster, like so:

cluster = self.cluster

cluster.populate(3).start(wait_for_binary_proto=True)

You can modify the number of nodes in the cluster, the number of datacenters, or any of the cassandra.yaml options.

cluster = self.cluster

cluster.set_configuration_options(values={'hinted_handoff_enabled': False}) # Set a cassandra.yaml option

cluster.populate([2, 2]).start(wait_for_binary_proto=True) # A four node cluster. Two nodes in each of two datacenters

Remember that this is using CCM, so all of these processes are running on your laptop. Thus, it’s best not to launch more than five nodes. Most tests run against three nodes.
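To make the node-count guideline concrete, the sketch below counts how many nodes a given populate() argument would start. It relies only on what the examples above show: an integer means that many nodes in one datacenter, and a list gives one node count per datacenter. The helper total_nodes and the constant MAX_LOCAL_NODES are hypothetical names made up for this illustration; neither is part of CCM or dtest.

```python
MAX_LOCAL_NODES = 5  # guideline from the text, not a limit enforced by CCM


def total_nodes(topology):
    """Return the node count for a populate()-style argument.

    topology is either an int (a single datacenter) or a list holding one
    node count per datacenter, for example [2, 2].
    """
    if isinstance(topology, int):
        return topology
    return sum(topology)


for topology in (3, [2, 2], [3, 3]):
    count = total_nodes(topology)
    verdict = "ok" if count <= MAX_LOCAL_NODES else "probably too many for one machine"
    print(topology, "->", count, "nodes:", verdict)
```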

To create an object representing a connection to your C* cluster, you’ll want to use one of the following methods from dtest.py:

def cql_connection(self, node)

def exclusive_cql_connection(self, node)

def patient_cql_connection(self, node)

def patient_exclusive_cql_connection(self, node)

Use patient_cql_connection, unless you have a specific need for one of the others.

cluster = self.cluster

cluster.populate(3).start(wait_for_binary_proto=True)

node1, node2, node3 = cluster.nodelist()

session = self.patient_cql_connection(node1)
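The text does not spell out what makes a connection "patient". One plausible reading, treated here as an assumption, is that the patient variants keep retrying while the node is still coming up instead of failing on the first refused attempt. The sketch below shows that retry idea in isolation. The names retry_until_success and flaky_connect are hypothetical, and this is not dtest's implementation.

```python
import time


def retry_until_success(connect, timeout=10.0, interval=0.1, retry_on=(ConnectionError,)):
    """Call connect() until it succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return connect()
        except retry_on:
            # Give up once the deadline has passed; otherwise wait and retry.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


# Stand-in for a node that only accepts connections on the third attempt.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("node not ready yet")
    return "session"


session = retry_until_success(flaky_connect, timeout=5.0, interval=0.01)
print(session, "after", attempts["count"], "attempts")
```

A test that connects right after starting a cluster benefits from this kind of patience, which fits the advice to default to patient_cql_connection.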

From here on, you write the actual testing logic. You can use the Python driver to interact with C*, mostly via CQL, or the ccmlib API to run cassandra-stress, nodetool, or any other tool that ships in the C* source.

session.execute("CREATE KEYSPACE ks WITH replication = { 'class':'SimpleStrategy', 'replication_factor':1} AND DURABLE_WRITES = true")

session.execute("USE ks")

session.execute("CREATE TABLE t (id int PRIMARY KEY, v int)")

session.execute("INSERT INTO t (id, v) VALUES (1, 2)")

rows = session.execute("SELECT * FROM t")

node1, node2, node3 = cluster.nodelist()

node1.stress(['write', 'n=1M', '-rate', 'threads=10'])

node2.decommission()

node3.repair()

You can use assertions.py and Python unittest’s built-in assertions to assert C*’s correctness.

from assertions import assert_one

session.execute("CREATE KEYSPACE ks WITH replication = { 'class':'SimpleStrategy', 'replication_factor':1} AND DURABLE_WRITES = true")

session.execute("USE ks")

session.execute("CREATE TABLE t (id int PRIMARY KEY, v int)")

session.execute("INSERT INTO t (id, v) VALUES (1, 2)")

assert_one(session, "SELECT * FROM t", [1, 2])

rows = list(session.execute("SELECT * FROM t"))

self.assertEqual(rows[0][0], 1)

self.assertEqual(rows[0][1], 2)
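To show what a helper like assert_one checks, here is a simplified stand-in that runs without a cluster. It assumes the helper's contract is "the query returns exactly one row, and that row equals the expected list", which is what the call above suggests. FakeSession and assert_one_sketch are hypothetical names for this illustration; the real helper lives in assertions.py and may differ in its details.

```python
class FakeSession:
    """Hypothetical stand-in for a driver session; returns canned rows."""

    def __init__(self, rows):
        self._rows = rows

    def execute(self, query):
        # A real session would send the query to the cluster.
        return list(self._rows)


def assert_one_sketch(session, query, expected):
    """Fail unless the query returns exactly one row equal to `expected`."""
    rows = [list(row) for row in session.execute(query)]
    if rows != [expected]:
        raise AssertionError(
            "Expected {} from {}, but got {}".format([expected], query, rows)
        )


# Passes: the fake table holds exactly the row (1, 2).
assert_one_sketch(FakeSession([(1, 2)]), "SELECT * FROM t", [1, 2])

# Fails: two rows came back, so the message reports what was returned.
try:
    assert_one_sketch(FakeSession([(1, 2), (3, 4)]), "SELECT * FROM t", [1, 2])
except AssertionError as exc:
    failure_message = str(exc)
    print(failure_message)
```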

Make sure you only use these, and not the Python assert keyword, as they offer significantly improved debug output on failures.

rows = list(session.execute("SELECT * FROM t"))

assert rows[0][0] == 1 # Do not do this.

assert rows[0][1] == 2
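The difference in debug output is easy to see without a cluster. The sketch below compares the failure message from unittest's assertEqual with the one from a bare assert, using a hard-coded list as a stand-in for a row read back from C*. The bare assert normally carries no message at all, although some test runners rewrite assert statements and add one.

```python
import unittest

case = unittest.TestCase()
row = [1, 3]  # stand-in for a row read back from the cluster

unittest_message = None
bare_message = None

try:
    case.assertEqual(row, [1, 2])
except AssertionError as exc:
    unittest_message = str(exc)

try:
    assert row == [1, 2]
except AssertionError as exc:
    bare_message = str(exc)

print("assertEqual message:", unittest_message)
print("bare assert message:", repr(bare_message))
```

The assertEqual message names both lists and the first element that differs, which is usually enough to diagnose a failure from the CI log alone.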

There’s no need to check for errors in C* logs, as that is automatically handled for you by dtest’s teardown.

Once you have finished with your test, make sure your new code is compliant with PEP8. See contributing.md for how to do so, along with further style guidelines. Now just open a pull request against the riptano/cassandra-dtest repository, and we’ll be happy to review and merge it.
