Technology · August 31, 2013

What’s New in Cassandra 2.0: Prototype Triggers Support

Warning: as of Cassandra 2.0.0, the ITrigger interface and the rest of the triggers implementation are not final and will change in 2.1. Keep this in mind before relying on triggers in production; the API should be considered unstable until at least Cassandra 2.1.

Overview

The new prototype triggers in Cassandra 2.0 rely on logged batches, originally added in Cassandra 1.2, to implement a flexible, atomic, eventually consistent mechanism for reacting to - and augmenting - write operations.

Cassandra triggers have an "instead of" event activation time and partition-level granularity. A coordinator node executes triggers before actually applying the mutations (locally or on the remote nodes), giving you the ability to alter the mutations-to-be, augment them with extra mutations, or execute any arbitrary code, really*. The coordinator takes the original mutations (potentially modified by the trigger), adds the extra mutations created by the trigger, and applies them together as a single logged batch, guaranteeing atomicity and eventual consistency.

It follows that triggers on counter tables are generally not supported (counter mutations are not allowed inside logged batches for obvious reasons - they aren't idempotent).
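The coordinator flow described above can be sketched in a few lines of Java. This is a simplified model, not Cassandra code: SketchMutation, SketchTrigger and CoordinatorSketch are hypothetical stand-ins invented for illustration, and a plain list stands in for the logged batch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a write to one partition; NOT Cassandra's RowMutation.
final class SketchMutation {
    final String table;
    final String partitionKey;
    final String value;

    SketchMutation(String table, String partitionKey, String value) {
        this.table = table;
        this.partitionKey = partitionKey;
        this.value = value;
    }
}

// Hypothetical stand-in for a trigger: it sees a mutation before it is applied
// and may return extra mutations, or null when it has nothing to add.
interface SketchTrigger {
    Collection<SketchMutation> augment(SketchMutation original);
}

final class CoordinatorSketch {
    // Runs the trigger against every pending mutation, then returns the originals
    // plus the trigger's additions as one list (standing in for one logged batch).
    static List<SketchMutation> buildBatch(List<SketchMutation> originals, SketchTrigger trigger) {
        List<SketchMutation> batch = new ArrayList<>(originals);
        for (SketchMutation m : originals) {
            Collection<SketchMutation> extra = trigger.augment(m);
            if (extra != null) {
                batch.addAll(extra);
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        // Assumed example trigger: record every write in a separate audit table.
        SketchTrigger audit = m -> Collections.singletonList(
                new SketchMutation("audit_log", m.partitionKey, "wrote to " + m.table));
        List<SketchMutation> batch = buildBatch(
                Collections.singletonList(new SketchMutation("users", "alice", "email=a@example.com")),
                audit);
        for (SketchMutation m : batch) {
            System.out.println(m.table + "[" + m.partitionKey + "] <- " + m.value);
        }
    }
}
```

The point of the model is the ordering: the trigger runs first, and only the combined list is handed on to be applied, so the original write and the trigger's additions succeed or fail together.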

There are multiple potential use cases for Cassandra triggers:

  • extra input validation - enforcing constraints beyond the data type validation performed by Cassandra
  • replicating or migrating modifications from one table or keyspace to another
  • incrementally updating a materialised view derived from one or more tables
  • logging any mutations that meet particular conditions
  • implementing alerts/notifications
  • performing any other application-specific logic

Credit for the implementation goes to Vijay Parthasarathy.

Implementing a Trigger

The current (as of C* 2.0.0) ITrigger interface itself is extremely simple:

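package org.apache.cassandra.triggers;

import java.nio.ByteBuffer;
import java.util.Collection;

import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.RowMutation;

public interface ITrigger
{
    /**
     * Called exactly once per CF update, returned mutations are atomically updated.
     *
     * @param key - Row Key for the update.
     * @param update - Update received for the CF
     * @return modifications to be applied, null if no action to be performed.
     */
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update);
}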

It does (currently) expose some internal classes that should be explained:

  • RowMutation represents changes to one or more tables, with two constraints: 1) all the tables belong to the same keyspace, and 2) all the changes have the same partition key. These changes are grouped into ColumnFamily objects.
  • ColumnFamily holds the cells to be inserted into and/or removed from a table - one ColumnFamily of changes per table.
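The relationship between the two classes can be pictured with a small model. The sketch below is an assumption-laden simplification: RowMutationSketch and TableChangesSketch are hypothetical names, cells are plain strings, and tables are looked up by name purely for readability. It is not the layout of the real org.apache.cassandra.db classes.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Plays the role of ColumnFamily: the pending changes for ONE table.
// Hypothetical simplification, not the real Cassandra class.
final class TableChangesSketch {
    final Map<String, String> cellsToWrite = new LinkedHashMap<>(); // cell name -> new value
    final Set<String> cellsToDelete = new LinkedHashSet<>();        // cell names to remove
}

// Plays the role of RowMutation: one keyspace, one partition key,
// and one TableChangesSketch per affected table.
final class RowMutationSketch {
    final String keyspace;
    final String partitionKey;
    private final Map<String, TableChangesSketch> changesByTable = new LinkedHashMap<>();

    RowMutationSketch(String keyspace, String partitionKey) {
        this.keyspace = keyspace;
        this.partitionKey = partitionKey;
    }

    // Returns the change set for a table, creating it on first use.
    TableChangesSketch changesFor(String table) {
        return changesByTable.computeIfAbsent(table, t -> new TableChangesSketch());
    }

    Set<String> tables() {
        return changesByTable.keySet();
    }
}

final class MutationModelDemo {
    public static void main(String[] args) {
        // Both tables share the keyspace "demo" and the partition key "alice".
        RowMutationSketch m = new RowMutationSketch("demo", "alice");
        m.changesFor("users").cellsToWrite.put("email", "alice@example.com");
        m.changesFor("users").cellsToDelete.add("phone");
        m.changesFor("user_audit").cellsToWrite.put("last_change", "email");
        System.out.println(m.keyspace + "/" + m.partitionKey + " touches " + m.tables());
    }
}
```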

The ColumnFamily object passed to the augment method is mutable, thus it's technically possible to interfere and alter the original mutation. It's also possible to create additional mutations for any table in any keyspace that will be performed together with the original changes as a single logged batch.

For an example of augmented mutations, see the simplistic inverted index implementation.
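To give a feel for what an inverted index trigger does, here is a self-contained sketch of the idea. It only mirrors the shape of augment(key, update): the update is modelled as a map of cell names to values, and IndexEntrySketch is a hypothetical stand-in for the extra mutation that would be written to an index table. It is not the implementation referred to above and does not use the Cassandra classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one write to an index table; NOT Cassandra's RowMutation.
final class IndexEntrySketch {
    final String partitionKey; // the indexed value becomes the index row's key
    final String column;       // the column that held the value
    final String sourceKey;    // partition key of the original row

    IndexEntrySketch(String partitionKey, String column, String sourceKey) {
        this.partitionKey = partitionKey;
        this.column = column;
        this.sourceKey = sourceKey;
    }

    @Override
    public String toString() {
        return "index[" + partitionKey + "]." + column + " = " + sourceKey;
    }
}

final class InvertedIndexSketch {
    // For every cell in the update, emit one index entry keyed by the cell's
    // value and pointing back to the key of the row being written.
    static Collection<IndexEntrySketch> augment(String key, Map<String, String> update) {
        List<IndexEntrySketch> extra = new ArrayList<>();
        for (Map.Entry<String, String> cell : update.entrySet()) {
            extra.add(new IndexEntrySketch(cell.getValue(), cell.getKey(), key));
        }
        return extra;
    }

    public static void main(String[] args) {
        Map<String, String> update = new LinkedHashMap<>();
        update.put("email", "alice@example.com");
        update.put("city", "Paris");
        for (IndexEntrySketch e : augment("alice", update)) {
            System.out.println(e);
        }
    }
}
```

Because the index entries use a different partition key from the original write, this is a case where the trigger has to return additional mutations rather than modify the one it was given.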

Operations

To create a trigger, you must first build a jar with a class implementing the ITrigger interface and put it into the triggers directory on every node, then perform a CQL3 CREATE TRIGGER request to tie your trigger to a Cassandra table (or several tables).

conf/triggers is the default location for the trigger jars, but it can be redefined by setting the cassandra.triggers_dir system property.
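As a rough illustration of how such an override behaves, the sketch below reads the cassandra.triggers_dir system property (for example set with -Dcassandra.triggers_dir=... on the JVM command line), falls back to conf/triggers, and lists the jars it finds. TriggersDirSketch and its methods are hypothetical names, and resolving the default relative to the working directory is an assumption; Cassandra's own lookup logic may differ.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TriggersDirSketch {
    static final String PROPERTY = "cassandra.triggers_dir";
    static final String DEFAULT_DIR = "conf/triggers";

    // Uses the system property when it is set, otherwise the default location.
    static File resolveTriggersDir() {
        return new File(System.getProperty(PROPERTY, DEFAULT_DIR));
    }

    // Lists the *.jar files directly inside the given directory.
    static List<File> listJars(File dir) {
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        List<File> result = new ArrayList<>();
        // listFiles returns null when the directory is missing or unreadable.
        if (jars != null) {
            result.addAll(Arrays.asList(jars));
        }
        return result;
    }

    public static void main(String[] args) {
        File dir = resolveTriggersDir();
        System.out.println("Looking for trigger jars in " + dir.getPath());
        for (File jar : listJars(dir)) {
            System.out.println("  " + jar.getName());
        }
    }
}
```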

To add the trigger to a table, run:

CREATE TRIGGER <name> ON [<keyspace>.]<table> USING '<class>'

To remove one, use:

DROP TRIGGER <name> ON [<keyspace>.]<table>
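For a concrete picture of the two statements, the sketch below fills in the placeholders from Java. Every name in it is made up for illustration (the trigger user_index, the keyspace demo, the table users and the class com.example.InvertedIndex), and the helper does no quoting or escaping, so it is only suitable for trusted, simple identifiers.

```java
final class TriggerCqlSketch {
    // Builds: CREATE TRIGGER <name> ON [<keyspace>.]<table> USING '<class>'
    static String createTrigger(String name, String keyspace, String table, String className) {
        return "CREATE TRIGGER " + name + " ON " + qualified(keyspace, table)
                + " USING '" + className + "'";
    }

    // Builds: DROP TRIGGER <name> ON [<keyspace>.]<table>
    static String dropTrigger(String name, String keyspace, String table) {
        return "DROP TRIGGER " + name + " ON " + qualified(keyspace, table);
    }

    // The keyspace prefix is optional; pass null to omit it.
    private static String qualified(String keyspace, String table) {
        return keyspace == null ? table : keyspace + "." + table;
    }

    public static void main(String[] args) {
        // Hypothetical names, for illustration only.
        System.out.println(createTrigger("user_index", "demo", "users", "com.example.InvertedIndex"));
        System.out.println(dropTrigger("user_index", "demo", "users"));
    }
}
```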

Future Work

The current implementation is experimental, and there is some work to do before triggers in Cassandra can be declared final and production-ready. CREATE TRIGGER should support parametrisation, so that triggers could be reused between different tables and configured without a need for external configuration files. It would be nice to be able to define triggers in CQL3 in addition to pure Java. And an API that doesn't reveal the internals (RowMutation and ColumnFamily classes) would be preferable to the current one.

That said, please do experiment with the current implementation and share your feedback - it will affect the final trigger design.

* While we do use a separate class loader for trigger classes, we don't sandbox the execution of triggers in any way. Be extra careful with the code that goes into augment - it can negatively affect the whole node.
