What’s New in Cassandra 2.0: Prototype Triggers Support
Warning: as of Cassandra 2.0.0, the ITrigger interface and the rest of the triggers implementation are not final and will change in 2.1. Please bear this in mind before using triggers in production, at least until Cassandra 2.1.
Overview
New Cassandra 2.0 prototype triggers rely on logged batches, originally added in Cassandra 1.2, to implement a flexible, atomic, eventually consistent mechanism for reacting to - and augmenting - write operations.
In SQL terms, Cassandra triggers have INSTEAD OF activation time and partition-level granularity. A coordinator node executes triggers before actually applying the mutations (locally or on the remote nodes), giving you the ability to alter the mutations-to-be, augment them with extra mutations, or execute any arbitrary code, really *. The coordinator takes the original mutations (potentially modified by the trigger), adds the extra mutations created by the trigger, and applies them together as a single logged batch, guaranteeing atomicity and eventual consistency.
It follows that triggers on counter tables are generally not supported (counter mutations are not allowed inside logged batches for obvious reasons - they aren't idempotent).
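The flow described above can be pictured with a small, self-contained model. This is only a sketch: `Mutation`, `Trigger`, and `buildLoggedBatch` are hypothetical stand-ins invented for illustration, not Cassandra classes, and the real coordinator does considerably more work.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Illustrative model only; none of these types are Cassandra classes. */
final class TriggerFlowSketch {

    /** Hypothetical stand-in for the changes to one partition of one table. */
    static final class Mutation {
        final String table;
        final String partitionKey;
        final boolean counter;

        Mutation(String table, String partitionKey, boolean counter) {
            this.table = table;
            this.partitionKey = partitionKey;
            this.counter = counter;
        }
    }

    /** Hypothetical stand-in for a trigger: returns extra mutations, or null for none. */
    interface Trigger {
        Collection<Mutation> augment(Mutation original);
    }

    /** Builds the contents of the single batch: originals first, then trigger output. */
    static List<Mutation> buildLoggedBatch(List<Mutation> originals, Trigger trigger) {
        List<Mutation> batch = new ArrayList<>();
        List<Mutation> extras = new ArrayList<>();
        for (Mutation original : originals) {
            batch.add(requireNonCounter(original));
            // The trigger runs here, before any mutation has been applied.
            Collection<Mutation> produced = trigger.augment(original);
            if (produced != null) {
                for (Mutation extra : produced) {
                    extras.add(requireNonCounter(extra));
                }
            }
        }
        batch.addAll(extras);
        return batch;
    }

    /** Counter updates are not idempotent, so the model refuses to batch them. */
    private static Mutation requireNonCounter(Mutation m) {
        if (m.counter) {
            throw new IllegalArgumentException(
                    "counter mutation on " + m.table + " cannot go into a logged batch");
        }
        return m;
    }

    public static void main(String[] args) {
        // A trigger that adds one audit mutation per original mutation.
        Trigger audit = original -> Collections.singletonList(
                new Mutation("users_audit", original.partitionKey, false));
        List<Mutation> batch = buildLoggedBatch(
                Collections.singletonList(new Mutation("users", "alice", false)), audit);
        for (Mutation m : batch) {
            System.out.println(m.table + " / " + m.partitionKey);
        }
    }
}
```

The point of the model is that the original and the trigger-generated mutations end up in one collection that is applied as a unit, which is why a counter mutation anywhere in it is rejected.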
There are multiple potential use cases for Cassandra triggers:
- extra input validation - enforcing constraints beyond the data type validation performed by Cassandra
- replicating or migrating modifications from one table or keyspace to another
- incrementally updating a materialised view derived from one or more tables
- logging any mutations that meet particular conditions
- implementing alerts/notifications
- performing any other application-specific logic
Credit for the implementation goes to Vijay Parthasarathy.
Implementing a Trigger
The current (as of C* 2.0.0) ITrigger interface itself is extremely simple:
    public interface ITrigger
    {
        /**
         * Called exactly once per CF update, returned mutations are atomically updated.
         *
         * @param key - Row Key for the update.
         * @param update - Update received for the CF
         * @return modifications to be applied, null if no action to be performed.
         */
        public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update);
    }
It does (currently) expose some internal classes that should be explained:
- RowMutation represents changes to one or more tables such that 1) all the tables belong to the same keyspace, and 2) all the changes have the same partition key. These changes are grouped into ColumnFamily objects (source).
- ColumnFamily here contains the cells to be inserted into and/or removed from their respective tables - one ColumnFamily of changes per table (source).
The ColumnFamily object passed to the augment method is mutable, so it's technically possible to interfere with and alter the original mutation. It's also possible to create additional mutations for any table in any keyspace; these will be performed together with the original changes as a single logged batch.
See the simplistic inverted index implementation for the augmented mutations example.
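The linked example is not reproduced here. As a rough picture of the idea behind an inverted index trigger, the sketch below mirrors the shape of augment with simplified stand-ins: the row key is a String, the update is a map of column name to value, and `IndexMutation` and `INDEX_TABLE` are hypothetical names invented for illustration. The real interface works with ByteBuffer, ColumnFamily, and RowMutation.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; not the real ITrigger API. */
final class InvertedIndexSketch {

    /** Hypothetical name of the table that holds the index. */
    static final String INDEX_TABLE = "users_by_value";

    /** Hypothetical stand-in for one extra mutation returned by a trigger. */
    static final class IndexMutation {
        final String table;
        final String partitionKey;
        final String column;
        final String value;

        IndexMutation(String table, String partitionKey, String column, String value) {
            this.table = table;
            this.partitionKey = partitionKey;
            this.column = column;
            this.value = value;
        }
    }

    /**
     * Same shape as augment: takes the row key and the update, returns extra mutations.
     * The update is simplified to a map of column name to written value.
     */
    static Collection<IndexMutation> augment(String key, Map<String, String> update) {
        List<IndexMutation> extras = new ArrayList<>();
        for (Map.Entry<String, String> cell : update.entrySet()) {
            // The index row is keyed by the written value and points back at the row key.
            extras.add(new IndexMutation(INDEX_TABLE, cell.getValue(), cell.getKey(), key));
        }
        return extras;
    }

    public static void main(String[] args) {
        Map<String, String> update = new LinkedHashMap<>();
        update.put("email", "alice@example.com");
        for (IndexMutation m : augment("alice", update)) {
            System.out.println(m.table + "[" + m.partitionKey + "]." + m.column + " = " + m.value);
        }
    }
}
```

Each written cell yields one extra mutation in the index table, so a later lookup by value can find the rows that contain it.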
Operations
To create a trigger, you must first build a jar with a class implementing the ITrigger interface and put it into the triggers directory on every node, then perform a CQL3 CREATE TRIGGER request to tie your trigger to a Cassandra table (or several tables).
conf/triggers is the default location for the trigger jars, but it can be redefined by setting the cassandra.triggers_dir system property.
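As a sketch of how such a lookup can work, the class below reads the cassandra.triggers_dir system property and falls back to conf/triggers. It is not Cassandra's own code: `TriggersDirSketch`, `resolve`, and `listJars` are hypothetical names, and resolving the default relative to the working directory is an assumption made to keep the example self-contained.

```java
import java.io.File;

/** Illustrative only; not Cassandra's own directory lookup. */
final class TriggersDirSketch {

    static final String PROPERTY = "cassandra.triggers_dir";
    static final String DEFAULT_DIR = "conf/triggers";

    /** Returns the directory named by the system property, or the default if unset. */
    static File resolve() {
        return new File(System.getProperty(PROPERTY, DEFAULT_DIR));
    }

    /** Lists the jar files in the directory; empty if it is missing or unreadable. */
    static File[] listJars(File dir) {
        File[] jars = dir.listFiles((d, name) -> name.endsWith(".jar"));
        return jars == null ? new File[0] : jars;
    }

    public static void main(String[] args) {
        File dir = resolve();
        System.out.println("Trigger directory: " + dir.getPath());
        for (File jar : listJars(dir)) {
            System.out.println("  " + jar.getName());
        }
    }
}
```

A JVM system property like this one is normally set with a -D flag on the java command line, for example -Dcassandra.triggers_dir=/some/path.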
To add the trigger to a table, run:
    CREATE TRIGGER <name> ON [<keyspace>.]<table> USING '<class>'
To remove one, use:
    DROP TRIGGER <name> ON [<keyspace>.]<table>
Future Work
The current implementation is experimental, and there is some work to do before triggers in Cassandra can be declared final and production-ready. CREATE TRIGGER should support parametrisation, so that triggers could be reused between different tables and configured without a need for external configuration files. It would be nice to be able to define triggers in CQL3 in addition to pure Java. And an API that doesn't reveal the internals (RowMutation and ColumnFamily classes) would be preferable to the current one.
That said, please do experiment with the current implementation and share your feedback - it will affect the final trigger design.
* While we do use a separate class loader for trigger classes, we don't sandbox the execution of triggers in any way. Be extra careful with the code that goes in augment - it can negatively affect the whole node.