Technology | June 20, 2011

Brisk 1.0 Beta 2 Released

DataStax has released Brisk 1.0 Beta 2! You can download Brisk from the DataStax web site.

New Features in Brisk 1.0 Beta 2

The following new features have been added in this release:




Apache Pig Integration. See the DataStax Documentation for more information about using Pig in Brisk.


Job Tracker Failover. See the DataStax Documentation for more information about using the new brisktool movejt command.


New Snappy Compression Codec. A new codec built on Google Snappy is now used internally for automatic CassandraFS block compression.


Automap Cassandra Column Families to Hive Tables in the Brisk Hive Metastore.


A second HDFS layer in CassandraFS for long-term data storage. The blocks column family in CFS requires frequent compaction because Hadoop uses it during MapReduce processing to store small files and temporary data, and compaction cleans up that temporary data once it is no longer needed. CFS now has two endpoints, cfs:/// and cfs-archive:///. The blocks column family in cfs-archive:/// has compaction disabled to improve performance for static data stored in CFS.
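To make the split between the two endpoints concrete, here is a minimal Python sketch of how a job script might choose between them. Only the cfs and cfs-archive scheme names come from this release; the cfs_uri helper, its long_term flag, and the rule of sending static data to the archive layer are illustrative assumptions, not part of Brisk.

```python
import posixpath

# Scheme names as given in the release notes.
CFS_SCHEME = "cfs"                  # compacted layer: small files, temporary data
CFS_ARCHIVE_SCHEME = "cfs-archive"  # compaction disabled: static, long-term data


def cfs_uri(path: str, long_term: bool = False) -> str:
    """Build a CassandraFS URI for `path` (hypothetical helper).

    long_term=True targets cfs-archive:///, otherwise cfs:///.
    The selection policy is an assumption made for illustration.
    """
    scheme = CFS_ARCHIVE_SCHEME if long_term else CFS_SCHEME
    # Normalize to a single leading slash so the result has the scheme:///path form.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    return f"{scheme}://{normalized}"


if __name__ == "__main__":
    print(cfs_uri("tmp/job-42/part-0"))               # temporary MapReduce output
    print(cfs_uri("/logs/2011/06", long_term=True))   # static data kept long term
```

Assuming the Hadoop command-line tools in a Brisk installation accept these URIs like any other file system URI, the resulting strings could then be passed to commands such as hadoop fs -ls.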

Major Fixes in Brisk 1.0 Beta 2

Brisk 1.0 Beta 2 also includes the following major fixes. For details on all fixes in Beta 2, see the Brisk Jira Project Web site:




Remove multiple slf4j warnings


Use batchMutate instead of insert in HiveCassandraOutputFormat


Cassandra super columns not mapping in Hive


Improve performance of hadoop fs -ls


Compaction issue causing secondary index corruption

Open Issues

For a description of the open issues in Brisk, see the Brisk Jira Project Web site.

About Brisk

Brisk is an open-source Hadoop and Hive distribution developed by DataStax that utilizes Apache Cassandra for its core services and storage. Brisk provides Hadoop MapReduce capabilities using CassandraFS, an HDFS-compatible storage layer inside Cassandra. By replacing HDFS with CassandraFS, users are able to leverage their current MapReduce jobs on Cassandra’s peer-to-peer, fault-tolerant, and scalable architecture. Brisk is also able to support dual workloads, allowing you to use the same cluster of machines for both real-time applications and data analytics without having to move the data around between systems.

Brisk is available under the Apache License v2.0 and contains the following components:
