Apache Cassandra™ 1.2

Table attributes

The following attributes can be declared per table.
Option Default value
bloom_filter_fp_chance 0.01 or 0.1 (Value depends on the compaction strategy.)
bucket_high 1.5
bucket_low 0.5
caching keys_only
column_metadata N/A (container attribute)
column_type Standard
comment N/A
compaction_strategy SizeTieredCompactionStrategy
compaction_strategy_options N/A (container attribute)
comparator BytesType
compare_subcolumns_with BytesType*
compression_options sstable_compression='SnappyCompressor'
default_validation_class N/A
dclocal_read_repair_chance 0.0
gc_grace 864000 (10 days)
key_validation_class N/A
max_compaction_threshold 32
min_compaction_threshold 4
memtable_flush_after_mins N/A*
memtable_operations_in_millions N/A*
memtable_throughput_in_mb N/A*
min_sstable_size 50MB
name N/A
populate_io_cache_on_flush False
read_repair_chance 0.1 or 1 (See description below.)
replicate_on_write true
sstable_size_in_mb 160MB
tombstone_compaction_interval 86400 seconds [1 day]
tombstone_threshold 0.2

* Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility.

compaction_strategy_options
(Default: N/A - container attribute) Sets attributes related to the chosen compaction-strategy. Attributes are:
bloom_filter_fp_chance
(Default: 0.01 for SizeTieredCompactionStrategy, 0.1 for LeveledCompactionStrategy) Desired false-positive probability for SSTable Bloom filters. When data is requested, the Bloom filter checks whether the requested row exists before doing any disk I/O. Valid values are 0 to 1.0. A setting of 0 means that the unmodified (effectively the largest possible) Bloom filter is enabled. A setting of 1.0 disables the Bloom filter. The higher the setting, the less memory Cassandra uses. The maximum recommended setting is 0.1, as anything above this value yields diminishing returns. For detailed information, see Tuning Bloom filters.
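The memory trade-off described above can be illustrated with the textbook Bloom filter sizing formula, bits per key ≈ −ln(p) / (ln 2)². This is only a sketch of the general relationship; it is not taken from Cassandra's allocation code, and the function name is hypothetical.

```python
import math

def bloom_bits_per_key(fp_chance: float) -> float:
    """Textbook optimal bits per key for a target false-positive chance.

    Assumption: the standard Bloom filter sizing formula, not
    Cassandra's internal sizing logic.
    """
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)

# Compare the two documented defaults.
for p in (0.01, 0.1):
    print(f"fp_chance={p}: about {bloom_bits_per_key(p):.1f} bits per key")
```

Under this formula, moving from 0.01 to 0.1 roughly halves the filter size per key, which is consistent with the statement that higher settings use less memory.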
bucket_high
(Default: 1.5) Size-tiered compaction considers an SSTable to belong to a bucket if its size falls within the range [average-size × bucket_low, average-size × bucket_high], where average-size is the average size of the SSTables in that bucket. With the default bucket_low and bucket_high values, this means the SSTable size diverges by 50% or less from the average.
bucket_low
(Default: 0.5) See bucket_high for a description.
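As a minimal sketch of the range described under bucket_high, the following computes the bucket bounds and checks whether an SSTable size falls inside them. The function names are hypothetical, and the closed interval follows the notation used above; the actual implementation may treat the endpoints differently.

```python
def bucket_bounds(average_size: float,
                  bucket_low: float = 0.5,
                  bucket_high: float = 1.5) -> tuple[float, float]:
    """Return [average_size * bucket_low, average_size * bucket_high]."""
    return average_size * bucket_low, average_size * bucket_high

def in_bucket(sstable_size: float, average_size: float,
              bucket_low: float = 0.5, bucket_high: float = 1.5) -> bool:
    """True if sstable_size lies within the bucket's size range."""
    low, high = bucket_bounds(average_size, bucket_low, bucket_high)
    return low <= sstable_size <= high

# A bucket averaging 100 MB accepts SSTables from 50 MB to 150 MB.
print(bucket_bounds(100))
print(in_bucket(120, 100), in_bucket(160, 100))
```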
caching
(Default: keys_only) Optimizes the use of cache memory without manual tuning. Set caching to one of the following values:
  • all
  • keys_only
  • rows_only
  • none

Cassandra weights the cached data by size and access frequency. Use this parameter to specify a key or row cache instead of a table cache, as in earlier versions.

chunk_length_kb
(Default: 64KB) On disk, SSTables are compressed by block (to allow random reads). This subproperty of compression defines the size (in KB) of the block. Values larger than the default might improve the compression rate, but they increase the minimum amount of data that must be read from disk when a read occurs. The default value (64) is a good middle ground for compressing tables. Adjust the chunk size to account for read/write access patterns (how much data is typically requested at once) and the average size of rows in the table.
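The effect of the chunk size on reads can be sketched with simple arithmetic. The example below assumes, for illustration only, that a row starts on a chunk boundary, so the minimum amount of uncompressed data handled per read is a whole number of chunks. The function name is hypothetical.

```python
import math

def min_decompressed_kb(row_kb: float, chunk_length_kb: int = 64) -> int:
    """Smallest amount of uncompressed data (KB) covered by whole chunks.

    Assumption: the row is aligned to the start of a chunk; an
    unaligned row can span one additional chunk.
    """
    chunks = max(1, math.ceil(row_kb / chunk_length_kb))
    return chunks * chunk_length_kb

# A 1 KB row still costs a full chunk; larger chunks raise that floor.
print(min_decompressed_kb(1))        # 64 KB chunk
print(min_decompressed_kb(1, 256))   # 256 KB chunk
print(min_decompressed_kb(100))      # spans two 64 KB chunks
```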
column_metadata
(Default: N/A - container attribute) Column metadata defines these attributes of a column:
  • name: Binds a validation_class and (optionally) an index to a column.
  • validation_class: Type used to check the column value.
  • index_name: Name of the index.
  • index_type: Type of index. Currently the only supported value is KEYS.

Setting a value for the name option is required. The validation_class is set to the default_validation_class of the table if you do not set the validation_class option explicitly. The value of index_type must be set to create an index for a column. The value of index_name is not valid unless index_type is also set.

Setting and updating column metadata with the Cassandra CLI requires a slightly different command syntax than other attributes; note the brackets and curly braces in this example:

[default@demo] UPDATE COLUMN FAMILY users WITH comparator = UTF8Type
AND column_metadata = [{column_name: full_name, validation_class: UTF8Type, index_type: KEYS}];
column_type
(Default: Standard) The standard type of table contains regular columns.
comment
(Default: N/A) A human readable comment describing the table.
compaction_strategy
(Default: SizeTieredCompactionStrategy) Sets the compaction strategy for the table. The available strategies are:
  • SizeTieredCompactionStrategy: The default compaction strategy and the only compaction strategy available in releases earlier than Cassandra 1.0. This strategy triggers a minor compaction whenever there are a number of similar-sized SSTables on disk (as configured by min_compaction_threshold). Using this strategy causes bursts in I/O activity while a compaction is in progress, followed by longer and longer lulls in compaction activity as SSTable files grow larger. These I/O bursts can negatively affect read-heavy workloads, but typically do not impact write performance. Watching disk capacity is also important when using this strategy, as a compaction can temporarily double the disk space used by the SSTables of a table while it is in progress.
  • LeveledCompactionStrategy: The leveled compaction strategy creates SSTables of a fixed, relatively small size (5 MB by default) that are grouped into levels. Within each level, SSTables are guaranteed to be non-overlapping. Each level (L0, L1, L2 and so on) is 10 times as large as the previous. Disk I/O is more uniform and predictable as SSTables are continuously being compacted into progressively larger levels. At each level, row keys are merged into non-overlapping SSTables. This can improve performance for reads, because Cassandra can determine which SSTables in each level to check for the existence of row key data. This compaction strategy is modeled after Google's leveldb implementation. For more information, see the articles When to Use Leveled Compaction and Leveled Compaction in Apache Cassandra.
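The tenfold growth between levels can be sketched as follows. This example assumes that level N (for N ≥ 1) holds up to 10^N SSTables of the fixed size; that exact capacity rule, the function name, and the fanout parameter are illustrative assumptions rather than details stated above.

```python
def level_capacity_mb(level: int, sstable_size_mb: int = 5,
                      fanout: int = 10) -> int:
    """Approximate capacity of a level, growing tenfold per level.

    Assumption: level N holds fanout**N SSTables of sstable_size_mb each.
    """
    if level < 1:
        raise ValueError("this sketch only models levels L1 and above")
    return fanout ** level * sstable_size_mb

for level in (1, 2, 3):
    print(f"L{level}: {level_capacity_mb(level)} MB")
```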
comparator
(Default: BytesType) Defines the data types used to validate and sort column names. There are several built-in column comparators available. The comparator cannot be changed after you create a table.
compare_subcolumns_with
(Default: BytesType) Required when the column_type attribute is set to Super. Same as comparator but for the sub-columns of a super column. Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility.
compression_options
(Default: N/A - container attribute) Sets the compression algorithm and subproperties for the table. Choices are:
  • sstable_compression
  • chunk_length_kb
  • crc_check_chance
crc_check_chance
(Default: 1.0) When compression is enabled, each compressed block includes a checksum of that block for the purpose of detecting disk bitrot and avoiding the propagation of corruption to other replicas. This option defines the probability with which those checksums are checked during a read. By default they are always checked. Set to 0 to disable checksum checking, or to 0.5, for instance, to check them on about half of all reads.
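A probabilistic check of this kind can be sketched as a single random draw per read. The code below is an illustration of the idea, not Cassandra's implementation, and the function name is hypothetical.

```python
import random

def should_verify_checksum(crc_check_chance: float,
                           rng: random.Random) -> bool:
    """Decide whether to verify a block checksum on this read.

    rng.random() returns a value in [0.0, 1.0), so a chance of 1.0
    always verifies and a chance of 0.0 never does.
    """
    return rng.random() < crc_check_chance

rng = random.Random(42)  # seeded only to make the demo repeatable
checked = sum(should_verify_checksum(0.5, rng) for _ in range(10_000))
print(f"verified {checked} of 10000 reads at crc_check_chance=0.5")
```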
default_validation_class
(Default: N/A) Defines the data type used to validate column values. There are several built-in column validators available.
dclocal_read_repair_chance
(Default: 0.0) Specifies the probability of read repairs being invoked over all replicas in the current data center. Contrast read_repair_chance.
gc_grace
(Default: 864000 [10 days]) Specifies the time to wait before garbage collecting tombstones (deletion markers). The default value allows a great deal of time for consistency to be achieved prior to deletion. In many deployments this interval can be reduced, and in a single-node cluster it can be safely set to zero.
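As a small worked example, the earliest time at which a tombstone becomes eligible for garbage collection is the deletion time plus gc_grace seconds. The function name below is hypothetical.

```python
from datetime import datetime, timedelta

def tombstone_purgeable_at(deleted_at: datetime,
                           gc_grace_seconds: int = 864000) -> datetime:
    """Earliest time a tombstone written at deleted_at can be collected."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)

# With the default of 864000 seconds, the wait is exactly 10 days.
print(tombstone_purgeable_at(datetime(2013, 1, 1)))
```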
key_validation_class
(Default: N/A) Defines the data type used to validate row key values. There are several built-in key validators available; however, CounterColumnType (distributed counters) cannot be used as a row key validator.
max_compaction_threshold
(Default: 32) In SizeTieredCompactionStrategy, sets the maximum number of SSTables processed by a minor compaction.
min_compaction_threshold
(Default: 4) In SizeTieredCompactionStrategy, sets the minimum number of SSTables required to trigger a minor compaction.
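How the two thresholds interact can be sketched as follows: a bucket with fewer SSTables than min_compaction_threshold is skipped, and a larger bucket is capped at max_compaction_threshold SSTables per minor compaction. Which SSTables are chosen when the cap applies is an assumption here (smallest first), and the function name is hypothetical.

```python
def pick_compaction_candidates(bucket_sizes: list[int],
                               min_compaction_threshold: int = 4,
                               max_compaction_threshold: int = 32) -> list[int]:
    """Return the SSTable sizes one minor compaction would process.

    Assumption: when the bucket exceeds the maximum, the smallest
    SSTables are taken first.
    """
    if len(bucket_sizes) < min_compaction_threshold:
        return []  # not enough SSTables to trigger a compaction
    return sorted(bucket_sizes)[:max_compaction_threshold]

print(pick_compaction_candidates([10, 11, 12]))           # too few
print(len(pick_compaction_candidates(list(range(40)))))   # capped
```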
memtable_flush_after_mins
Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility. Use commitlog_total_space_in_mb.
memtable_operations_in_millions
Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility. Use commitlog_total_space_in_mb.
memtable_throughput_in_mb
Deprecated as of Cassandra 1.0, but can still be declared for backward compatibility. Use commitlog_total_space_in_mb.
min_sstable_size
(Default: 50MB) The SizeTieredCompactionStrategy groups SSTables for compaction into buckets. The bucketing process groups SSTables that differ in size by less than 50%. This results in a bucketing process that is too fine grained for small SSTables. If your SSTables are small, use min_sstable_size to define a size threshold (in bytes) below which all SSTables belong to one unique bucket.
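The bucketing behavior can be sketched as below. This is a simplified illustration under stated assumptions, not the actual implementation: SSTables are visited in ascending size order, each joins the first bucket whose average size puts it in range (or the small-SSTable bucket when both the SSTable and the bucket average are below min_sstable_size), and the function name is hypothetical.

```python
MB = 1024 * 1024

def group_into_buckets(sizes, bucket_low=0.5, bucket_high=1.5,
                       min_sstable_size=50 * MB):
    """Group SSTable sizes (in bytes) into size-tiered buckets."""
    buckets = []  # each entry is [average_size, member_sizes]
    for size in sorted(sizes):
        for bucket in buckets:
            avg, members = bucket
            similar = avg * bucket_low <= size <= avg * bucket_high
            both_small = size < min_sstable_size and avg < min_sstable_size
            if similar or both_small:
                members.append(size)
                bucket[0] = sum(members) / len(members)  # update average
                break
        else:
            # No existing bucket fits, so start a new one.
            buckets.append([size, [size]])
    return [members for _, members in buckets]

sizes = [1 * MB, 10 * MB, 40 * MB, 200 * MB, 220 * MB, 1000 * MB]
for members in group_into_buckets(sizes):
    print([s // MB for s in members])
```

With the default threshold, the three SSTables under 50 MB share one bucket; with min_sstable_size set to 0 in this sketch, each of them lands in a bucket of its own.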
populate_io_cache_on_flush
(Default: false) Adds newly flushed or compacted SSTables to the operating system page cache, potentially evicting other cached data to make room. Enable when all data in the table is expected to fit in memory. See also the global option, compaction_preheat_key_cache.
name
(Default: N/A) Required. The user-defined name of the table.
read_repair_chance
(Default: 0.1 or 1) Specifies the probability with which read repairs should be invoked on non-quorum reads. The value must be between 0 and 1. For tables created in versions of Cassandra before 1.0, it defaults to 1. For tables created in versions of Cassandra 1.0 and higher, it defaults to 0.1. However, for Cassandra 1.0, the default is 1.0 if you use CLI or any Thrift client, such as Hector or pycassa, and is 0.1 if you use CQL.
replicate_on_write
(Default: true) Applies only to counter tables. When set to true, replicates writes to all affected replicas regardless of the consistency level specified by the client for a write request. For counter tables, this should always be set to true.
sstable_size_in_mb
(Default: 160MB) The target size for SSTables that use the leveled compaction strategy. Although SSTable sizes should be less than or equal to sstable_size_in_mb, it is possible to have a larger SSTable during compaction. This occurs when data for a given partition key is exceptionally large, because the data for one partition key is not split across two SSTables.
sstable_compression
(Default: SnappyCompressor) The compression algorithm to use. Valid values are LZ4Compressor (available in Cassandra 1.2.2 and later), SnappyCompressor, and DeflateCompressor. Use an empty string ('') to disable compression. Choosing the right compressor depends on your requirements for space savings over read performance. LZ4 is fastest to decompress, followed by Snappy, then by Deflate. Compression effectiveness is inversely correlated with decompression speed. The extra compression from Deflate or Snappy is not enough to make up for the decreased performance for general-purpose workloads, but for archival data they may be worth considering. Developers can also implement custom compression classes using the org.apache.cassandra.io.compress.ICompressor interface. Specify the full class name as a "string constant".
tombstone_compaction_interval
(Default: 86400 seconds [1 day]) The minimum time to wait after an SSTable creation time before considering the SSTable for tombstone compaction. Tombstone compaction is the compaction triggered if the SSTable has more garbage-collectable tombstones than tombstone_threshold.
Note: Cassandra performs extra compactions when the proportion of tombstones in a data file exceeds tombstone_threshold. The data file is compacted by itself, and tombstones that are no longer needed are discarded. However, if data for the tombstone's partition exists in other data files, the tombstone cannot be discarded because it may be needed to indicate that data is deleted. The tombstone_compaction_interval determines how soon Cassandra allows retrying a tombstone compaction for a given data file. Low values may therefore result in repeated ineffective compaction attempts until the tombstone's partition is merged with the other data files by a normal compaction event.
tombstone_threshold
(Default: 0.2) A ratio of garbage-collectable tombstones to all contained columns, which if exceeded by the SSTable triggers compaction (with no other SSTables) for the purpose of purging the tombstones.
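Taken together, tombstone_threshold and tombstone_compaction_interval amount to a two-part eligibility check, sketched below. The function and parameter names other than the two attributes are hypothetical, and the inputs (tombstone counts and SSTable age) are assumed to be known to the caller.

```python
def eligible_for_tombstone_compaction(
        droppable_tombstones: int,
        total_columns: int,
        sstable_age_seconds: float,
        tombstone_threshold: float = 0.2,
        tombstone_compaction_interval: int = 86400) -> bool:
    """True if a single SSTable qualifies for a tombstone compaction."""
    if total_columns == 0:
        return False  # nothing to measure a ratio against
    old_enough = sstable_age_seconds >= tombstone_compaction_interval
    ratio = droppable_tombstones / total_columns
    return old_enough and ratio > tombstone_threshold

# 30% droppable tombstones: eligible only once the SSTable is a day old.
print(eligible_for_tombstone_compaction(30, 100, 90_000))
print(eligible_for_tombstone_compaction(30, 100, 3_600))
```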