DataStax Enterprise 4.5

Release notes

DataStax Enterprise 4.5.1 release notes

DataStax Enterprise 4.5.1 updates three components:
  • Apache Solr 4.6.0.2.4 to 4.6.0.2.5
  • Shark 0.9.1.1 to 0.9.1.2
  • Cassandra Java Driver 2.0.2 to 2.0.2.1

Enhancements and changes

  • Post Solr query parameters are now properly stored in the audit logs.
  • This release adds the update metrics mbean, which can be useful to guide tuning of all factors affecting indexing performance, such as back pressure, indexing threads, RAM buffer size and merge factor.
  • The Weather Sensor demo for running analytical queries with Hadoop and Spark is now easier to run. You no longer need to change path variables in tarball installations.
  • The Weather Sensor demo readme file has been improved.
  • The Shark component, updated to 0.9.1.2, works with internal authentication.
  • The showSchema method, which has been added to Spark, provides information about all user keyspaces, a particular keyspace, or a table.
  • Improved default memory settings for Spark.
  • When running the Portfolio Manager demo, messages about keyspace sensitivity no longer appear.

Resolved issues

  • Resolved the issue causing DataStax Enterprise to hang during shutdown, waiting for gossip to start. (DSP-3518)
  • Fixed the out-of-memory error on huge clusters caused by Cassandra File System (CFS) memory consumption. CFS memory consumption has been reduced significantly, by a factor of approximately 500 for some use cases. (DSP-3615)
  • Fixed an issue when enabling clustering in Solr that caused DataStax Enterprise to complain about the org.tartarus.snowball.ext.DanishStemmer class not being found. (DSP-3645)
  • Fixed the race condition between shutdown of client and server channels that caused a harmless Netty exception to appear when you shut down. (DSP-3651)
  • Fixed the problem preventing the HSHA rpc server from functioning. (DSP-3675)
  • RPM and Deb package installations now properly find the shark-env.sh file. (DSP-3696)

DataStax Enterprise 4.5 release notes

DataStax Enterprise 4.5 updates components and includes major enhancements, improvements, patches, and bug fixes.

Components

  • Apache Cassandra 2.0.8.39
  • Apache Hadoop 1.0.4.13
  • Apache Hive 0.12.0.3
  • Apache Pig 0.10.1
  • Apache Solr 4.6.0.2.4
  • Apache log4j 1.2.16
  • Apache Sqoop 1.4.4.14.1
  • Apache Mahout 0.8
  • Apache Tomcat 6.0.39
  • Apache Thrift 0.7.0
  • Apache Commons
  • Spark 0.9.1
  • Shark 0.9.1.1
  • JBCrypt 0.3m
  • SLF4J 1.7.2
  • Guava 15.0
  • JournalIO 1.4.2
  • Netty 4.0.13.Final
  • Faster XML 3.1.3
  • HdrHistogram 1.0.9
  • Snappy 1.0.5
  • Cassandra Java Driver 2.0.2
DataStax Community documentation covers release notes for Apache Cassandra 2.0.8 and patches:
 * Fix assertion error in CL.ANY timeout handling (CASSANDRA-7364)
 * Handle empty CFs in Memtable#maybeUpdateLiveRatio() (CASSANDRA-7401)
 * Fix native protocol CAS batches (CASSANDRA-7337)
 * Add per-CF range read request latency metrics (CASSANDRA-7338)
 * Fix NPE in StreamTransferTask.createMessageForRetry() (CASSANDRA-7323)
 * Add conditional CREATE/DROP USER support (CASSANDRA-7264)
 * Swap local and global default read repair chances (CASSANDRA-7320)
 * Add missing iso8601 patterns for date strings (CASSANDRA-6973)
 * Support selecting multiple rows in a partition using IN (CASSANDRA-6875)
 * cqlsh: always emphasize the partition key in DESC output (CASSANDRA-7274)
 * Copy compaction options to make sure they are reloaded (CASSANDRA-7290)
 * Add option to do more aggressive tombstone compactions (CASSANDRA-6563)
 * Don't try to compact already-compacting files in HHOM (CASSANDRA-7288)
 * Add authentication support to shuffle (CASSANDRA-6484)
 * Cqlsh counts non-empty lines for "Blank lines" warning (CASSANDRA-7325)
 * Make StreamSession#closeSession() idempotent (CASSANDRA-7262)
 * Fix infinite loop on exception while streaming (CASSANDRA-7330)
 * Reference sstables before populating key cache (CASSANDRA-7234)
 * Account for range tombstones in min/max column names (CASSANDRA-7235)
 * Improve sub range repair validation (CASSANDRA-7317)
 * Accept subtypes for function results, type casts (CASSANDRA-6766)
Merged from 1.2:
 * Handle possible integer overflow in FastByteArrayOutputStream (CASSANDRA-7373)
 * cqlsh: 'ascii' values weren't formatted as text (CASSANDRA-7407)
 * cqlsh: ignore .cassandra permission errors (CASSANDRA-7266)
 * reduce failure detector initial value to 2s (CASSANDRA-7307)
 * Fix problem truncating on a node that was previously in a dead state (CASSANDRA-7318)
 * Don't insert tombstones that hide indexed values into 2i (CASSANDRA-7268)
 * Track metrics at a keyspace level (CASSANDRA-6539)
 * Add replace_address_first_boot flag to only replace if not bootstrapped
   (CASSANDRA-7356)
 * Enable keepalive for native protocol (CASSANDRA-7380)
 * Check internal addresses for seeds (CASSANDRA-6523)
 * Fix potential / by 0 in HHOM page size calculation (CASSANDRA-7354)
 * Fix availability validation for LOCAL_ONE CL (CASSANDRA-7319)
 * Use LOCAL_ONE for non-superuser auth queries (CASSANDRA-7328)
 * Fix handling of empty counter replication mutations (CASSANDRA-7144)
NEWS.txt contains late-breaking information about upgrading from previous versions of Cassandra. A NEWS.txt or a NEWS.txt archive is installed in the following locations:
  • Tarball: install_location/resources/cassandra
  • GUI/Text Services installations: /usr/share/dse/resources/cassandra
  • Package: /usr/share/doc/dse-libcassandra*

NEWS.txt is also posted on the Apache Cassandra project web site.

Enhancements and changes

DataStax Enterprise 4.5 includes the following enhancements and changes:
  • Spark/Shark
    • Support for Apache Spark for running analytical queries independent of Hadoop
    • Support for Apache Shark, a SQL-like, Hive-compatible language built on top of Spark
  • External Hadoop systems
    • A bring your own Hadoop (BYOH) model that integrates Hadoop data warehouse implementations by Cloudera and Hortonworks
    • Support for Kerberos-secured BYOH integration using the Cloudera Manager
  • DSE Hadoop/Hive/Pig
    • Support for the native protocol in Hive including the addition of 19 new Hive TBLPROPERTIES to support the native protocol
    • Auto-creation of Hive databases and external tables for each CQL keyspace and table
    • A new cql3.partition.key property that maps Hive tables to CQL compound primary keys and composite partition keys
    • Support for HiveServer2
    • Integration of the HiveServer2 Beeline command shell
    • Support for expiring data in columns by setting TTL (time to live) on Hive tables.
    • Support for expiring data by setting the TTL on Pig data using the cql:// URL, which includes a prepared statement shown in step 10 of the library demo.
  • Sqoop
  • Solr
    • For performance, you can configure DSE Search/Solr to parallelize row reads.
    • The default shard transport type has been changed from http to netty. If you upgrade to DataStax Enterprise 4.5, perform the upgrade procedure using the shard transport type of your old installation, and after the upgrade, change the shard transport type to netty. Start the cluster using a rolling restart.
    • For performance reasons, this release of DataStax Enterprise no longer uses Lucene compressed stored fields. Subsequent releases will also not use these fields. (DSP-3484)
    • When the from field of a Solr query time join is set to docValues=true, the faster doc values-based join system is used. Upgrading to DataStax Enterprise 4.5 requires reindexing if query time join is used.
    • DataStax Enterprise 4.5 and later moves the DSE per-segment filter cache off-heap by using native memory, hence reducing on-heap memory consumption and garbage collection overhead.
    • The new off-heap filter cache is enabled by default, but can be disabled by passing the following JVM system property at startup time: -Dsolr.offheap.enable=false.
    • Query metric times are now logged by DSE Search.
    • DSE mbean names have been improved to decrease the chance that names will clash. The old naming format was com.datastax.bdp:type=name. The new format is com.datastax.bdp:type=workload,name=name. For example, com.datastax.bdp:type=search,name=SolrContainerPlugin.
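The mbean naming change described above can be illustrated with the standard javax.management.ObjectName class. This is a minimal sketch, not DataStax code: the class and method names are hypothetical, and the old-style name for SolrContainerPlugin is inferred from the stated old format rather than taken from the product.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper class, for illustration only.
class DseMBeanNames {
    // Old format from the release notes: com.datastax.bdp:type=name
    static ObjectName oldStyleName(String name) {
        return parse("com.datastax.bdp:type=" + name);
    }

    // New format from the release notes: com.datastax.bdp:type=workload,name=name
    static ObjectName newStyleName(String workload, String name) {
        return parse("com.datastax.bdp:type=" + workload + ",name=" + name);
    }

    // Pattern that matches every mbean of one workload under the new format.
    static ObjectName workloadPattern(String workload) {
        return parse("com.datastax.bdp:type=" + workload + ",*");
    }

    private static ObjectName parse(String text) {
        try {
            return new ObjectName(text);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid mbean name: " + text, e);
        }
    }

    public static void main(String[] args) {
        ObjectName current = newStyleName("search", "SolrContainerPlugin");
        System.out.println("workload = " + current.getKeyProperty("type"));
        System.out.println("name     = " + current.getKeyProperty("name"));
        // The workload pattern matches new-style names only.
        System.out.println(workloadPattern("search").apply(current));
        System.out.println(workloadPattern("search").apply(oldStyleName("SolrContainerPlugin")));
    }
}
```

Because the workload and the name are separate keys in the new format, monitoring tools can select all mbeans of one workload with a single pattern, which the old single-key format did not allow.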

Resolved issues

  • Solr
    • Soft commit adjustments during back pressure are now correctly executed. (DSP-3584)
    • The problem that caused the old pre-reload value of the soft commit to be reloaded during back pressure has been resolved. (DSP-3584)
    • Resolved a problem causing Tomcat to block shutdown when an out of memory error occurs. (DSP-3328)
  • Other
    • Resolved a problem caused by the disk full alert feature that removed a node from the ring. Because Cassandra will no longer automatically decommission a node when the disk is almost full, you need to monitor disk space and add capacity when necessary. (DSP-3601)
    • The DSE init.d script now sets -XX:HeapDumpPath both when using the jsvc fallback and when using the default dse_daemon script. Previously, the option was not set for the default dse_daemon script, which prevented the heap from being dumped on an out-of-memory condition. (DSP-3308)
    • Resolved a problem caused by calling Gossiper.instance.getEndpointStateForEndpoint without checking for a null return, which led to null pointer exceptions like the following (DSP-3616):
      ERROR 11:02:42,841 Exception in thread Thread[Thread-1,5,main]
      java.lang.NullPointerException
      at com.datastax.bdp.gms.DseState.doGetCurrentState(DseState.java:269)
      at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:167)
      at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:470)
      at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:380)
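
The class of bug behind DSP-3616 can be sketched in isolation. The following example is an assumption-laden illustration, not the DataStax fix: the map-backed lookup is a hypothetical stand-in for a gossip state table whose lookup method returns null for an endpoint it does not know.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a state table keyed by endpoint address.
class NullSafeStateLookup {
    private final Map<String, String> statesByEndpoint = new HashMap<>();

    void register(String endpoint, String state) {
        statesByEndpoint.put(endpoint, state);
    }

    // Like the lookup named in the release note, this returns null
    // when no state is known for the endpoint.
    String getEndpointState(String endpoint) {
        return statesByEndpoint.get(endpoint);
    }

    // Checks the lookup result before using it, so an unknown endpoint
    // yields the fallback instead of a NullPointerException.
    int stateLengthOrDefault(String endpoint, int fallback) {
        String state = getEndpointState(endpoint);
        if (state == null) {
            return fallback;
        }
        return state.length();
    }

    public static void main(String[] args) {
        NullSafeStateLookup lookup = new NullSafeStateLookup();
        lookup.register("10.0.0.1", "ACTIVE");
        System.out.println(lookup.stateLengthOrDefault("10.0.0.1", -1));
        System.out.println(lookup.stateLengthOrDefault("10.0.0.2", -1));
    }
}
```

During shutdown, state for the local endpoint may already be gone, which is why a lookup that succeeds during normal operation can return null on that path.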

Issues

  • DataStax supports a data center that contains one or more nodes running in dual Spark/DSE Hadoop mode. DataStax does not support running some nodes in DSE Hadoop mode and some in Spark mode in the same data center. Dual Spark/DSE Hadoop mode means you started the node using the -k and -t options on tarball installations, or set the startup options HADOOP_ENABLED=1 and SPARK_ENABLED=1 on packaged installations. (DSP-3561)
  • Due to a DSE_CLASSPATH problem, if you are installing DataStax Enterprise to use the bring your own Hadoop (BYOH) model, you need to install and configure DataStax Enterprise on all nodes, including nodes in the Hadoop cluster, as described in the installation procedure. (DSP-3654)
  • Due to a race condition between shutdown of client and server channels, a harmless Netty exception appears when you shut down. The exception looks something like this:
    WARN 20:18:15,397 Failed to submit an exceptionCaught() event.
    java.util.concurrent.RejectedExecutionException: event executor terminated
    at io.netty.util.concurrent.SingleThreadEventExecutor.reject
    (SingleThreadEventExecutor.java:703) . . .

    Ignore this exception. (DSP-3651)

  • In this release, the HSHA rpc server is not functioning. (DSP-3675)

  • Writing to Blob columns from Spark is not supported in this release. Reading columns of all types is supported; however, you need to convert collections of blobs to byte arrays before serializing. (DSP-3620)
  • After upgrading DataStax Enterprise from 4.0.0 or 4.0.1 to 4.5.x on RHEL5/CentOS5, the Snappy JAR file will be missing. To get it back, either:
    • Run the switch-snappy script:
      $ cd /usr/share/dse ## Package installations
      $ cd install_location/bin  ## tarball installations
      
      $ switch-snappy 1.0.4
    • Uninstall the old installation and then do a fresh installation. A regular uninstall maintains the configuration files and data files.
  • Cassandra static columns, introduced in Cassandra 2.0.6, cannot be included in the Solr schema (and hence indexed) for performance reasons because changing the value of a single static column would require re-indexing all documents sharing the same partition key. (DSP-3143)
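
For the Spark blob limitation listed above (DSP-3620), the conversion of a collection of blobs to byte arrays might look like the following. This is a minimal sketch that assumes the blobs arrive as java.nio.ByteBuffer values; the class and method names are hypothetical and not part of DataStax Enterprise or Spark.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, for illustration only.
class BlobCollections {
    // Copies the readable bytes of one buffer. Working on a duplicate
    // leaves the position and limit of the original buffer untouched.
    static byte[] toByteArray(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // Converts every blob in the collection to a plain byte array,
    // preserving iteration order.
    static List<byte[]> toByteArrays(Collection<ByteBuffer> blobs) {
        List<byte[]> result = new ArrayList<>(blobs.size());
        for (ByteBuffer blob : blobs) {
            result.add(toByteArray(blob));
        }
        return result;
    }

    public static void main(String[] args) {
        List<ByteBuffer> blobs = Arrays.asList(
                ByteBuffer.wrap(new byte[] {1, 2, 3}),
                ByteBuffer.wrap(new byte[] {4, 5}));
        for (byte[] bytes : toByteArrays(blobs)) {
            System.out.println(Arrays.toString(bytes));
        }
    }
}
```

Plain byte arrays are serializable by default in Java, whereas ByteBuffer is not, which is one plausible reason for converting before serialization.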

DataStax Enterprise 4.5 Installer release notes

Components

  • Apache Cassandra 2.0.8
  • OpsCenter 4.1.4
  • DevCenter 1.1.x