DataStax Enterprise 4.5.1 release notes
- Apache Solr 4.6.0.2.4 to 4.6.0.2.5
- Shark 0.9.1.1 to 0.9.1.2
- Cassandra Java Driver 2.0.2 to 184.108.40.206
- Solr POST query parameters are now properly stored in the audit logs.
- This release adds the update metrics MBean, which can be useful for tuning the factors affecting indexing performance, such as back pressure, indexing threads, RAM buffer size, and merge factor.
- The Weather Sensor demo for running analytical queries with Hadoop and Spark is now easier to run. You no longer need to change path variables in tarball installations.
- The Weather Sensor demo readme file has been improved.
- The Shark component, updated to 0.9.1.2, works with internal authentication.
- The showSchema method, which has been added to Spark, provides information about all user keyspaces, a particular keyspace, or a table.
- Improved default memory settings for Spark.
- When running the Portfolio Manager demo, messages about keyspace sensitivity no longer appear.
- Resolved the issue causing DataStax Enterprise to hang during shutdown, waiting for gossip to start. (DSP-3518)
- Fixed an out-of-memory error on very large clusters by significantly reducing Cassandra File System (CFS) memory consumption, by a factor of roughly 500 for some use cases. (DSP-3615)
- Fixed an issue when enabling clustering in Solr that caused DataStax Enterprise to complain about the org.tartarus.snowball.ext.DanishStemmer class not being found. (DSP-3645)
- Fixed the race condition between shutdown of client and server channels, which caused a harmless Netty exception to appear when you shut down. (DSP-3651)
- Fixed the problem preventing the HSHA rpc server from functioning. (DSP-3675)
- RPM and Deb package installations now properly find the shark-env.sh file. (DSP-3696)
DataStax Enterprise 4.5 release notes
DataStax Enterprise 4.5 updates components and includes major enhancements, improvements, patches, and bug fixes.
- Apache Cassandra 220.127.116.11
- Apache Hadoop 18.104.22.168
- Apache Hive 0.12.0.3
- Apache Pig 0.10.1
- Apache Solr 4.6.0.2.4
- Apache log4j 1.2.16
- Apache Sqoop 126.96.36.199.1
- Apache Mahout 0.8
- Apache Tomcat 6.0.39
- Apache Thrift 0.7.0
- Apache Commons
- Spark 0.9.1
- Shark 0.9.1.1
- JBCrypt 0.3m
- SLF4J 1.7.2
- Guava 15.0
- JournalIO 1.4.2
- Netty 4.0.13.Final
- Faster XML 3.1.3
- HdrHistogram 1.0.9
- Snappy 1.0.5
- Cassandra Java Driver 2.0.2
The Cassandra NEWS.txt file, which describes these changes, is located in:
- Tarball: install_location/resources/cassandra
- GUI/Text Services installations: /usr/share/dse/resources/cassandra
- Package: /usr/share/doc/dse-libcassandra*
NEWS.txt is also posted on the Apache Cassandra project web site.
Enhancements and changes
- External Hadoop systems
- A bring your own Hadoop (BYOH) model that integrates Hadoop data warehouse implementations by Cloudera and Hortonworks
- Support for Kerberos-secured BYOH integration using the Cloudera Manager
- DSE Hadoop/Hive/Pig
- Support for the native protocol in Hive including the addition of 19 new Hive TBLPROPERTIES to support the native protocol
- Auto-creation of Hive databases and external tables for each CQL keyspace and table
- A new cql3.partition.key property that maps Hive tables to CQL compound primary keys and composite partition keys
- Support for HiveServer2
- Integration of the HiveServer2 Beeline command shell
- Support for expiring data in columns by setting TTL (time to live) on Hive tables.
- Support for expiring data by setting the TTL on Pig data using the cql:// URL, which includes a prepared statement shown in step 10 of the library demo.
- Improved integration of Apache Sqoop for importing RDBMS data and exporting Cassandra CQL data
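As one example of the HiveServer2 support listed above, the standard Beeline shell can connect over JDBC. The dse wrapper command and the default HiveServer2 port 10000 are assumptions here, not confirmed by this document; adjust for your deployment:

```shell
# Start the Beeline shell (assumed to be wrapped by the dse command)
# and connect to a local HiveServer2 instance on the default port.
dse beeline
beeline> !connect jdbc:hive2://localhost:10000
```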
- For performance, you can configure DSE Search/Solr to parallelize row reads.
- The default shard transport type has been changed from http to netty. If you upgrade to DataStax Enterprise 4.5, perform the upgrade procedure using the shard transport type of your old installation; after the upgrade, change the shard transport type to netty and start the cluster using a rolling restart.
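As a sketch, assuming the setting lives under shard_transport_options in dse.yaml (the exact key is an assumption; verify it against the dse.yaml shipped with your installation), the post-upgrade change would look like:

```yaml
# dse.yaml -- sketch only; option name assumed from the shard transport settings
shard_transport_options:
    type: netty
```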
- For performance reasons, this release of DataStax Enterprise no longer uses Lucene compressed stored fields; subsequent releases will not use them either. (DSP-3484)
- When the Solr query-time join "from" field is marked docValues=true, the faster doc values-based join system is used. Upgrading to DataStax Enterprise 4.5 requires reindexing if query-time join is used.
- DataStax Enterprise 4.5 and later moves the DSE per-segment filter cache off-heap by using native memory, hence reducing on-heap memory consumption and garbage collection overhead.
- The new off-heap filter cache is enabled by default, but can be disabled by passing the following JVM system property at startup time: -Dsolr.offheap.enable=false.
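A minimal sketch of disabling the cache, assuming JVM options are injected through an environment variable (JVM_OPTS here) that the DSE startup script reads; where exactly to set it in your installation is an assumption:

```shell
# Sketch: disable the off-heap filter cache at startup by passing the
# JVM system property named above. JVM_OPTS as the injection point is
# an assumption; add the flag wherever your startup script picks up
# JVM options.
export JVM_OPTS="${JVM_OPTS:-} -Dsolr.offheap.enable=false"
echo "$JVM_OPTS"
```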
- Query metric times are now logged by DSE Search.
- DSE MBean names have been improved to decrease the chance that names will clash. The old naming format was com.datastax.bdp:type=name. The new format is com.datastax.bdp:type=workload,name=name. For example, com.datastax.bdp:type=search,name=SolrContainerPlugin.
- Soft commit adjustments during back pressure are now correctly executed. (DSP-3584)
- The problem that caused the old pre-reload value of the soft commit to be reloaded during back pressure has been resolved. (DSP-3584)
- Resolved a problem causing Tomcat to block shutdown when an out-of-memory error occurs. (DSP-3328)
- Resolved a problem caused by the disk full alert feature that removed a node from the ring. Because Cassandra will no longer automatically decommission a node when the disk is almost full, you need to monitor disk space and add capacity when necessary. (DSP-3601)
- The DSE init.d script now sets -XX:HeapDumpPath both when using the jsvc fallback and when using the default dse_daemon script. Previously, the latter case did not set it, which prevented the heap from being dumped on an out-of-memory condition. (DSP-3308)
- Resolved a problem caused by calling Gossiper.instance.getEndpointStateForEndpoint without checking for a null return, which led to null pointer exceptions such as:

  ERROR 11:02:42,841 Exception in thread Thread[Thread-1,5,main]
  java.lang.NullPointerException
      at com.datastax.bdp.gms.DseState.doGetCurrentState(DseState.java:269)
      at com.datastax.bdp.gms.DseState.setActiveStatus(DseState.java:167)
      at com.datastax.bdp.server.DseDaemon.stop(DseDaemon.java:470)
      at com.datastax.bdp.server.DseDaemon$1.run(DseDaemon.java:380)
- DataStax supports a data center that contains one or more nodes running in dual Spark/DSE Hadoop mode. DataStax does not support running some nodes in DSE Hadoop mode and some in Spark mode in the same data center. Dual Spark/DSE Hadoop mode means you started the node using the -k and -t options on tarball installations, or set the startup options HADOOP_ENABLED=1 and SPARK_ENABLED=1 on packaged installations. (DSP-3561)
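The supported dual-mode startup described above can be summarized as follows (install_location is the placeholder used elsewhere in this document):

```shell
# Tarball installation: start the node with both DSE Hadoop and Spark enabled
install_location/bin/dse cassandra -k -t

# Package installation: enable both startup options in /etc/default/dse,
# then restart the DSE service
HADOOP_ENABLED=1
SPARK_ENABLED=1
```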
- Due to a DSE_CLASSPATH problem, if you are installing DataStax Enterprise to use the bring your own Hadoop (BYOH) model, you need to install and configure DataStax Enterprise on all nodes, including nodes in the Hadoop cluster, as described in the installation procedure. (DSP-3654)
- Due to a race condition between shutdown of client and server channels, a harmless Netty exception appears when you shut down. The exception looks something like:

  WARN 20:18:15,397 Failed to submit an exceptionCaught() event.
  java.util.concurrent.RejectedExecutionException: event executor terminated
      at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:703)
      . . .

  Ignore this exception. (DSP-3651)
- In this release, the HSHA rpc server is not functioning. (DSP-3675)
- Writing to Blob columns from Spark is not supported in this release. Reading columns of all types is supported; however, you need to convert collections of blobs to byte arrays before serializing. (DSP-3620)
- After upgrading DataStax Enterprise from 4.0.0 or 4.0.1 to 4.5.x on RHEL5/CentOS5, the
Snappy JAR file will be missing. To get it back, either:
- Run the switch-snappy script:
  $ cd /usr/share/dse        ## Package installations
  $ cd install_location/bin  ## Tarball installations
  $ switch-snappy 1.0.4
- Uninstall the old installation and then do a fresh installation. Note that a regular uninstall preserves the configuration files and data files.
- Cassandra static columns, introduced in Cassandra 2.0.6, cannot be included in the Solr schema (and hence indexed) for performance reasons because changing the value of a single static column would require re-indexing all documents sharing the same partition key. (DSP-3143)