Kudu 1.5 now enables the optional ability to compute, store, and verify checksums on all pieces of data stored on a server by default. Due to storage format changes, downgrading to versions 1.3 or earlier is not supported and will result in an error.
Spark 2.2+ requires Java 8 at runtime even though Kudu Spark 2.x integration is Java 7 compatible. Spark 2.2 is the default dependency version as of Kudu 1.5.0.
The kudu-spark-tools module has been renamed to kudu-spark2-tools_2.11 in order to include the Spark and Scala base versions. This matches the pattern used in the kudu-spark module and artifacts.
To improve security, world-readable Kerberos keytab files are no longer
accepted by default. Set
--allow_world_readable_credentials=true to override
this behavior. See
KUDU-1955 for additional
Support for Java 7 is deprecated as of Kudu 1.5.0 and may be removed in the next major release.
Support for Spark 1 (kudu-spark_2.10) is deprecated as of Kudu 1.5.0 and may be removed in the next minor release.
Tablet servers are now optionally able to tolerate disk failures at
startup. This feature is experimental; by default, Kudu will crash if it
experiences a disk failure. When enabled, tablets with any data on the failed
disk will not be opened and will be replicated as needed. To enable this, set
--suicide_on_eio flag to
false. Additionally, there is a configurable
tradeoff between a newly added tablet’s tolerance to disk failures and its
parallelization of I/O via the
Tablets that are spread across fewer disks are less likely to be affected by a
disk failure, at the cost of reduced parallelism. Note that the first
configured data directory and the WAL directory cannot currently tolerate disk
failures, and disk failures during run-time are still fatal.
Kudu server web UIs have a new configuration dashboard (/config) which provides a high level summary of important security configuration values, such as whether RPC authentication is required, or web server HTTPS encryption is enabled. Other types of configuration will be added in future releases.
kudu command line tool has two new features:
kudu tablet move and
kudu local_replica data_size. The 'tablet move' tool moves a tablet replica
from one tablet server to another, under the condition that the tablet is
healthy. An operator can use this tool to rebalance tablet replicas between
tablet servers. The 'local replica data size' tool summarizes the space usage
of a tablet, breaking it down by type of file, column, and rowset.
kudu-client-tools now supports exporting CSV files and importing Apache Parquet files. This feature is unstable and may change APIs and functionality in future releases.
kudu-spark-tools now supports importing and exporting CSV, Apache Avro and Apache Parquet files. This feature is unstable and may change APIs and functionality in future releases.
The log block manager now performs disk synchronization in batches. This optimization can significantly reduce the time taken to copy tablet data from one server to another; in one case tablet copy time is reduced by 35%. It also improves the general performance of flushes and compactions.
A new feature referred to as "tombstoned voting" is added to the Raft
consensus subsystem to allow tablet replicas in the
state to vote in tablet leader elections. This feature increases Kudu’s
stability and availability by improving the likelihood that Kudu will be able
to self-heal in more edge-case scenarios, such as when tablet copy operations
fail. See KUDU-871 for
The tablet on-disk size metric has been made more accurate. Previously, the metric included only REDO deltas; it now counts all deltas. Additionally, the metric includes the size of bloomfiles, ad hoc indexes, and the tablet superblock. WAL segments and consensus metadata are still not counted. The latter is very small compared to the size of data, but the former may be significant depending on the workload (this will be resolved in a future release).
The number of threads used by the Kudu tablet server has been further reduced. Previously, each follower tablet replica used a dedicated thread to detect leader tablet replica failures, and each leader replica used one dedicated thread per follower to send Raft heartbeats to that follower. The work performed by these dedicated threads has been reassigned to other threads. Other improvements were made to facilitate better thread sharing by tablets. For the purpose of capacity planning, expect the Kudu tablet server to create one thread for every five "cold" (i.e. those not servicing writes) tablets, and an additional three threads for every "hot" tablet. This will be further improved upon in future Kudu releases.
The Java Kudu client now automatically requests new authentication tokens after expiration. As a result, long-lived Java clients are now supported. See KUDU-2013 for more details.
Multiple Kerberos compatibility bugs have been fixed, including support for environments with disabled reverse DNS, FreeIPA compatibility, principal names including uppercase characters, and hosts without a FQDN.
A bug in the binary prefix decoder which could cause a tablet server 'check' assertion crash has been fixed. The crash could only be triggered in very specific scenarios; see KUDU-2085 for additional details.
Kudu 1.5.0 is wire-compatible with previous versions of Kudu:
Kudu 1.5 clients may connect to servers running Kudu 1.0 or later. If the client uses features that are not available on the target server, an error will be returned.
Rolling upgrade between Kudu 1.4 and Kudu 1.5 servers is believed to be possible though has not been sufficiently tested. Users are encouraged to shut down all nodes in the cluster, upgrade the software, and then restart the daemons on the new version.
Kudu 1.0 clients may connect to servers running Kudu 1.5 with the exception of the below-mentioned restrictions regarding secure clusters.
The authentication features introduced in Kudu 1.3 place the following limitations on wire compatibility between Kudu 1.5 and versions earlier than 1.3:
If a Kudu 1.5 cluster is configured with authentication or encryption set to "required", clients older than Kudu 1.3 will be unable to connect.
If a Kudu 1.5 cluster is configured with authentication and encryption set to "optional" or "disabled", older clients will still be able to connect.
The Kudu 1.5 Java client library is API- and ABI-compatible with Kudu 1.4. Applications written against Kudu 1.4 will compile and run against the Kudu 1.5 client library and vice-versa, unless one of the following newly added APIs is used:
The Kudu 1.5 C++ client is API- and ABI-forward-compatible with Kudu 1.4. Applications written and compiled against the Kudu 1.4 client library will run without modification against the Kudu 1.5 client library. Applications written and compiled against the Kudu 1.5 client library will run without modification against the Kudu 1.4 client library.
The Kudu 1.5 Python client is API-compatible with Kudu 1.4. Applications written against Kudu 1.4 will continue to run against the Kudu 1.5 client and vice-versa.
Please refer to the Known Issues and Limitations section of the documentation.
Kudu 1.5 includes contributions from twenty-four people, including six first-time contributors:
Sri Sai Kumar Ravipati