Apache Kudu 1.11.1 Release Notes

Apache Kudu 1.11.1 is a bug-fix release which fixes one critical licensing issue in Kudu 1.11.0.

Upgrade Notes

When upgrading from earlier versions of Kudu, if support for Kudu’s NVM (non-volatile memory) block cache is desired, install version 1.8.0 or newer of the memkind library as documented in Kudu Installation for the corresponding platform. This step is mandatory for existing users of the NVM block cache (i.e. those who set --block_cache_type=NVM for kudu-master and kudu-tserver): without memkind installed, their Kudu processes will crash at startup.

Fixed Issues

  • Fixed an issue with distributing the libnuma dynamic library with the kudu-binary JAR artifact. Also fixed the issue of statically linking libnuma.a into the kudu-master and kudu-tserver binaries when building Kudu from source in release mode. The fix removes both the numactl and memkind projects from Kudu’s thirdparty dependencies and makes the dependency on the libmemkind library optional: Kudu now opens the library using dlopen() and resolves the required symbols via dlsym() (see KUDU-2990).

  • Fixed an issue with the kudu cluster rebalancer CLI tool crashing when run against a location-aware cluster in which a tablet server in one location hosts no tablet replicas (see KUDU-2987).

  • Fixed an issue with connection negotiation using a SASL mechanism when the server’s FQDN is longer than 64 characters (see KUDU-2989).

  • Fixed an issue in the test harness of the kudu-binary JAR artifact. With this fix, kudu-master and kudu-tserver processes of the mini-cluster’s test harness no longer rely on the test NTP server to synchronize their built-in NTP client. Instead, the test harness relies on the local machine clock synchronized by the system NTP daemon (see KUDU-2994).
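
The optional-dependency approach described for KUDU-2990 (open the library at run time, then resolve each required symbol by name) can be illustrated with a minimal Python sketch. This is not Kudu's C++ implementation. On POSIX systems, ctypes.CDLL() wraps dlopen() and attribute lookup on the returned handle wraps dlsym(). The helper name load_optional_library, the library file name libmemkind.so.0, and the list of symbols are assumptions made for illustration.

```python
import ctypes


def load_optional_library(name, required_symbols):
    """Return a dict of resolved symbols, or None if the library is unusable.

    ctypes.CDLL() calls dlopen() on POSIX systems; getattr() on the returned
    handle resolves a symbol via dlsym().
    """
    try:
        handle = ctypes.CDLL(name)  # dlopen(); raises OSError if not found
    except OSError:
        return None
    symbols = {}
    for symbol in required_symbols:
        try:
            symbols[symbol] = getattr(handle, symbol)  # dlsym()
        except AttributeError:
            return None  # library present but a required symbol is missing
    return symbols


# Assumed library file name and symbol names, for illustration only.
memkind = load_optional_library(
    "libmemkind.so.0", ["memkind_create_pmem", "memkind_malloc", "memkind_free"]
)
nvm_cache_available = memkind is not None
```

With this pattern the program starts even when the library is absent, and only the feature that needs it (here, the NVM block cache) has to fail or be disabled.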

Apache Kudu 1.11.0 Release Notes

Upgrade Notes

  • With KUDU-2625 addressed, tablet servers now reject only the individual write operations in a batch that violate schema constraints. In prior versions, the whole batch of write operations was rejected if a schema constraint violation was detected even for a single row. Applications that relied on the previous behavior should be revised when upgrading to Kudu 1.11.0.
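
The difference between the two behaviors can be seen in a small, self-contained model. This is a toy simulation rather than Kudu code: the function names are hypothetical, and a NOT NULL check stands in for schema constraints in general.

```python
def violates_schema(row, non_nullable):
    """Return True if a non-nullable column is missing or None in the row."""
    return any(row.get(col) is None for col in non_nullable)


def apply_batch_pre_1_11(batch, non_nullable):
    """Old behavior: one bad row causes the whole batch to be rejected."""
    if any(violates_schema(row, non_nullable) for row in batch):
        return [], list(batch)  # (applied, rejected)
    return list(batch), []


def apply_batch_1_11(batch, non_nullable):
    """New behavior: only the offending rows are rejected."""
    applied = [row for row in batch if not violates_schema(row, non_nullable)]
    rejected = [row for row in batch if violates_schema(row, non_nullable)]
    return applied, rejected


batch = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": None},  # violates the NOT NULL constraint on "name"
    {"id": 3, "name": "c"},
]
old_applied, old_rejected = apply_batch_pre_1_11(batch, ["id", "name"])
new_applied, new_rejected = apply_batch_1_11(batch, ["id", "name"])
print(len(old_applied), len(old_rejected))  # prints: 0 3
print(len(new_applied), len(new_rejected))  # prints: 2 1
```

An application that treated a rejected batch as "nothing was written" would, under the new behavior, need to inspect the per-row errors instead, because the valid rows of the batch are now applied.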

Deprecations

  • The Kudu Flume integration is deprecated and may be removed in the next minor release. The integration will be moved to the Apache Flume project going forward (see FLUME-3345).

New features

  • Kudu now supports putting tablet servers into maintenance mode. While a tablet server is in this mode, its replicas will not be re-replicated if the server fails. Only upon exiting maintenance mode will re-replication be triggered for any remaining under-replicated tablets. The kudu tserver state enter_maintenance and kudu tserver state exit_maintenance tools are added to orchestrate tablet server maintenance, and the kudu tserver list tool is amended with a "state" column option to display the current state of each tablet server (see KUDU-2069).

  • Kudu now has a built-in NTP client which maintains the internal wallclock time used for generation of HybridTime timestamps. When it is enabled, system clock synchronization for nodes running Kudu is no longer necessary. This is useful for containerized deployments and in other cases when it’s troublesome to maintain a properly configured system NTP service on each node of a Kudu cluster. The list of NTP servers to synchronize against is specified with the --builtin_ntp_servers flag. By default, Kudu masters and tablet servers use public servers hosted by the NTP Pool project. To use the built-in NTP client, set --time_source=builtin and reconfigure --builtin_ntp_servers if necessary (see KUDU-2935).

  • Aggregated table statistics are now available to Kudu clients via the KuduClient.getTableStatistics() and KuduTable.getTableStatistics() methods in the Kudu Java client and KuduClient.GetTableStatistics() in the Kudu C++ client. This allows for various query optimizations; for example, Spark now uses the statistics to perform join optimizations. In addition, per-table statistics are available via the kudu table statistics CLI tool and via the master’s Web UI at the master:8051/metrics and master:8051/table?id=<uuid> URIs (see KUDU-2797 and KUDU-2921).

  • The kudu CLI tool now supports altering table columns. Use the newly introduced sub-commands such as kudu table column_set_default, kudu table column_remove_default, kudu table column_set_compression, kudu table column_set_encoding, and kudu table column_set_block_size to alter a column of the specified table.

  • The kudu CLI tool now supports dropping table columns. Use the newly introduced kudu table delete_column sub-command to drop a column of the specified table.

  • The kudu CLI tool now supports getting and setting extra configuration properties for a table. Use kudu table get_extra_configs and kudu table set_extra_config sub-commands to perform the corresponding operations (see KUDU-2514).

  • The kudu CLI tool now supports creating and dropping range partitions for a table. Use kudu table add_range_partition and kudu table drop_range_partition sub-commands to perform the corresponding operations (see KUDU-2881).
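
For the built-in NTP client introduced above, the two flags named in the release note can be assembled as in the following minimal sketch. The helper name builtin_ntp_flags and the server host names are hypothetical, and the comma-separated value format for --builtin_ntp_servers is an assumption to verify against the flag's documentation.

```python
def builtin_ntp_flags(ntp_servers=None):
    """Build command-line flags that enable Kudu's built-in NTP client.

    ntp_servers: optional list of NTP server addresses; when omitted, the
    --builtin_ntp_servers flag is not emitted and Kudu's default applies.
    """
    flags = ["--time_source=builtin"]
    if ntp_servers:
        # Assumption: the flag takes a comma-separated list of addresses.
        flags.append("--builtin_ntp_servers=" + ",".join(ntp_servers))
    return flags


# Hypothetical internal NTP servers for a containerized deployment.
tserver_cmd = ["kudu-tserver"] + builtin_ntp_flags(
    ["ntp1.example.com", "ntp2.example.com"]
)
print(" ".join(tserver_cmd))
```

The same flags would apply to kudu-master. Leaving --builtin_ntp_servers unset keeps the default public NTP Pool servers mentioned above.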

Optimizations and improvements

  • The kudu fs dump uuid CLI tool is now significantly faster and consumes significantly less IO.

  • The memory consumed by CFileReaders and BloomFileReaders is factored out and accounted separately by the tablet server memory tracking. The stats are available via Web UI as "CFileReaders" and "BloomFileReaders" entries.

  • KuduScanBatch::const_iterator in the Kudu C++ client now supports operator->() (see KUDU-1561).

  • Master server Web UI now supports sorting the list of tables by the columns of "Table Name", "Create Time", and "Last Alter Time".

  • Tablet servers now expand a tablet’s data directory group with available healthy directories when all directories of the group are full (see KUDU-2907).

  • For scan operations run with the CLOSEST_REPLICA selection mode, the Kudu Java client now picks a random available replica when no replica is located on the same node as the client that initiated the scan operation. This helps to spread the load generated by multiple scan requests to the same tablet among all available replicas. In prior releases, all such scan requests might end up fetching data from the same tablet replica (see KUDU-2348).

  • The serialization of in-memory rows to Kudu’s wire format has been optimized to be more CPU efficient (see KUDU-2847).

  • Tablet servers and masters can now aggregate metrics by the same attribute. For example, it’s now possible to fetch aggregated metrics from a tablet server by retrieving data from URLs of the form http://<host>:<port>/metrics?merge_rules=tablet|table|table_name

  • Introduced Docker image for Python Kudu client (see KUDU-2849).

  • Tablet servers now consider available disk space when choosing a set of data directories for a tablet’s data directory group, and when deciding in which data directory a new block should be written (see KUDU-2901).

  • Added a quick-start example of using Apache Spark to load, query, and modify a real data set stored in Kudu.

  • Added a quick-start example of using Apache NiFi to ingest data into Kudu.

  • Tablet servers now reject individual write operations which violate schema constraints in a batch of write operations received from a client. The previous behavior was to reject the whole batch of write operations if a violation of the schema constraints was detected even for a single row (see KUDU-2625).

  • Tablet replicas can now be optionally placed in accordance with a dimension-based placement policy. To specify a dimension label for a table, use the KuduTableCreator::dimension_label() and CreateTableOptions.setDimensionLabel() methods of the C++ and Java Kudu clients. To add a partition with a dimension label, use the KuduTableAlterer::AddRangePartitionWithDimension() and AlterTableOptions.addRangePartition() methods of the C++ and Java Kudu clients (see KUDU-2823).

  • Kudu RPC now enables TCP keepalive for all outbound connections for faster detection of no-longer-reachable nodes (see KUDU-2192).

  • The kudu table scan and kudu table copy CLI tools now fail gracefully rather than crashing upon hitting an error (see KUDU-2851).

  • Optimized decoding of deltas' timestamps (see KUDU-2867).

  • Optimized the initialization of DeltaMemStore for the case when no matching deltas are present (see KUDU-2381).

  • Improved the rehydration of scan tokens. Now a scan token created before renaming a column can be used even after the column has been renamed.

  • The memory reserved by tcmalloc is now released to OS periodically to avoid potential OOM issues in the case of read-only workloads (see KUDU-2836).

  • Optimized evaluation of predicates on columns of primitive types and NULL/NOT NULL predicates to leverage SIMD instructions (see KUDU-2846).
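
The CLOSEST_REPLICA change described above (KUDU-2348) amounts to the following selection rule, shown here as a minimal sketch. It is not the Java client's actual code: the function name is hypothetical, replicas are identified only by host name, and any other criteria the real client may apply (such as location awareness) are ignored.

```python
import random


def pick_closest_replica(replica_hosts, client_host, rng=random):
    """Pick a replica host for a CLOSEST_REPLICA scan.

    Prefers a replica on the client's own host; otherwise chooses uniformly
    at random so that scans from many clients spread across replicas.
    """
    if not replica_hosts:
        raise ValueError("no available replicas")
    local = [host for host in replica_hosts if host == client_host]
    if local:
        return local[0]
    return rng.choice(list(replica_hosts))


replicas = ["ts1.example.com", "ts2.example.com", "ts3.example.com"]
print(pick_closest_replica(replicas, "ts2.example.com"))  # prints: ts2.example.com
print(pick_closest_replica(replicas, "app.example.com"))  # one of the three, at random
```

The random fallback is what removes the hot spot: without it, every remote client would deterministically pick the same replica of a given tablet.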
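
The aggregated-metrics URL mentioned above can be built as in this short sketch. The helper name and host name are hypothetical, and 8050 is assumed to be the tablet server's Web UI port; only the /metrics path and the merge_rules value come from the release note.

```python
def merged_metrics_url(host, port, merge_rules="tablet|table|table_name"):
    """Build a /metrics URL that asks the server to aggregate metrics.

    The default merge rule is the one quoted in the release note above.
    """
    return "http://{}:{}/metrics?merge_rules={}".format(host, port, merge_rules)


# Hypothetical tablet server host; 8050 is assumed to be its web UI port.
url = merged_metrics_url("tserver-1.example.com", 8050)
print(url)
# prints: http://tserver-1.example.com:8050/metrics?merge_rules=tablet|table|table_name
```

When using such a URL from a shell, quote it, since the "|" characters would otherwise be interpreted as pipes.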

Fixed Issues

  • Fixed an issue of fault-tolerant scan operations failing for a projection whose key columns are specified in an order other than that of the table schema (see KUDU-2980).

  • Fixed an issue that would cause frequent leader elections when persisting Raft transactions to the WAL took longer than the leader election timeout. The issue was contributing to election storms (see KUDU-2947).

  • Fixed a tablet server crash in cases where blocks were not removed due to IO error. This issue may have surfaced after recovering from a disk failure (see KUDU-2635).

  • Fixed a crash in master and tablet server by validating the size of default values when de-serializing ColumnSchemaPB (see KUDU-2622).

  • Fixed an RPC negotiation failure in the case when TLS v1.3 is supported at both the client and the server side. This is a temporary workaround until the connection negotiation code is properly updated to support the 1.5-RTT handshake used in TLS v1.3. The issue affected Linux distributions shipped or updated with OpenSSL version 1.1.1 and newer, the first OpenSSL release series to support TLS v1.3 (see KUDU-2871).

  • Fixed a race between GetTabletLocations() and tablet report processing. The race could crash the Kudu master (see KUDU-2842).

  • Fixed a bug in AlterSchemaTransactionState::ToString() that led to a crash of tablet server when removing a tablet replica with a pending AlterSchema transaction.

Wire Protocol compatibility

Kudu 1.11.0 is wire-compatible with previous versions of Kudu:

  • Kudu 1.11 clients may connect to servers running Kudu 1.0 or later. If the client uses features that are not available on the target server, an error will be returned.

  • Rolling upgrade between Kudu 1.10 and Kudu 1.11 servers is believed to be possible, though it has not been sufficiently tested. Users are encouraged to shut down all nodes in the cluster, upgrade the software, and then restart the daemons on the new version.

  • Kudu 1.0 clients may connect to servers running Kudu 1.11, subject to the restrictions on secure clusters described below.

The authentication features introduced in Kudu 1.3 place the following limitations on wire compatibility between Kudu 1.11 and versions earlier than 1.3:

  • If a Kudu 1.11 cluster is configured with authentication or encryption set to "required", clients older than Kudu 1.3 will be unable to connect.

  • If a Kudu 1.11 cluster is configured with authentication and encryption set to "optional" or "disabled", older clients will still be able to connect.
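
The two rules above can be read as a small decision function. The sketch below only restates them: the function name is hypothetical, and the setting values are the three quoted in the text.

```python
def client_can_connect(client_version, authentication, encryption):
    """Return True if a client of the given version can connect.

    client_version: (major, minor) tuple, e.g. (1, 2).
    authentication, encryption: "required", "optional", or "disabled".
    """
    allowed = {"required", "optional", "disabled"}
    if authentication not in allowed or encryption not in allowed:
        raise ValueError("unknown security setting")
    if client_version >= (1, 3):
        return True
    # Clients older than 1.3 lack the security features, so they are
    # locked out as soon as either setting is "required".
    return authentication != "required" and encryption != "required"


print(client_can_connect((1, 2), "required", "optional"))  # prints: False
print(client_can_connect((1, 2), "optional", "disabled"))  # prints: True
print(client_can_connect((1, 11), "required", "required"))  # prints: True
```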

Client Library Compatibility

  • The Kudu 1.11 Java client library is API- and ABI-compatible with Kudu 1.10. Applications written against Kudu 1.10 will compile and run against the Kudu 1.11 client library and vice-versa.

  • The Kudu 1.11 C++ client is API- and ABI-forward-compatible with Kudu 1.10. Applications written and compiled against the Kudu 1.10 client library will run without modification against the Kudu 1.11 client library. Applications written and compiled against the Kudu 1.11 client library will run without modification against the Kudu 1.10 client library.

  • The Kudu 1.11 Python client is API-compatible with Kudu 1.10. Applications written against Kudu 1.10 will continue to run against the Kudu 1.11 client and vice-versa.

Known Issues and Limitations

Please refer to the Known Issues and Limitations section of the documentation.

Contributors

Kudu 1.11 includes contributions from 24 people, including 8 first-time contributors:

  • Hannah Nguyen

  • lingbin

  • Ritwik Yadav

  • Scott Reynolds

  • Volodymyr Verovkin

  • Xiaokai Wang

  • Xin He

  • Yao Wang

Thank you for your help in making Kudu even better!

Installation Options

For full installation details, see Kudu Installation.