Note: This is a cross-post from the Cloudera Engineering Blog post "Testing Apache Kudu Applications on the JVM".
Although the Kudu server is written in C++ for performance and efficiency, developers can write client applications in C++, Java, or Python. To make it easier for Java developers to create reliable client applications, we’ve added new utilities in Kudu 1.9.0 that allow you to write tests using a Kudu cluster without needing to build Kudu yourself, without any knowledge of C++, and without any complicated coordination around starting and stopping Kudu clusters for each test. This post describes how the new testing utilities work and how you can use them in your application tests.
Note: It is possible this blog post could become outdated – for the latest documentation on using the JVM testing utilities see the Kudu documentation.
In order to use the new testing utilities, the following requirements must be met:
- macOS El Capitan (10.11) or later
- CentOS 6.6+, Ubuntu 14.04+, or another recent distribution of Linux supported by Kudu
- Java 8+ (Java 7 is deprecated, but still supported)
- A build tool such as Maven or Gradle
In order to use the Kudu testing utilities, add two dependencies to your classpath: kudu-test-utils and kudu-binary.
The kudu-test-utils dependency has useful utilities for testing applications that use Kudu.
Primarily, it provides the KuduTestHarness class
to manage the lifecycle of a Kudu cluster for each test. The
KuduTestHarness is a JUnit Rule
that not only starts and stops a Kudu cluster for each test, but also has methods to manage the
cluster and get pre-configured
KuduClient instances for use while testing.
The kudu-binary dependency contains the native Kudu (server and command-line tool) binaries for
the specified operating system. To download the right artifact for the running operating
system, it is easiest to use a plugin, such as the
osdetector-gradle-plugin, to detect the
current runtime environment. The
KuduTestHarness will automatically find and use the
kudu-binary jar on the classpath.
Warning: The kudu-binary module should only be used to run Kudu for integration testing purposes.
It should never be used to run an actual Kudu service, in production or development, because the
kudu-binary module includes native security-related dependencies that have been copied from the
build system and will not be patched when the operating system on the runtime host is patched.
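If you are using Maven to build your project, add kudu-test-utils and kudu-binary as test-scoped dependencies in your project’s pom.xml, and use a plugin such as os-maven-plugin to select the kudu-binary classifier for your operating system.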
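The Maven setup can be sketched as the following pom.xml fragment. This is an illustrative configuration rather than a copy of the official snippet: the 1.9.0 version matches the release discussed in this post, while the os-maven-plugin version is an assumption that you should check against the current Kudu documentation.

```xml
<build>
  <extensions>
    <!-- Sets the ${os.detected.classifier} property used below.
         The plugin version here is an assumption; use a current release. -->
    <extension>
      <groupId>kr.motd.maven</groupId>
      <artifactId>os-maven-plugin</artifactId>
      <version>1.6.2</version>
    </extension>
  </extensions>
</build>

<dependencies>
  <!-- Test utilities, including KuduTestHarness. -->
  <dependency>
    <groupId>org.apache.kudu</groupId>
    <artifactId>kudu-test-utils</artifactId>
    <version>1.9.0</version>
    <scope>test</scope>
  </dependency>
  <!-- Native Kudu binaries for the detected operating system. -->
  <dependency>
    <groupId>org.apache.kudu</groupId>
    <artifactId>kudu-binary</artifactId>
    <version>1.9.0</version>
    <classifier>${os.detected.classifier}</classifier>
    <scope>test</scope>
  </dependency>
</dependencies>
```

Both dependencies use the test scope so that the native binaries never end up on the runtime classpath of the application itself.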
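If you are using Gradle to build your project, add kudu-test-utils and kudu-binary as test dependencies in your project’s build.gradle, using the osdetector-gradle-plugin to select the kudu-binary classifier for your operating system.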
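A corresponding build.gradle fragment might look like the sketch below. The plugin version is an assumption, and the configuration name depends on your Gradle version: recent releases use testImplementation, while older ones use testCompile.

```groovy
plugins {
  // Provides the osdetector.classifier property used below.
  // The plugin version here is an assumption; use a current release.
  id "com.google.osdetector" version "1.6.2"
}

dependencies {
  // Test utilities, including KuduTestHarness.
  testImplementation "org.apache.kudu:kudu-test-utils:1.9.0"
  // Native Kudu binaries for the detected operating system.
  testImplementation "org.apache.kudu:kudu-binary:1.9.0:${osdetector.classifier}"
}
```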
Once your project is configured correctly, you can start writing tests using the
kudu-test-utils and kudu-binary artifacts. One line of code, a KuduTestHarness field annotated
with JUnit's @Rule, will ensure that each test automatically starts and
stops a real Kudu cluster and that cluster logging is output through slf4j.
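As a minimal sketch, the rule declaration looks like this. It assumes JUnit 4 plus the kudu-test-utils and kudu-binary artifacts on the test classpath; the class and test method names are hypothetical.

```java
import static org.junit.Assert.assertNotNull;

import org.apache.kudu.test.KuduTestHarness;
import org.junit.Rule;
import org.junit.Test;

// Hypothetical test class name, used for illustration only.
public class MyKuduTest {

  // Starts a Kudu mini cluster before each test method and stops it afterwards.
  @Rule
  public KuduTestHarness harness = new KuduTestHarness();

  @Test
  public void clusterIsAvailable() throws Exception {
    // The harness hands out a client already pointed at the mini cluster.
    assertNotNull(harness.getClient());
  }
}
```

Because the harness is a JUnit rule rather than a base class, it can be added to existing test classes without changing their inheritance hierarchy.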
The KuduTestHarness has methods to get pre-configured clients (such as getClient and getAsyncClient), start and stop servers (such as killAllTabletServers and startAllTabletServers), and more.
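The following test sketches a few of these capabilities: it obtains a client from the harness, creates a table, writes ten rows, and reads them back. It is an illustration rather than the project's canonical example. The class name, table name, and column names are hypothetical, and it assumes JUnit 4 with kudu-test-utils and kudu-binary on the test classpath.

```java
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.Collections;

import org.apache.kudu.ColumnSchema;
import org.apache.kudu.Schema;
import org.apache.kudu.Type;
import org.apache.kudu.client.CreateTableOptions;
import org.apache.kudu.client.Insert;
import org.apache.kudu.client.KuduClient;
import org.apache.kudu.client.KuduScanner;
import org.apache.kudu.client.KuduSession;
import org.apache.kudu.client.KuduTable;
import org.apache.kudu.client.PartialRow;
import org.apache.kudu.test.KuduTestHarness;
import org.junit.Rule;
import org.junit.Test;

// Hypothetical test class name, used for illustration only.
public class KuduTableRoundTripTest {

  @Rule
  public KuduTestHarness harness = new KuduTestHarness();

  @Test
  public void writesAndReadsRows() throws Exception {
    // Get a KuduClient configured to talk to the running mini cluster.
    KuduClient client = harness.getClient();

    // Create a table with an INT32 key column and a STRING value column.
    String tableName = "myTable"; // hypothetical table name
    Schema schema = new Schema(Arrays.asList(
        new ColumnSchema.ColumnSchemaBuilder("key", Type.INT32).key(true).build(),
        new ColumnSchema.ColumnSchemaBuilder("value", Type.STRING).build()));
    CreateTableOptions opts = new CreateTableOptions()
        .setRangePartitionColumns(Collections.singletonList("key"));
    client.createTable(tableName, schema, opts);
    KuduTable table = client.openTable(tableName);

    // Write a few rows using the session's default flush mode.
    KuduSession session = client.newSession();
    for (int i = 0; i < 10; i++) {
      Insert insert = table.newInsert();
      PartialRow row = insert.getRow();
      row.addInt("key", i);
      row.addString("value", String.valueOf(i));
      session.apply(insert);
    }
    session.close();

    // Scan the table and count the rows that come back.
    KuduScanner scanner = client.newScannerBuilder(table).build();
    int rowCount = 0;
    while (scanner.hasMoreRows()) {
      rowCount += scanner.nextRows().getNumRows();
    }
    assertEquals(10, rowCount);

    // The harness can also stop and restart servers, for example to
    // exercise how an application behaves across a tablet server restart.
    harness.killAllTabletServers();
    harness.startAllTabletServers();
  }
}
```

The restart calls are placed at the end so that the assertions above do not depend on how quickly the cluster recovers; a real fault-tolerance test would follow them with further reads or writes.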
For a complete example of a project using the
KuduTestHarness, see the
java-example project in
the Kudu source code repository. The Kudu project itself uses the
KuduTestHarness for all of its
own integration tests. For more complex examples, you can explore the various
tests in the Kudu source code repository.
Kudu 1.9.0 is the first release to have these testing utilities available. Although these utilities simplify testing of Kudu applications, there is always room for improvement. Please report any issues, ideas, or feedback to the Kudu user mailing list, Jira, or Slack channel and we will try to incorporate your feedback quickly. See the Kudu community page for details.
We would like to give a special thank you to everyone who helped contribute to the
kudu-binary artifacts. We would especially like to thank
Brian McDevitt at phData and
Tim Robertson at GBIF who helped us