Kudu can be configured to enforce secure authentication among servers, and
between clients and servers. Authentication prevents untrusted actors from
gaining access to Kudu, and securely identifies the connecting user or services
for authorization checks. Authentication in Kudu is designed to interoperate
with other secure Hadoop components by utilizing Kerberos.
Authentication can be configured on Kudu servers using the
--rpc-authentication flag, which can be set to
disabled. By default, the flag is set to
will reject connections from clients and servers who lack authentication
optional, Kudu will attempt to use strong authentication,
but will allow unauthenticated connections. When
disabled, Kudu will only
allow unauthenticated connections.
--rpc-authentication flag is set to
the cluster does not prevent access from unauthenticated users. To secure a
Kudu uses an internal PKI system to issue X.509 certificates to servers in
the cluster. Connections between peers who have both obtained certificates will
use TLS for authentication, which doesn’t require contacting the Kerberos KDC.
These certificates are only used for internal communication among Kudu
servers, and between Kudu clients and servers. The certificates are never
presented in a public facing protocol.
By using internally-issued certificates, Kudu offers strong authentication which
scales to huge clusters, and allows TLS encryption to be used without requiring
you to manually deploy certificates on every node.
After authenticating to a secure cluster, the Kudu client will automatically
request an authentication token from the Kudu master. An authentication token
encapsulates the identity of the authenticated user and carries the master’s
RSA signature so that its authenticity can be verified.
This token will be used to authenticate subsequent connections. By default,
authentication tokens are only valid for seven days, so that even if a token
were compromised, it could not be used indefinitely. For the most part,
authentication tokens should be completely transparent to users. By using
authentication tokens, Kudu takes advantage of strong authentication without
paying the scalability cost of communicating with a central authority for every
When used with distributed compute frameworks such as Spark, authentication
tokens can simplify configuration and improve security. For example, the Kudu
Spark connector will automatically retrieve an authentication token during the
planning stage, and distribute the token to tasks. This allows Spark to work
against a secured Kudu cluster where only the planner node has Kerberos