What is the default algorithm/implementation for the "hadoop credential create" command?

Contributor

You can create a password alias using the following example hadoop command:

hadoop credential create pwalias -provider jceks://hdfs/tmp/test.jceks

It prompts you to enter a password and stores it in the provider location.

Question: What is the default encryption or hashing algorithm it uses to store the password? I looked but could not find any documentation for this, or for the default credential provider implementation used by Hadoop. If you know, could you please share any information or point me in the right direction? This is needed for a security audit.

1 ACCEPTED SOLUTION

Expert Contributor

The JCEKS credential provider leverages the Sun/Oracle proprietary keystore format to protect the credentials within the store. The proprietary algorithm used by Sun/Oracle is based on password-based encryption but uses 3-key triple DES (instead of DES) in CBC mode with PKCS #5 padding. This has an effective cryptographic strength of 112 bits, although the key is 168 bits plus 24 parity bits, for a total of 192 bits.
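To make the key-size arithmetic above concrete, here is a small Java sketch. The class and method names are mine and purely illustrative; the only JDK piece used is the `DESedeKeySpec.DES_EDE_KEY_LEN` constant, which gives the encoded triple DES key length in bytes. The 112-bit figure reflects the usual meet-in-the-middle argument, which reduces three 56-bit keys to roughly two keys' worth of work.

```java
import javax.crypto.spec.DESedeKeySpec;

public class TripleDesKeySize {
    // Illustrative constants: a single DES key has 56 key bits and 8 parity bits.
    static final int DES_KEYS = 3;
    static final int KEY_BITS_PER_DES_KEY = 56;
    static final int PARITY_BITS_PER_DES_KEY = 8;

    // 3 x 56 = 168 bits of actual key material.
    static int keyMaterialBits() {
        return DES_KEYS * KEY_BITS_PER_DES_KEY;
    }

    // 3 x 8 = 24 parity bits.
    static int parityBits() {
        return DES_KEYS * PARITY_BITS_PER_DES_KEY;
    }

    // 168 + 24 = 192 bits in the encoded key.
    static int encodedBits() {
        return keyMaterialBits() + parityBits();
    }

    // Meet-in-the-middle leaves roughly two keys' worth of strength: 2 x 56 = 112.
    static int effectiveStrengthBits() {
        return 2 * KEY_BITS_PER_DES_KEY;
    }

    // The JDK's own encoded key length for triple DES, converted to bits.
    static int jdkEncodedBits() {
        return DESedeKeySpec.DES_EDE_KEY_LEN * 8;
    }

    public static void main(String[] args) {
        System.out.println("key material bits: " + keyMaterialBits());
        System.out.println("parity bits:       " + parityBits());
        System.out.println("encoded bits:      " + encodedBits());
        System.out.println("JDK encoded bits:  " + jdkEncodedBits());
        System.out.println("effective bits:    " + effectiveStrengthBits());
    }
}
```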

This key (and the initialization vector) is derived from a password using a proprietary MD5-based algorithm. Normally, deriving the initialization vector from the key would defeat the purpose, but each entry also has a unique salt for key derivation. This means that the derived key and initialization vector are unique to each entry.
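The effect of the per-entry salt can be illustrated with a short Java sketch. It assumes that `PBEWithMD5AndTripleDES`, the SunJCE algorithm name, corresponds to the scheme described above; the salts, iteration count, and class name are illustrative values of mine, not what the keystore itself uses. The same password and plaintext produce different ciphertext once the salt changes, because the derived key and IV change with it.

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

public class SaltedPbeDemo {
    // Assumption: this SunJCE algorithm name matches the scheme described in the post.
    static final String ALGO = "PBEWithMD5AndTripleDES";
    // Illustrative iteration count, not the value the keystore uses.
    static final int ITERATIONS = 1000;

    // Encrypts or decrypts with a key and IV derived from (password, salt).
    static byte[] crypt(int mode, char[] password, byte[] salt, byte[] input) {
        try {
            SecretKey key = SecretKeyFactory.getInstance(ALGO)
                    .generateSecret(new PBEKeySpec(password));
            Cipher cipher = Cipher.getInstance(ALGO);
            cipher.init(mode, key, new PBEParameterSpec(salt, ITERATIONS));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char[] password = "none".toCharArray();
        byte[] plaintext = "s3cret".getBytes(StandardCharsets.UTF_8);
        byte[] saltA = {1, 2, 3, 4, 5, 6, 7, 8};   // 8-byte salts, fixed here for repeatability
        byte[] saltB = {8, 7, 6, 5, 4, 3, 2, 1};

        byte[] a = crypt(Cipher.ENCRYPT_MODE, password, saltA, plaintext);
        byte[] b = crypt(Cipher.ENCRYPT_MODE, password, saltB, plaintext);

        System.out.println("ciphertexts equal: " + Arrays.equals(a, b));
        byte[] back = crypt(Cipher.DECRYPT_MODE, password, saltA, a);
        System.out.println("round trip ok:     " + Arrays.equals(back, plaintext));
    }
}
```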

The JCEKS provider does require a password for the keystore. There are three ways to specify this password:

1. Environment variable

2. Password file with file permissions - location specified within configuration

3. Default password of "none" with keystore protected with file permissions
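The three options above can be read as a lookup order. The following Java sketch is a simplified illustration of that order, not Hadoop's actual code. The class and method names are hypothetical, and the environment variable name `HADOOP_CREDSTORE_PASSWORD` used in `main` is what I recall from the Hadoop documentation rather than something stated in this thread, so treat it as an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class KeystorePasswordResolver {
    // Option 3: the hardcoded default mentioned in the list above.
    static final String DEFAULT_PASSWORD = "none";

    // Resolves the keystore password: environment value, then password file, then default.
    static char[] resolve(String envValue, Path passwordFile) {
        // Option 1: a value taken from an environment variable.
        if (envValue != null && !envValue.isEmpty()) {
            return envValue.toCharArray();
        }
        // Option 2: a password file whose location would come from configuration.
        if (passwordFile != null && Files.isReadable(passwordFile)) {
            try {
                String content = new String(Files.readAllBytes(passwordFile), StandardCharsets.UTF_8);
                return content.trim().toCharArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Option 3: fall back to the default and rely on file permissions on the keystore.
        return DEFAULT_PASSWORD.toCharArray();
    }

    public static void main(String[] args) {
        // Assumed variable name; no password file is configured in this example.
        char[] password = resolve(System.getenv("HADOOP_CREDSTORE_PASSWORD"), null);
        System.out.println("resolved a password of length " + password.length);
    }
}
```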

The first two are perfectly viable, but each requires that both the credential administrator and the runtime consumer of the credential have access to the same keystore password. Arranging this is usually non-trivial. In addition, #2 depends solely on file permissions, because the keystore password sits in clear text in the password file.

#3 uses a hardcoded password, which means the password is available to both administrator and consumer, and the credential itself is not stored in clear text. Combined with appropriate file permissions, this approach is arguably the best for using the JCEKS provider.
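For auditors who want to see what option #3 looks like at the JDK level, here is a minimal sketch that stores a credential in an in-memory JCEKS keystore under the default password "none" and reads it back. It uses only the standard `java.security.KeyStore` API. Wrapping the credential bytes in a `SecretKeySpec` labelled "AES" is my assumption about how such a value can be held in a keystore entry, and the class and method names are illustrative. File permissions are out of scope here because nothing is written to disk.

```java
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyStore;

public class JceksRoundTrip {
    // Creates a JCEKS keystore holding one credential and returns its serialized bytes.
    static byte[] createStore(String alias, String credential, char[] storePassword) {
        try {
            KeyStore ks = KeyStore.getInstance("JCEKS");
            ks.load(null, storePassword); // start from an empty keystore
            // Assumption: the credential is wrapped as a secret key entry.
            SecretKeySpec entry = new SecretKeySpec(
                    credential.getBytes(StandardCharsets.UTF_8), "AES");
            ks.setKeyEntry(alias, entry, storePassword, null);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ks.store(out, storePassword);
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Loads the serialized keystore and recovers the credential stored under the alias.
    static String readCredential(byte[] storeBytes, String alias, char[] storePassword) {
        try {
            KeyStore ks = KeyStore.getInstance("JCEKS");
            ks.load(new ByteArrayInputStream(storeBytes), storePassword);
            byte[] raw = ks.getKey(alias, storePassword).getEncoded();
            return new String(raw, StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        char[] password = "none".toCharArray();
        byte[] store = createStore("pwalias", "s3cr3t-value", password);
        System.out.println("keystore size in bytes: " + store.length);
        System.out.println("recovered credential matches: "
                + "s3cr3t-value".equals(readCredential(store, "pwalias", password)));
    }
}
```

Anyone who can read the keystore file can run the equivalent of `readCredential` with the default password, which is why the file permissions carry the real protection in this setup.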

It is important to understand that the Credential Provider API is pluggable, and other providers can be implemented to offer a more secure approach. For instance, a credential server that authenticated the requesting user instead of requiring a password would be a good way to remove the keystore password issue.

Incidentally, there are Apache docs for this which will be published once Hadoop 2.8.3 and later are released.


2 REPLIES

Contributor

Thank you @lmccay for the response and detailed explanation. Highly appreciated.