I am able to successfully put a file in a non-encrypted zone.
When I try to put a file into an encrypted zone, I see the error below. The file, however, is still copied to the encrypted zone.
desind@xxxx:~#> hdfs dfs -put users.txt /ccnd/test
18/02/01 06:54:19 WARN kms.LoadBalancingKMSClientProvider: KMS provider at [https://xxxx.com:16000/kms/v1/] threw an IOException [Key retrieval failed.]!!
Caused by: java.lang.NullPointerException: No KeyVersion exists for key 'testTLS1'
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:231)
at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension$DefaultCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:294)
at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.generateEncryptedKey(KeyProviderCryptoExtension.java:511)
at org.apache.hadoop.crypto.key.kms.server.EagerKeyGeneratorKeyProviderCryptoExtension$CryptoExtension$EncryptedQueueRefiller.fillQueueForKey(EagerKeyGeneratorKeyProviderCryptoExtension.java:76)
at org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:246)
at org.apache.hadoop.crypto.key.kms.ValueQueue$1.load(ValueQueue.java:240)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
... 54 more
2018-02-02 09:52:50,353 WARN org.apache.hadoop.crypto.key.kms.server.KMS: User hdfs/xxxx.com@VSP.SAS.COM (auth:KERBEROS) request GET https://xxxx.com:16000/kms/v1/key/testTLS1/_eek?num_keys=150&eek_op=generate caused exception.
Can someone advise where to check?
We have Kerberos and SSL enabled in the cluster.
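I assume the first thing worth verifying is the zone-to-key mapping itself, e.g. as the HDFS superuser (this should list /ccnd/test along with the key name backing it):

hdfs crypto -listZones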
Created 11-13-2018 06:36 AM
@bgooley any ideas where to start debug from?
Created on 11-15-2018 07:49 AM - edited 11-15-2018 07:51 AM
Hi,
From the post it is not clear which type of KMS is deployed on this cluster. As this is a community post, I will assume you are using the JCEKS-backed KMS. Please note that both the ASF and Cloudera recommend that this KMS implementation not be used in production deployments.
https://hadoop.apache.org/docs/current/hadoop-kms/index.html
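For reference, the JCEKS-backed KMS is the one where kms-site.xml points the key provider at a flat Java keystore file. The Hadoop default looks roughly like this (the path shown is the stock default; your deployment may differ):

<property>
  <name>hadoop.kms.key.provider.uri</name>
  <value>jceks://file@/${user.home}/kms.keystore</value>
</property>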
> Caused by: java.lang.NullPointerException: No KeyVersion exists for key 'testTLS1'
The error above occurs when a KMS instance makes a call to getKeyVersion. This call only happens when the KMS is actually attempting to retrieve a key from the backing key store, so when you see this error it quite literally means that the requested key cannot be found in that instance's backing key store.
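If you want to confirm this from a client, you can ask each KMS instance individually whether it knows the key. The hostnames below are placeholders for your two KMS hosts:

hadoop key list -metadata -provider kms://https@kmshost1:16000/kms
hadoop key list -metadata -provider kms://https@kmshost2:16000/kms

If testTLS1 shows up in one listing but not the other, the instance missing it is the one throwing the NullPointerException above.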
If you have more than one KMS instance, for example, it likely means the backing keystores of the individual instances are not in sync. The KMS core does not replicate key material between instances in any way at this time; n+1 capabilities are offered through other means, by other providers that use external mechanisms to synchronize this data.
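For context, the client-side failover behavior comes from listing multiple KMS hosts in a single provider URI. A sketch of what that looks like in the client configuration (hostnames are placeholders; depending on your Hadoop version the property is hadoop.security.key.provider.path or the older dfs.encryption.key.provider.uri):

<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://https@kmshost1;kmshost2:16000/kms</value>
</property>

The LoadBalancingKMSClientProvider in your stack trace is what iterates over those hosts. It balances and fails over between them, but nothing in that path copies key material from one backing store to the other.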
If, however, you believe you have performed a synchronization of the keystores on your own in a safe manner, it is possible that one or more backing keystores are corrupt. When this occurs, the KMS will attempt to automatically create a new JCE keystore and export the keys it can. Unfortunately, the KMS stores information in these keystores in a format that cannot be manipulated with keytool. If the automated recovery fails, then all data in the JCE keystore is lost, and by proxy all keys, and with them all encrypted data, are lost as well.
For the core JCE-backed KMS, the logging information will appear in /var/log/hadoop-kms. In most cases, to get a meaningful idea of what is wrong, you will need to review kms.log, kms-audit.log, and the Catalina logs. Note that on 6.x Tomcat has been replaced by Jetty, so the Catalina logs will not exist.
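For example, assuming the stock log location, something like the following run on each KMS host should show whether that instance is the one failing key retrieval:

grep "No KeyVersion exists" /var/log/hadoop-kms/kms.log
grep "testTLS1" /var/log/hadoop-kms/kms-audit.log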
The behavior you are seeing suggests that you have more than one KMS instance and that the failing request fails over to a working instance, which allows the write to occur. Actual data encryption and decryption happen on the DFS client. This means that for a read or write to occur, the client must have a key and the information required to open the read or write pipe.
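Depending on your Hadoop version, you can also ask the NameNode which key and key version a written file actually ended up with, which would confirm that the put was served by an instance that does hold the key (the path below is the file from your example):

hdfs crypto -getFileEncryptionInfo -path /ccnd/test/users.txt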
Created 11-15-2018 08:15 AM
Hi @lhebert
We are using the Cloudera licensed KMS. We manually synchronized the .keytrustee folder while setting up the KMS. Can we manually sync them again now? And how do we know which of the Active/Passive KMS instances is corrupted?
Created 11-15-2018 08:32 AM
If you are a licensed customer using Key Trustee, please open a case immediately. While we would like to help you with this on the community, parts of the diagnostic process on Key Trustee Server and its clients may expose sensitive information from your environment.
DO NOT arbitrarily attempt to sync the KMS client data again without diagnostics performed by Cloudera!
Syncing the client data again without working with us may result in unexpected data loss. There may be less risk if this is a POC, dev, or new environment, but at this point in time that is not visible to us.
When we are working with the Key Trustee KMS component, and not the Key Trustee Server, there are no Active or Passive delegations. All configured Key Management Server Proxies are used within the cluster.
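If you want to see exactly which proxies a client is configured to use, you can read the provider URI straight out of the client configuration. The property name varies between Hadoop versions (hadoop.security.key.provider.path on newer ones, dfs.encryption.key.provider.uri on older ones):

hdfs getconf -confKey hadoop.security.key.provider.path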
Created 11-15-2018 09:10 PM
@lhebert This is getting very interesting... I noticed that hadoop key list -provider kms://https@host[1-2]:16000/kms gives different results on each host.
After hadoop key create test-key, I cannot see test-key on KMS host1 if the request hit KMS host2.
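For anyone following along, this is roughly how I compared the two hosts; sorting and diffing the listings makes the drift obvious (host1/host2 stand in for the real names):

hadoop key list -provider kms://https@host1:16000/kms | sort > host1.keys
hadoop key list -provider kms://https@host2:16000/kms | sort > host2.keys
diff host1.keys host2.keys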
Created 11-16-2018 07:11 AM