How do you achieve high availability in HDFS when Ranger KMS is down or the metastore is down?

1 ACCEPTED SOLUTION

Re: How do you achieve high availability in HDFS when Ranger KMS is down or the metastore is down?

Scenario 1: The Ranger KMS DB is down but the node is up

  1. The keys are cached for a time, so you can still read the data in the encrypted folder. HDFS has knowledge of the encryption zone key.
  2. I assume that the Ranger KMS service is still up while the DB/metastore is down.
  3. If you know the database cannot be recovered and you don't have a backup of the keystore, immediately begin removing the encryption zone.
  4. Log in as an authorized user, or as hdfs, copy the files to an unencrypted area, and then remove the encryption zone.
  5. I just tested this on my cluster.
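
As a rough illustration of steps 3 and 4, here is a minimal Python sketch that only builds and prints the commands one might run while the cached keys are still usable. The paths /secure/zone and /plain/rescue are made-up placeholders, and treating "remove the encryption zone" as deleting the zone's root directory with -skipTrash is my assumption, not something confirmed in the answer above. Nothing is executed, so review the printed commands before running any of them as an authorized user.

```python
import shlex
from typing import List


def evacuation_commands(zone: str, plain_dir: str) -> List[List[str]]:
    """Build (but do not run) the commands for moving data out of an encryption zone.

    `zone` and `plain_dir` are hypothetical HDFS paths supplied by the caller.
    """
    return [
        # Confirm which directories are encryption zones and which key each uses.
        ["hdfs", "crypto", "-listZones"],
        # Create the unencrypted destination directory.
        ["hdfs", "dfs", "-mkdir", "-p", plain_dir],
        # Copy while the cached keys still allow reads; data is decrypted on read.
        ["hdfs", "dfs", "-cp", zone, plain_dir],
        # Assumption: the zone is removed by deleting its root directory.
        ["hdfs", "dfs", "-rm", "-r", "-skipTrash", zone],
    ]


if __name__ == "__main__":
    for argv in evacuation_commands("/secure/zone", "/plain/rescue"):
        print(shlex.join(argv))
```

Verify the copy (file counts and sizes, for example) before running the final delete, because once the zone and the key are both gone there is no way back.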

Scenario 2: The entire node is down. This means BOTH the Ranger KMS DB and the Ranger KMS service are down.

  1. The encryption zone key is stored in the Ranger KMS DB (metastore), and you can also export it and save it to a file.
  2. You should back up the Ranger KMS DB and also make it highly available.
  3. Once you export to a keystore file, back up that file.
  4. If the cluster node goes down, restore the Ranger KMS DB from backup.
  5. If you cannot restore the Ranger KMS DB from backup, create a completely new Ranger KMS DB, get the backup keystore file, and as a special user run a script to import the key into the newly created database.
  6. You can then associate the encryption zone folder with the key again using HDFS commands.
  7. If you have NEITHER the keystore file NOR the Ranger KMS DB to restore, you have no options left. The files remain encrypted.
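
The decision logic in the steps above can be summarized in a small Python sketch. The function name and the step wording are mine, and placing the re-association step only on the keystore path is my reading of the list, so treat this as an illustration rather than an official procedure. The two verification commands (hadoop key list -metadata and hdfs crypto -listZones) are standard Hadoop CLI commands that I would run afterwards to check that the key and the zone are visible again.

```python
from typing import List

# Commands one could run after recovery to confirm the key and zone are visible.
VERIFY_COMMANDS = [
    ["hadoop", "key", "list", "-metadata"],
    ["hdfs", "crypto", "-listZones"],
]


def recovery_plan(has_db_backup: bool, has_keystore_backup: bool) -> List[str]:
    """Return the recovery steps for Scenario 2, or an empty list if none exist."""
    if has_db_backup:
        # Preferred path: the restored database already contains the keys.
        return ["restore the Ranger KMS DB from backup"]
    if has_keystore_backup:
        # Fallback path: rebuild the database and import the exported keys.
        return [
            "create a new Ranger KMS DB",
            "import the keys from the backup keystore file as a special user",
            "associate the encryption zone folder with the key again",
        ]
    # Neither backup exists: the files remain encrypted.
    return []


if __name__ == "__main__":
    for step in recovery_plan(has_db_backup=False, has_keystore_backup=True):
        print(step)
```

An empty result corresponds to step 7: with neither backup there is nothing to recover from, which is why backing up both the DB and the exported keystore file matters.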

See this article for a script to export and import keys:

https://community.hortonworks.com/articles/51909/how-to-copy-encrypted-data-between-two-hdp-cluster....

Re: How do you achieve high availability in HDFS when Ranger KMS is down or the metastore is down?

Thanks my friend!