Created 10-06-2020 07:27 AM
According to this document, it seems that query results can be redacted; however, I don't see that happening. Is there a known bug?
https://www.hadoopandcloud.com/hadoop/enableconfigure-log-and-query-redaction/
0: jdbc:hive2://calingita.pheonix.co> select * from pheonix_global.redaction;
INFO : Compiling command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103): select * from pheonix_global.redaction;
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:redaction.id, type:int, comment:null), FieldSchema(name:redaction.email, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103); Time taken: 0.322 seconds
INFO : Executing command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103): select * from pheonix_global.redaction
INFO : Completed executing command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103); Time taken: 0.006 seconds
INFO : OK
+--------------------+-----------------------+--+
| redaction.id | redaction.email |
+--------------------+-----------------------+--+
| 2 | xyz@liv.in |
| 1 | xyz@liv.in |
+--------------------+-----------------------+--+
2 rows selected (1.075 seconds)
Created 10-07-2020 09:34 PM
You are expecting the query results to be redacted, which will not happen in CDH. Let me explain with a simple example.
0: jdbc:hive2://host-10-17-102-168.coe.cloude> select * from redaction_test where email='tushark@gmail.com';
INFO : Completed compiling command(queryId=hive_20201007212727_0b73be29-c01b-4d47-a9f3-fef9cf470e53); Time taken: 0.214 seconds
INFO : Executing command(queryId=hive_20201007212727_0b73be29-c01b-4d47-a9f3-fef9cf470e53): select * from redaction_test where email='email@redacted.host' <---- Look here: the email in the logged query is redacted
+--------------------+-----------------------+--+
| redaction_test.id | redaction_test.email |
+--------------------+-----------------------+--+
| 1 | tushark@gmail.com | <---- The results are not redacted
+--------------------+-----------------------+--+
If you are expecting the results to be redacted, that will not happen in CDH. However, if you look at the logs, the sensitive data is redacted, as pointed out above. You can read the blog post below for details.
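To illustrate the difference, here is a minimal Python sketch of the idea, not CDH's actual implementation. It assumes a single hypothetical regex rule that swaps any email address for the placeholder seen in the log line above; the rule, the function name, and the sample row are all made up for illustration. The point is that the rule is applied only to the text written to the log, while the rows returned to the client are left alone.

```python
import re

# Hypothetical redaction rule, modelled on the log line above:
# any email address is replaced by a fixed placeholder.
EMAIL_RULE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PLACEHOLDER = "email@redacted.host"


def redact_log_line(line: str) -> str:
    """Apply the redaction rule to text that is about to be logged."""
    return EMAIL_RULE.sub(PLACEHOLDER, line)


query = "select * from redaction_test where email='tushark@gmail.com'"

# The logged copy of the query goes through the redaction rule.
logged = redact_log_line("INFO : Executing command: " + query)

# The result rows (sample data) never pass through the rule,
# so the client still sees the real value.
rows = [(1, "tushark@gmail.com")]

print(logged)
print(rows)
```

In this sketch only `logged` changes; `rows` is returned as stored, which matches what you see in Beeline.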
https://blog.cloudera.com/new-in-cdh-5-4-sensitive-data-redaction/
However, if you want the query results themselves to be redacted, that is possible in CDP, where you can redact query results with the help of Ranger. Check out more on this here:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.3/security-how-to-guides/topics/cm-security-red...