
Is the query result redaction broken in CDH 5.15?


According to this document, query results can be redacted; however, I don't see that happening. Is there a known bug?
https://www.hadoopandcloud.com/hadoop/enableconfigure-log-and-query-redaction/

 

0: jdbc:hive2://calingita.pheonix.co> select * from pheonix_global.redaction;
INFO : Compiling command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103): select * from pheonix_global.redaction;
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:redaction.id, type:int, comment:null), FieldSchema(name:redaction.email, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103); Time taken: 0.322 seconds
INFO : Executing command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103): select * from pheonix_global.redaction
INFO : Completed executing command(queryId=hive_20201005072828_53b14d1f-5f0a-4ce6-894d-debfeade3103); Time taken: 0.006 seconds
INFO : OK
+--------------------+-----------------------+--+
| redaction.id | redaction.email |
+--------------------+-----------------------+--+
| 2 | xyz@liv.in |
| 1 | xyz@liv.in |
+--------------------+-----------------------+--+
2 rows selected (1.075 seconds)

1 ACCEPTED SOLUTION


@saurabh707 

 

You are expecting the query results to be redacted, which does not happen in CDH. Let me explain with a simple example.

0: jdbc:hive2://host-10-17-102-168.coe.cloude> select * from redaction_test where email='tushark@gmail.com';
INFO  : Completed compiling command(queryId=hive_20201007212727_0b73be29-c01b-4d47-a9f3-fef9cf470e53); Time taken: 0.214 seconds
INFO  : Executing command(queryId=hive_20201007212727_0b73be29-c01b-4d47-a9f3-fef9cf470e53): select * from redaction_test where email='email@redacted.host'   ----> Look here, the data is redacted
+--------------------+-----------------------+--+
| redaction_test.id  | redaction_test.email  |
+--------------------+-----------------------+--+
| 1                  | tushark@gmail.com     |   ----> The results will not be redacted
+--------------------+-----------------------+--+

If you are expecting the results to be redacted, that will not happen in CDH. In the logs, however, the sensitive data is redacted, as pointed out above. The blog post below covers this:
https://blog.cloudera.com/new-in-cdh-5-4-sensitive-data-redaction/
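To make the distinction concrete, here is a rough Python sketch of the idea, not CDH's actual implementation: a search-and-replace rule is applied to the text that gets written to the log, while the rows returned to the client never pass through it. The regular expression, the `redact_log_line` helper, and the shortened log prefix are all assumptions made up for this illustration; only the replacement string `email@redacted.host` is taken from the log line above.

```python
import re

# Hypothetical redaction rule. The pattern is an assumption for this sketch,
# not the rule configured on any real cluster; the replacement string is the
# one visible in the log output above.
EMAIL_RULE = (
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "email@redacted.host",
)


def redact_log_line(line, rules=(EMAIL_RULE,)):
    """Apply each (pattern, replacement) rule to a line of log text."""
    for pattern, replacement in rules:
        line = pattern.sub(replacement, line)
    return line


query = "select * from redaction_test where email='tushark@gmail.com'"

# The query text is redacted only on its way into the log.
log_line = redact_log_line("INFO  : Executing command: " + query)

# The result rows are returned to the client as stored; no rule touches them.
result_rows = [(1, "tushark@gmail.com")]

print(log_line)
print(result_rows)
```

The point of the sketch is only that redaction sits on the logging path, so the same email address can be masked in the log and still appear in the result set.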

However, if you want the results themselves to be redacted, that is possible in CDP, where you can redact query results with the help of Ranger. Check out more on this here:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.3/security-how-to-guides/topics/cm-security-red...
