Hello Divya,
The purpose of Sensitive Data Redaction is to sanitize log files, query history, and any other activity records that are stored outside of the database. It is not applied to the actual data in your database.
When data redaction is enabled, the following data is redacted:
- Logs in HDFS and any dependent cluster services.
- Audit data sent to Cloudera Navigator
- SQL query strings displayed by Hue, Hive, and Impala.
For example, if you set the search rules to replace "\d{3}[^\w]\d{2}[^\w]\d{4}" with "XXX-XX-XXXX" and, as a user authorized to access the table "employees", you run the following query:
SELECT * FROM employees
WHERE ssn = '123-45-6789'
The query still returns the data you requested from the database, unredacted. Redaction is applied only to the query history in Hue, which saves the query as:
SELECT * FROM employees
WHERE ssn = 'XXX-XX-XXXX'
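If it helps to see what the rule does to the text, here is a rough Python sketch that applies the same search and replace strings to the query above. It only illustrates the pattern matching, not how Cloudera Manager applies the rule internally, and the function name redact is made up for this example.

```python
import re

# Search and replace strings taken from the example rule above.
SEARCH = r"\d{3}[^\w]\d{2}[^\w]\d{4}"
REPLACE = "XXX-XX-XXXX"


def redact(text):
    """Hypothetical helper: apply the example redaction rule to a string."""
    return re.sub(SEARCH, REPLACE, text)


query = "SELECT * FROM employees\nWHERE ssn = '123-45-6789'"
print(redact(query))
```

Note that [^\w] matches any single non-word character, so a value written as 123 45 6789 or 123.45.6789 would also be replaced with XXX-XX-XXXX.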
If your database itself contains sensitive information, you can protect it in one of two ways:
- ensure the permissions on the database allow access only to authorized users
or
- have the data sanitized before it is loaded into your database.
Please see the blog post at https://blog.cloudera.com/blog/2015/06/new-in-cdh-5-4-sensitive-data-redaction/ and the documentation at https://www.cloudera.com/documentation/enterprise/5-8-x/topics/sg_redaction.html for more information.
Thanks!
David Wilder, Community Manager