Created 03-22-2017 09:53 AM
Scenario as below
1) Have two tables A and B , having the column Account_Number.
2) If we did the join it should run fine and give result with Account_Number.
3) However this Account_number to be masked in out put ( its critical data ).
4) we have get out put, Account_number should not display all data, its should be like XXXXXX12345.
Solition 1: In Ranger we can disable the Account_number colum name, if we did it join objeration will fail.
Join has to run on Account_number only and this data need to masked ?
Is it possible to do.
Your concerns are appriciated.
Created 03-23-2017 06:22 PM
@zkfs, mask_show_last_n udf can be used for the scenario you mentioned.
You can run this query from the beeline to get additional details about udf:
DESCRIBE FUNCTION extended mask_show_last_n;
+---------------------------------------------------------------------------------------------------------------------------------+--+
| tab_name |
+---------------------------------------------------------------------------------------------------------------------------------+--+
| masks all but last n characters of the value |
| Examples: |
| mask_show_last_n(ccn, 😎 |
| mask_show_last_n(ccn, 8, 'x', 'x', 'x') |
| Arguments: |
| mask_show_last_n(value, charCount, upperChar, lowerChar, digitChar, otherChar, numberChar) |
| value - value to mask. Supported types: TINYINT, SMALLINT, INT, BIGINT, STRING, VARCHAR, CHAR |
| charCount - number of characters. Default value: 4 |
| upperChar - character to replace upper-case characters with. Specify -1 to retain original character. Default value: 'X' |
| lowerChar - character to replace lower-case characters with. Specify -1 to retain original character. Default value: 'x' |
| digitChar - character to replace digit characters with. Specify -1 to retain original character. Default value: 'n' |
| otherChar - character to replace all other characters with. Specify -1 to retain original character. Default value: -1 |
| numberChar - character to replace digits in a number with. Valid values: 0-9. Default value: '1' |
| NULL |
+---------------------------------------------------------------------------------------------------------------------------------+--+
Following query returns string with all the characters masked except last 4:
select mask_show_last_n('AAbb1234567', 4, 'X', 'X', 'X'); +--------------+--+ | _c0 | +--------------+--+ | XXXXXXX4567 | +--------------+--+ 1 row selected (0.085 seconds
Created 03-22-2017 05:47 PM
You can use data masking udfs instead of disabling the column.
You can find more details from here:
Created 03-23-2017 09:23 AM
thanks for your reply.
Have you implemented the UDF for this Masking the columnn, if yes please share sample code..
Thank you
Created 03-23-2017 06:22 PM
@zkfs, mask_show_last_n udf can be used for the scenario you mentioned.
You can run this query from the beeline to get additional details about udf:
DESCRIBE FUNCTION extended mask_show_last_n;
+---------------------------------------------------------------------------------------------------------------------------------+--+
| tab_name |
+---------------------------------------------------------------------------------------------------------------------------------+--+
| masks all but last n characters of the value |
| Examples: |
| mask_show_last_n(ccn, 😎 |
| mask_show_last_n(ccn, 8, 'x', 'x', 'x') |
| Arguments: |
| mask_show_last_n(value, charCount, upperChar, lowerChar, digitChar, otherChar, numberChar) |
| value - value to mask. Supported types: TINYINT, SMALLINT, INT, BIGINT, STRING, VARCHAR, CHAR |
| charCount - number of characters. Default value: 4 |
| upperChar - character to replace upper-case characters with. Specify -1 to retain original character. Default value: 'X' |
| lowerChar - character to replace lower-case characters with. Specify -1 to retain original character. Default value: 'x' |
| digitChar - character to replace digit characters with. Specify -1 to retain original character. Default value: 'n' |
| otherChar - character to replace all other characters with. Specify -1 to retain original character. Default value: -1 |
| numberChar - character to replace digits in a number with. Valid values: 0-9. Default value: '1' |
| NULL |
+---------------------------------------------------------------------------------------------------------------------------------+--+
Following query returns string with all the characters masked except last 4:
select mask_show_last_n('AAbb1234567', 4, 'X', 'X', 'X'); +--------------+--+ | _c0 | +--------------+--+ | XXXXXXX4567 | +--------------+--+ 1 row selected (0.085 seconds