Created 06-28-2016 10:13 PM
Data comes from multiple sources and these are exposed in the hive table for the users. A specific column is sensitive and needs to be given restricted access. If a user who wants to join 2 such tables on the column that he does not have access to, then whats the best approach to make it work? One option is to link the sensitive column with a generated key so that the user can join on the generated key. Is this a good idea or any better idea?
Created 06-29-2016 12:36 PM
You could create a hashed column for that key and choose the hash algorithm in a way that a hash collision is very unlikely. However I don't get the usecase completely. Couldn't you just create a new View that already joins the two tables together and only give access to the resulting columns?
Table A ( customers web site a )
Name, Address, CreditCard
Table B ( customers web site b )
Name, Address, CreditCard
Create View C as
SELECT NameA,AddressA, NameB, AddressB from A,B where a.CreditCard = B.CreditCard;
And only give access to that view. I know this doesn't give the same flexibility but you do not need to do the whole hash thing.
If more flexibility is desired then your proposed approach of adding a masked column to both tables would be the way to go. I would think something like sha2 or the aes_encrypt function should provide a way to be very secure in avoiding hash collisions.
Created 06-29-2016 12:36 PM
You could create a hashed column for that key and choose the hash algorithm in a way that a hash collision is very unlikely. However I don't get the usecase completely. Couldn't you just create a new View that already joins the two tables together and only give access to the resulting columns?
Table A ( customers web site a )
Name, Address, CreditCard
Table B ( customers web site b )
Name, Address, CreditCard
Create View C as
SELECT NameA,AddressA, NameB, AddressB from A,B where a.CreditCard = B.CreditCard;
And only give access to that view. I know this doesn't give the same flexibility but you do not need to do the whole hash thing.
If more flexibility is desired then your proposed approach of adding a masked column to both tables would be the way to go. I would think something like sha2 or the aes_encrypt function should provide a way to be very secure in avoiding hash collisions.
Created 06-29-2016 01:00 PM
@Benjamin Leonhardi Thanks, makes sense