
Encryption of data during Sqoop load




I work for a company that runs a CDH 5.16 Express cluster. One of our analysts wants to take a MySQL table (CRMID, Email) and load it into HDFS. During the load we need to encrypt (in practice, hash) the email addresses with a known salt, so that the Impala table looks like: CRMID, EmailHash. This is for GDPR.
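
To make the requirement concrete, here is a minimal sketch of the transformation we have in mind, written in Python purely for illustration. The choice of SHA-256, prepending the salt to the address, the function name `hash_email`, and the sample rows and salt are all assumptions on my part, not something we have settled on.

```python
import hashlib


def hash_email(email: str, salt: str) -> str:
    """Return the hex SHA-256 digest of salt + email (assumed scheme)."""
    return hashlib.sha256((salt + email).encode("utf-8")).hexdigest()


# Hypothetical source rows from the MySQL table: (CRMID, Email)
rows = [(1, "alice@example.com"), (2, "bob@example.com")]

# Placeholder only; the real salt would be read from a secure store
salt = "example-salt"

# Target rows for the Impala table: (CRMID, EmailHash)
hashed_rows = [(crm_id, hash_email(email, salt)) for crm_id, email in rows]
```

Because the salt is fixed, the same address always produces the same EmailHash, which is what would let us join on it later without storing the address itself.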


If we stored the salt via the Hadoop Credential Provider API, would Sqoop be able to encrypt the email addresses during the import? If not, does the Hadoop ecosystem offer another way to do this?


If not, we're going to have to import the data via our SQL Server estate, which adds a dependency I'd rather avoid.