
Best practices for securing data ingestion through HDF?


I am planning to use HDF for a particular use case: ingesting a large number of flat files and some sensitive metadata from relational databases. It will work in conjunction with an HDP 2.4 cluster.

My question is: apart from the out-of-the-box security provided by Apache NiFi itself, what other security best practices should be implemented for HDF?

For more context, the HDP cluster will be secured using Kerberos, Ranger, and Knox.

Thanks.

1 ACCEPTED SOLUTION


After my initial research, here is what I found about the security options in HDF:

1. To enable the user interface to be accessed over HTTPS instead of HTTP, the properties under the "security properties" heading in the nifi.properties file need to be edited.
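For illustration, a minimal sketch of the relevant entries under that heading, with placeholder keystore/truststore paths and passwords (the exact values depend on your environment):

    # HTTPS host/port for the UI; leave the plain HTTP port blank
    nifi.web.http.port=
    nifi.web.https.host=nifi-node1.example.com
    nifi.web.https.port=9443
    # keystore/truststore used for TLS (placeholder paths and passwords)
    nifi.security.keystore=/etc/nifi/conf/keystore.jks
    nifi.security.keystoreType=JKS
    nifi.security.keystorePasswd=changeit
    nifi.security.keyPasswd=changeit
    nifi.security.truststore=/etc/nifi/conf/truststore.jks
    nifi.security.truststoreType=JKS
    nifi.security.truststorePasswd=changeit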

2. User authentication is handled by the Login Identity Provider, a pluggable mechanism for authenticating users via username/password.

a. The Login Identity Provider integrates with a directory server to authenticate users using LDAP. Username/password authentication can be enabled by referencing this provider in nifi.properties.

b. The Login Identity Provider also integrates with a Kerberos Key Distribution Center (KDC) to authenticate users. NiFi can be configured to use Kerberos SPNEGO (the "Kerberos Service") for authentication.

Note: By default, NiFi requires client certificates for authenticating users over HTTPS, so the Login Identity Provider to use must be configured explicitly in the nifi.properties file. A minimal LDAP example follows this note.
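As a sketch (placeholders throughout; the required properties can vary by HDF/NiFi version), an LDAP Login Identity Provider would look roughly like this in login-identity-providers.xml:

    <loginIdentityProviders>
        <provider>
            <identifier>ldap-provider</identifier>
            <class>org.apache.nifi.ldap.LdapProvider</class>
            <!-- placeholder connection and search settings -->
            <property name="Authentication Strategy">SIMPLE</property>
            <property name="Manager DN">cn=manager,dc=example,dc=com</property>
            <property name="Manager Password">changeit</property>
            <property name="Url">ldap://ldap.example.com:389</property>
            <property name="User Search Base">ou=users,dc=example,dc=com</property>
            <property name="User Search Filter">uid={0}</property>
            <property name="Authentication Expiration">12 hours</property>
        </provider>
    </loginIdentityProviders>

and would then be referenced from nifi.properties:

    nifi.login.identity.provider.configuration.file=./conf/login-identity-providers.xml
    nifi.security.user.login.identity.provider=ldap-provider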

3. Levels of access in HDF can be controlled by setting up an admin user for the Authority Provider, who can then grant the corresponding roles to requesting users.

The following roles are supported (a rough example of assigning them follows the list):

i) Administrator

ii) Data Flow Manager

iii) Read Only

iv) Provenance

v) NiFi
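To make that concrete, here is a rough sketch of the authorized users file read by the file-based Authority Provider in this HDF/NiFi line; the DNs are placeholders, and the admin would typically add entries like these (or grant the equivalent roles through the UI):

    <users>
        <user dn="CN=Admin User, OU=NiFi">
            <role name="ROLE_ADMIN"/>
        </user>
        <user dn="CN=Flow Developer, OU=NiFi">
            <role name="ROLE_DFM"/>
            <role name="ROLE_PROVENANCE"/>
        </user>
        <user dn="CN=Auditor, OU=NiFi">
            <role name="ROLE_MONITOR"/>
        </user>
    </users>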

4. Out of the box, NiFi provides several options to encrypt and decrypt data. The EncryptContent processor allows for the encryption and decryption of data, both internally within NiFi and in integration with external systems such as OpenSSL and other data sources and consumers.
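As one illustration of the external-systems angle, content encrypted by EncryptContent with an OpenSSL-compatible, password-based AES algorithm could in principle be decrypted outside NiFi along these lines (file names and password are placeholders, and the cipher and key-derivation flags have to match the algorithm configured on the processor):

    # decrypt a payload produced with a password-based, OpenSSL-compatible AES-128-CBC setting
    openssl enc -d -aes-128-cbc -md md5 -salt -pass pass:changeit -in payload.enc -out payload.dec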

Detailed information can be found in the HDF documentation:

https://docs.hortonworks.com/HDPDocuments/HDF1/HDF-1.2/bk_AdminGuide/content/ch_administration_guide...

Thanks


3 REPLIES


@rbiswas

Using the security features of NiFi (like HTTPS transport) is a great way to secure the data in motion. You will want to make sure that the connection from NiFi to the HDP cluster is secured as well (depending on how you do this, possibly with WebHDFS HTTPS transport or Knox). Once the data has landed, you may consider at-rest encryption utilizing the Ranger KMS to provide additional security for the data as well.
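To make the at-rest piece concrete, here is a rough sketch of setting up an HDFS encryption zone over the landing directory NiFi writes to, assuming Ranger KMS is already configured as the Hadoop key provider (key name and path are placeholders):

    # create a key in the KMS (backed by Ranger KMS here)
    hadoop key create ingestKey
    # create an encryption zone over the landing directory
    hdfs dfs -mkdir -p /data/landing
    hdfs crypto -createZone -keyName ingestKey -path /data/landing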


@emaxwell Thank you
