In Hadoop security we use multiple tools such as Ranger, Knox, Sentry, Kerberos, TDE (Transparent Data Encryption in HDFS), and HDFS ACLs. What are the key differences, what should be used where, and how do we decide?
- HDFS ACLs: you manage access rights on HDFS yourself, which can quickly become a huge amount of work if you have a lot of users. Moreover, they protect access to HDFS only.
- HDFS TDE (encryption): this is an HDFS feature that transparently encrypts all the files in a folder (an encryption zone). It provides strong protection for any data stored on HDFS, whether it comes from Hive, HBase, etc.
- Ranger: the most interesting part! It's a tool that helps you configure and manage access to the different Hadoop components. For example, you can create a policy that gives a specific group of users defined in your enterprise AD read access, write access, or no access at all to an HDFS folder. It can also restrict access to Hive, HBase, Solr, Kafka, etc. Ranger is really powerful and reduces the time needed to manage security. Moreover, it provides an audit feature: if it's enabled, you can see who accessed what and when (you can also see whether the permission was denied).
- Knox: you can see Knox as a kind of proxy. Every request from every user is sent to the Knox server, which forwards it to the correct service/server. It's useful if you don't want your users to know the network topology of your cluster, or to have direct access to the servers hosting the services (like the Hive database).
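
To make the first two options more concrete, here is a minimal sketch in Python that only builds (and prints) the HDFS command lines involved, without running them. The helper names, the `analysts` group, the `sales_key` key and the paths are made up for illustration. It assumes ACLs are enabled on the NameNode (`dfs.namenode.acls.enabled`) and that a KMS is already configured for encryption zones.

```python
import shlex


def acl_grant_command(path, group, perms="r-x"):
    """Build the HDFS CLI call that adds an ACL entry for one group on one path."""
    # Hypothetical helper: it only assembles arguments for `hdfs dfs -setfacl -m`.
    return ["hdfs", "dfs", "-setfacl", "-m", f"group:{group}:{perms}", path]


def encryption_zone_commands(key_name, path):
    """Build the CLI calls that create a key and turn a directory into an encryption zone."""
    return [
        # Create the encryption key in the configured KMS.
        ["hadoop", "key", "create", key_name],
        # The zone directory has to exist and be empty before it becomes a zone.
        ["hdfs", "dfs", "-mkdir", "-p", path],
        # Files written under this path afterwards are encrypted transparently.
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", path],
    ]


if __name__ == "__main__":
    commands = [acl_grant_command("/data/sales", "analysts")]
    commands += encryption_zone_commands("sales_key", "/secure/sales")
    for cmd in commands:
        # Print the shell-quoted command instead of executing it.
        print(shlex.join(cmd))
```

Every new group or folder means another `-setfacl` call like the one above, which is where the manual effort of plain ACLs comes from.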
I would recommend using the combination Ranger + Knox and, if you need it, TDE encryption.
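
As a rough illustration of the Ranger + Knox combination, the sketch below builds a Ranger policy body for an HDFS folder and the URL a client would call through Knox instead of contacting the NameNode directly. It is written from memory of the Ranger public v2 policy model and the usual Knox WebHDFS path, so check the field names and URL layout against your own versions. The service name `cluster_hadoop`, the policy name, the group and the host are hypothetical, and port 8443 with the `default` topology are assumed defaults.

```python
import json


def ranger_hdfs_policy(service, name, path, group, access_types):
    """Build a policy body in the shape I would expect Ranger's v2 policy API to accept."""
    return {
        "service": service,          # name of the HDFS service as registered in Ranger
        "name": name,
        "isEnabled": True,
        "isAuditEnabled": True,      # keep the audit trail of who accessed what
        "resources": {"path": {"values": [path], "isRecursive": True}},
        "policyItems": [
            {
                "groups": [group],   # e.g. a group synced from the enterprise AD
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


def knox_webhdfs_url(gateway_host, path, op="LISTSTATUS", topology="default", port=8443):
    """Build the WebHDFS URL as exposed through a Knox gateway topology."""
    return f"https://{gateway_host}:{port}/gateway/{topology}/webhdfs/v1/{path.lstrip('/')}?op={op}"


if __name__ == "__main__":
    policy = ranger_hdfs_policy(
        "cluster_hadoop", "sales-read-only", "/data/sales", "analysts", ["read", "execute"]
    )
    # This JSON would be POSTed to the Ranger admin server's policy endpoint.
    print(json.dumps(policy, indent=2))
    # Clients only ever see the gateway host, not the cluster topology.
    print(knox_webhdfs_url("knox.example.com", "/data/sales"))
```

The split of roles is the point here: Ranger decides what the `analysts` group may do with the folder, while Knox is the only address the users need to know.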