Created 06-14-2016 08:21 AM
The problem started in our test environment, where people were able to access the data loaded on the cluster via the NameNode UI -> Utilities -> Browse file system. The Linux team then put a firewall block on port 50070, which made the NameNode UI inaccessible but also hampered some cluster services.
Now we are installing a prod cluster (Ambari 2.2, HDP 2.4).
The objectives are:
I started the 'Demo LDAP' in Knox and also checked 'Advanced topology' in the Knox configs. Do I have to set values there to secure the respective services (e.g. the service entries below)? Will this ensure that the web UI on port 50070 asks for credentials? If yes, what will the credentials be?
<service>
    <role>NAMENODE</role>
    <url>hdfs://{{namenode_host}}:{{namenode_rpc_port}}</url>
</service>
<service>
    <role>JOBTRACKER</role>
    <url>rpc://{{rm_host}}:{{jt_rpc_port}}</url>
</service>
<service>
    <role>WEBHDFS</role>
    <url>http://{{namenode_host}}:{{namenode_http_port}}/webhdfs</url>
</service>
Created 06-16-2016 09:53 PM
The purpose of Knox is to provide secure access to cluster REST interfaces for external users. It will not restrict access for users who connect directly to the NameNode web UI without going through Knox, so the topology entries by themselves will not make the UI on port 50070 ask for credentials. One option is to implement the Knox Gateway, restrict users from accessing the cluster directly (via your choice of infrastructure: firewall, network routing, etc.), and have them go through Knox instead. The web UIs will be supported by Knox in the next major HDP release, but many people have successfully used community-contributed services to expose the UIs with the current version of Knox.
Knox typically authenticates against an LDAP directory, so end users would use their credentials from the configured LDAP directory.
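To make the "go through Knox instead" idea concrete, here is a minimal Python sketch of what a WebHDFS call routed through the gateway might look like. It only builds the request and does not send it. The helper name `knox_webhdfs_request`, the host `knox.example.com`, port 8443, gateway path `gateway`, and topology name `default` are assumptions for illustration (they match common Knox defaults, but check your own gateway settings). The `guest` / `guest-password` pair is the sample user usually shipped with the Demo LDAP; a real deployment would use accounts from your own directory.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def knox_webhdfs_request(gateway_host, hdfs_path, user, password,
                         op="LISTSTATUS", port=8443,
                         gateway_path="gateway", topology="default"):
    """Build (but do not send) a WebHDFS request that goes through Knox.

    Hypothetical helper: port, gateway_path and topology are assumed
    defaults and must match the actual Knox gateway configuration.
    """
    # Knox maps https://<host>:<port>/<gateway_path>/<topology>/webhdfs/...
    # onto the WEBHDFS service URL defined in the topology file.
    url = "https://{}:{}/{}/{}/webhdfs/v1{}?{}".format(
        gateway_host, port, gateway_path, topology,
        quote(hdfs_path), urlencode({"op": op}))

    # HTTP Basic credentials, which Knox checks against the LDAP directory.
    raw = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return Request(url, headers={"Authorization": "Basic " + token})


if __name__ == "__main__":
    req = knox_webhdfs_request("knox.example.com", "/tmp",
                               "guest", "guest-password")
    print(req.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) would additionally require the client to trust the gateway's TLS certificate. Note that this only covers the REST path through Knox; a browser pointed straight at port 50070 bypasses it entirely.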
To control who has access to HDFS resources, you could use Ranger; see the Authorization section of the HDP 2.4 Security Guide.
If security is a concern, it is highly recommended to secure the cluster using Kerberos. An alternative to forcing users to go through Knox would then be to enable SPNEGO authentication for the web UIs.