Secure Webhdfs in Hadoop Hortonworks Cluster
Created ‎01-27-2020 08:48 AM
Dear community,
I have installed a Hadoop cluster on 8 servers using Ambari (Hortonworks).
I can access WebHDFS using the IP address and the default port 50070 without any authentication.
How can I secure WebHDFS?
P.S. I have not enabled Kerberos in Ambari (Ambari > Enable Kerberos). Should I do so?
Any suggestion will be appreciated.
Created ‎02-03-2020 01:03 PM
Good to know that your original issue is resolved. However, for any subsequent, slightly different issue it is always better to open a new Community thread, so that readers of this thread can easily find one error/issue with one solution. Multiple issues in a single thread can confuse readers.
If your question is answered, please make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Created ‎01-28-2020 12:56 AM
Please refer to the following docs to learn how to enable SPNEGO authentication. Once Kerberos is enabled for your cluster, you can also enable SPNEGO authentication. The docs explain how to configure HTTP authentication for Hadoop components in a Kerberos environment.
By default, access to the HTTP-based services and UIs of the cluster is not configured to require authentication.
1. https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/authentication-with-kerberos/content/authe_spn...
2. https://docs.cloudera.com/HDPDocuments/Ambari-
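As a rough way to see the effect of SPNEGO, one can compare an anonymous WebHDFS request with a negotiated one. The sketch below is illustrative only: the helper names (webhdfs_url, describe_status, probe_webhdfs) and the host namenode.example.com are hypothetical, and port 50070 is simply the one mentioned in the question. It assumes a curl build with SPNEGO (GSS-API) support and, for the negotiated request, a ticket already obtained with kinit.

```shell
#!/bin/sh
# Build a WebHDFS REST URL from: host, port, HDFS path, operation.
webhdfs_url() {
  printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' "$1" "$2" "$3" "$4"
}

# Translate an HTTP status code into a short diagnosis.
describe_status() {
  case "$1" in
    200) echo "request accepted" ;;
    401) echo "authentication required" ;;
    403) echo "authenticated but not permitted" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Probe the endpoint twice: anonymously, then with SPNEGO.
# Defined only; it is not called here because it needs a live cluster.
probe_webhdfs() {
  url=$(webhdfs_url "$1" "$2" / LISTSTATUS)
  anon=$(curl -s -o /dev/null -w '%{http_code}' "$url")
  spnego=$(curl -s -o /dev/null -w '%{http_code}' --negotiate -u : "$url")
  echo "anonymous: $(describe_status "$anon")"
  echo "SPNEGO:    $(describe_status "$spnego")"
}

# Example (hypothetical host): probe_webhdfs namenode.example.com 50070
```

On a cluster without HTTP authentication the anonymous request would typically be accepted; once SPNEGO is enabled it should be answered with 401, while the negotiated request succeeds for a user holding a valid ticket.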
Created ‎01-30-2020 09:27 AM
Thank you for your help.
I tried to restart the Ambari server, but in vain.
I got this error:
2020-01-30 18:20:21,866 INFO [main] KerberosChecker:64 - Checking Ambari Server Kerberos credentials.
2020-01-30 18:20:22,052 ERROR [main] KerberosChecker:120 - Client not found in Kerberos database (6)
2020-01-30 18:20:22,052 ERROR [main] AmbariServer:1119 - Failed to run the Ambari Server
org.apache.ambari.server.AmbariException: Ambari Server Kerberos credentials check failed.
Check KDC availability and JAAS configuration in /etc/ambari-server/conf/krb5JAASLogin.conf
at org.apache.ambari.server.controller.utilities.KerberosChecker.checkJaasConfiguration(KerberosChecker.java:121)
at org.apache.ambari.server.controller.AmbariServer.main(AmbariServer.java:1110)
The JAASLogin file is configured like this:
com.sun.security.jgss.krb5.initiate {
com.sun.security.auth.module.Krb5LoginModule required
I tried to follow these links
Any suggestion please?
Created ‎01-30-2020 04:04 PM
The error we see is:
Failed to run the Ambari Server
org.apache.ambari.server.AmbariException: Ambari Server Kerberos credentials check failed.
Check KDC availability and JAAS configuration in /etc/ambari-server/conf/krb5JAASLogin.conf
1. Can you please let us know how you enabled Kerberos for the Ambari Server? Was it done manually?
2. Do you have ambari-agent installed on the Ambari Server host, and are the Kerberos clients installed on that host?
# yum info krb5-libs
# yum info krb5-workstation
3. Do you have the correct KDC/AD address defined in the krb5.conf file that the Ambari Server uses?
# ps -ef | grep AmbariServer | grep --color krb5.conf
# cat /etc/krb5.conf
4. Are you able to run "kinit" to get a valid Kerberos ticket using the same details mentioned in the file "/etc/ambari-server/conf/krb5JAASLogin.conf"?
# kinit -kt /etc/security/ambariservername.keytab ambariservername@REALM.COM
# klist
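Steps 3 and 4 can be partly scripted. The sketch below lists the KDC entries of a krb5.conf file, reads the keytab and principal from a JAAS file, and then tries kinit with them. It assumes the JAAS file sets the Krb5LoginModule options keyTab="..." and principal="..." on their own lines, which the truncated excerpt posted in this thread does not show; all function names are hypothetical.

```shell
#!/bin/sh
# Print the value of a quoted JAAS option (for example keyTab or principal).
jaas_value() {
  sed -n "s/^[[:space:]]*$1=\"\([^\"]*\)\".*\$/\1/p" "$2" | head -n 1
}

# Print every KDC host listed in a krb5.conf-style file.
krb5_kdcs() {
  sed -n 's/^[[:space:]]*kdc[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1"
}

# Try to obtain a ticket with the values found in the JAAS file.
# Defined only; call it on the Ambari Server host.
check_ambari_kinit() {
  jaas=${1:-/etc/ambari-server/conf/krb5JAASLogin.conf}
  keytab=$(jaas_value keyTab "$jaas")
  principal=$(jaas_value principal "$jaas")
  if [ -z "$keytab" ] || [ -z "$principal" ]; then
    echo "keyTab or principal not found in $jaas" >&2
    return 1
  fi
  kinit -kt "$keytab" "$principal" && klist
}

# Examples:
#   krb5_kdcs /etc/krb5.conf
#   check_ambari_kinit
```

If kinit fails here with the same "Client not found in Kerberos database" message, the principal named in the JAAS file most likely does not exist in the KDC/AD under that exact name.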
Created ‎01-31-2020 08:31 AM
Thanks a lot 🙂
I have configured the cluster with Kerberos using Active Directory,
but I got some issues when connecting:
[root@server keytabs]# hdfs dfs -ls /
20/01/31 16:31:19 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
ls: DestHost:destPort namenode:8020 , LocalHost:localPort ambari/ip:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Any idea, please?
It looks like port 8020 is also blocked.
Created ‎01-31-2020 03:25 PM
To clarify the port access, please check from the Ambari host whether the NameNode address and port are accessible.
The error you posted usually indicates that you did not get a valid Kerberos ticket with the "kinit" command before running the HDFS command.
20/01/31 16:31:19 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
The most probable cause of the above warning is a missing Kerberos ticket.
If the port is accessible, please check whether you can run the same hdfs command after getting a valid Kerberos ticket:
# klist -kte /etc/security/ambariservername.keytab
# kinit -kt /etc/security/ambariservername.keytab ambariservername@REALM.COM
# klist
# hdfs dfs -ls /
Then try the same command using the "hdfs" headless keytab:
# kdestroy
# klist -kte /etc/security/keytabs/hdfs.headless.keytab
# kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-ker1latest@EXAMPLE.COM
# klist
# hdfs dfs -ls /
*NOTE:* the "hdfs-ker1latest@EXAMPLE.COM" principal name may be different in your case, so replace it with the principal from your own hdfs keytab.
Please share the output of the above commands.
Also verify that all your cluster nodes have the correct FQDN.
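The order of operations above (ticket first, HDFS command second) can be enforced with a small wrapper. This is a sketch and not part of the original reply: with_ticket is a hypothetical helper, and the KLIST variable exists only so that the check can be swapped out. It relies on "klist -s", which prints nothing and reports through its exit status whether the credential cache holds a valid ticket.

```shell
#!/bin/sh
# Run a command only when the credential cache holds a valid ticket.
with_ticket() {
  klist_cmd=${KLIST:-klist}
  if ! command -v "$klist_cmd" >/dev/null 2>&1; then
    echo "$klist_cmd not found; install the Kerberos client tools" >&2
    return 127
  fi
  if ! "$klist_cmd" -s; then
    echo "no valid Kerberos ticket; run kinit first" >&2
    return 1
  fi
  "$@"
}

# Example: with_ticket hdfs dfs -ls /
```

Used this way, a missing or expired ticket produces a clear message on the client side instead of the "Client cannot authenticate via:[TOKEN, KERBEROS]" exception.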
Created ‎02-03-2020 06:23 AM
Thanks a lot.
The HDFS problem is now fixed. However, when I try to launch a script from an edge node, I get the same issue:
/usr/hdp/ --class org.apache.spark.examples.SparkPi --master spark://edgenode.servername:7077 --num-executors 4 --driver-memory 512m --executor-memory 512m --executor-cores 1 /usr/hdp/
Results:
20/02/03 15:13:41 INFO StandaloneAppClient$ClientEndpoint: Executor updated: app-20200203151341-0000/79 is now RUNNING
20/02/03 15:13:41 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@69cac930{/metrics/json,null,AVAILABLE,@Spark}
20/02/03 15:13:42 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
20/02/03 15:13:42 ERROR SparkContext: Error initializing SparkContext.
java.io.IOException: DestHost:destPort namenode.servername:8020 , LocalHost:localPort edgenodeaddress:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1444)
at org.apache.hadoop.ipc.Client.call(Client.java:1354)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:900)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1660)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1577)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1574)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1589)
at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:100)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:522)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2498)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:934)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:925)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:925)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:758)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:721)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:814)
at org.apache.hadoop.ipc.Client$Connection.access$3600(Client.java:411)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1559)
at org.apache.hadoop.ipc.Client.call(Client.java:1390)
Created ‎02-03-2020 08:55 AM
Actually, here are more details.
On my Ambari server machine I have this ticket:
[root@ambariserver ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: spark-analytics_hadoop@REALM.COM
Valid starting Expires Service principal
02/03/2020 13:31:21 02/03/2020 23:31:21 krbtgt/REALM.COM@REALM.COM
renew until 02/10/2020 13:31:21
When I connect with the spark user:
HADOOP_ROOT_LOGGER=DEBUG,console /usr/hdp/ --class org.apache.spark.examples.SparkPi --master spark://Edgenode:7077 --num-executors 4 --driver-memory 512m --executor-memory 512m --executor-cores 1 /usr/hdp/
=> OK
Now, if I connect from the edge node:
[root@EdgeNode~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: spark/EdgeNode@REALM.COM
Valid starting Expires Service principal
02/03/2020 16:52:12 02/04/2020 02:52:12 krbtgt/REALM.COM@REALM.COM
renew until 02/10/2020 16:52:12
But when I connect with the spark user:
HADOOP_ROOT_LOGGER=DEBUG,console /usr/hdp/ --class org.apache.spark.examples.SparkPi --master spark://Edgenode:7077 --num-executors 4 --driver-memory 512m --executor-memory 512m --executor-cores 1 /usr/hdp/
=> I get this error:
20/02/03 17:53:01 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@69cac930{/metrics/json,null,AVAILABLE,@Spark}
20/02/03 17:53:01 WARN Client: Exception encountered while connecting to the server : org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
20/02/03 17:53:01 ERROR SparkContext: Error initializing SparkContext.
java.io.IOException: DestHost:destPort NameNode:8020 , LocalHost:localPort EdgeNode/ Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:4
Did I miss something?
Users launch these commands from their laptops:
cluster = RxSpark(sshHostname = "EdgeNode", sshUsername = "username")
source = c("~/AirlineDemoSmall.csv")
dest_file = "/share"
They are getting the same issue.
On all cluster nodes, "hdfs dfs -ls /" works well.
Please advise.
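One visible difference between the two klist outputs above is the form of the default principal: the working session uses spark-analytics_hadoop@REALM.COM, while the edge node uses spark/EdgeNode@REALM.COM, which carries a host instance. The thread does not establish whether this explains the failure. As a sketch for comparing hosts, the hypothetical helpers below pull the default principal out of klist output and report which form it has.

```shell
#!/bin/sh
# Read klist output on stdin and print the default principal.
default_principal() {
  sed -n 's/^Default principal: //p'
}

# Classify a principal as primary/instance@REALM or primary@REALM.
principal_form() {
  case "$1" in
    */*@*) echo "with-instance" ;;
    *@*)   echo "without-instance" ;;
    *)     echo "unknown" ;;
  esac
}

# Example on a live host:
#   principal_form "$(klist | default_principal)"
```

Running the example on both the Ambari server and the edge node, as the same OS user that submits the job, shows quickly whether the two sessions authenticate as the same kind of principal.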
Created ‎02-03-2020 09:23 AM
Should I create a principal for each user in AD?
We are using Active Directory users.
If yes, how?
Many thanks.