Created 02-26-2016 02:51 AM
I enabled Kerberos on an HDP 2.3.2 cluster using Ambari 2.1.2.1 and then tried to run a MapReduce job on the edge node as a local user, but the job failed:
Error Message:
Diagnostics: Application application_1456454501315_0001 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is xxxxx
main : requested yarn user is xxxxx
User xxxxx not found
Failing this attempt. Failing the application.
16/02/25 18:42:28 INFO mapreduce.Job: Counters: 0
Job Finished in 7.915 seconds
My understanding is that the edge node's local user is not needed anywhere else, so I am not sure why my MapReduce job fails because the user does not exist on the other nodes. Please help.
Example MapReduce job:
XXXXX:~#yarn jar /usr/hdp/2.3.2.0-2950/hadoop-mapreduce/hadoop-mapreduce-examples-2.7.1.2.3.2.0-2950.jar pi 16 100000
Created 02-26-2016 11:19 PM
Jobs run fine after I added the user to the hadoop group on all the nodes, but I am not sure that adding the user account to the hadoop group is a good idea.
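For context: on a Kerberized cluster, YARN typically launches containers through the LinuxContainerExecutor, which runs each container as the submitting user, so that account has to resolve on every NodeManager host (the "run as user is xxxxx" lines in the error come from that step). A minimal sketch for checking this, assuming passwordless ssh and using a made-up user name and host names that are not from this thread:

```shell
#!/bin/sh
# Returns 0 if the account resolves on this host (local passwd, LDAP, SSSD, ...).
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Prints one line per host saying whether the user resolves there.
# Assumes passwordless ssh to each host; user and hosts are placeholders.
check_user_on_hosts() {
    user=$1
    shift
    for host in "$@"; do
        if ssh -o BatchMode=yes "$host" id -u "$user" >/dev/null 2>&1; then
            echo "$host: $user found"
        else
            echo "$host: $user NOT found"
        fi
    done
}

# Example call (hypothetical names):
# check_user_on_hosts alice worker1.example.com worker2.example.com
```

The check only needs the account to resolve, so a user provided by a directory service would pass it just as well as one created locally.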
Created 02-26-2016 08:35 PM
First, check the Java version and the Java path environment variable. If they are not the same, then create a soft link so that /usr/bin/java points to $JAVA_HOME/bin/java.
#java -version
Cross-check steps for reference:
[root@sandbox ~]# which java
/usr/lib/jvm/java-1.7.0-openjdk.x86_64/bin/java
[root@sandbox ~]# ls -l /usr/bin/java
lrwxrwxrwx 1 root root 22 2014-12-16 18:25 /usr/bin/java -> /etc/alternatives/java
[root@sandbox ~]# ls -l /etc/alternatives/java
lrwxrwxrwx 1 root root 46 2014-12-16 18:25 /etc/alternatives/java -> /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java
Check the java home path is set properly
vi /etc/hadoop/conf/hadoop-env.sh
Run the simple pi mapreduce job
#yarn jar /usr/hdp/2.3.2.0-2950/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 1 1
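The symlink cross-check above can also be scripted. This is only a sketch: the function name is mine, and it assumes GNU `readlink -f` is available, as it is on most Linux distributions.

```shell
#!/bin/sh
# Succeeds when both paths exist and resolve to the same file
# after following all symlinks (e.g. /usr/bin/java -> /etc/alternatives/java -> ...).
same_java() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Example: compare the java found on PATH with the one under JAVA_HOME.
# same_java "$(command -v java)" "$JAVA_HOME/bin/java" || echo "java paths differ"
```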
Created 02-26-2016 09:03 PM
@rbalam, if the above solution does not work, then check that yarn.application.classpath is set properly in yarn-site.xml (the required lib directories should exist).
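A rough way to check that the classpath directories exist, assuming you paste in the comma-separated value of yarn.application.classpath with any variables such as $HADOOP_CONF_DIR already expanded (the function name and the example paths are only illustrative):

```shell
#!/bin/sh
# Prints each entry of a comma-separated classpath whose directory is missing.
# A trailing "/*" wildcard is dropped before the directory test.
missing_classpath_dirs() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        dir=${entry%/\*}
        [ -n "$dir" ] || continue
        [ -d "$dir" ] || echo "$dir"
    done
}

# Example call (hypothetical paths):
# missing_classpath_dirs "/etc/hadoop/conf,/usr/hdp/current/hadoop-client/*"
```

No output means every listed directory exists on the node where you ran it.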
Created 02-26-2016 09:06 PM
@rbalam, the third step should be to check that the user has permission to read the classpath directories and the HDFS folder.
Created 02-04-2020 09:17 AM
Hello
Please, how did you add the users?
I am actually using Active Directory users, and I just added them to the edge node using Samba + Kerberos.
Now that I have enabled Kerberos on the Hortonworks Hadoop cluster, I get the same issue as you.
So should I add the same user to all nodes? With adduser? Which group? How can this be resolved for an AD user?
Thanks
Created 02-04-2020 04:02 PM
One of our members posted a reply on how to add users in the thread you posted a similar question to later the same day.
As this is an older thread which was previously marked 'Solved', you would have a better chance of receiving a resolution by starting a new thread. This will also provide the opportunity for you to provide details specific to your environment about what you did in an attempt to add the relevant user accounts that could aid others in providing a more relevant, accurate answer to your question.
Created 09-08-2016 11:42 AM
I faced the same issue for Kerberos environment. It got resolved after I created the user on all the nodes.
Created 02-10-2017 06:24 AM
Hi,
I encountered a similar error when running Spark: the user cannot be found. Please help me, thank you!
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-client \
  --executor-memory 1G \
  --num-executors 1 \
  --num-executors 2 \
  --driver-memory 1g \
  --executor-cores 1 \
  --principal kadmin/admin@NGAA.COM \
  --keytab /home/test/sparktest/princpal/sparkjob.keytab \
  /opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar 12
error messages:
17/02/10 13:54:16 INFO security.UserGroupInformation: Login successful for user kadmin/admin@NGAA.COM using keytab file /home/test/sparktest/princpal/sparkjob.keytab
17/02/10 13:54:16 INFO spark.SparkContext: Running Spark version 1.6.0
17/02/10 13:54:16 INFO spark.SecurityManager: Changing view acls to: root,kadmin
17/02/10 13:54:16 INFO spark.SecurityManager: Changing modify acls to: root,kadmin
17/02/10 13:54:16 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, kadmin); users with modify permissions: Set(root, kadmin)
17/02/10 13:54:17 INFO util.Utils: Successfully started service 'sparkDriver' on port 56214.
17/02/10 13:54:17 INFO slf4j.Slf4jLogger: Slf4jLogger started
17/02/10 13:54:17 INFO Remoting: Starting remoting
17/02/10 13:54:18 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.10.100.51:40936]
17/02/10 13:54:18 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem@10.10.100.51:40936]
17/02/10 13:54:18 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 40936.
17/02/10 13:54:18 INFO spark.SparkEnv: Registering MapOutputTracker
17/02/10 13:54:18 INFO spark.SparkEnv: Registering BlockManagerMaster
17/02/10 13:54:18 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-cf37cdde-4eab-4804-b84b-b5f937828aa7
17/02/10 13:54:18 INFO storage.MemoryStore: MemoryStore started with capacity 530.3 MB
17/02/10 13:54:18 INFO spark.SparkEnv: Registering OutputCommitCoordinator
17/02/10 13:54:19 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
17/02/10 13:54:19 INFO ui.SparkUI: Started SparkUI at http://10.10.100.51:4040
17/02/10 13:54:19 INFO spark.SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar at spark://10.10.100.51:56214/jars/spark-examples.jar with timestamp 1486706059601
17/02/10 13:54:19 INFO yarn.Client: Attempting to login to the Kerberos using principal: kadmin/admin@NGAA.COM and keytab: /home/test/sparktest/princpal/sparkjob.keytab
17/02/10 13:54:19 INFO client.RMProxy: Connecting to ResourceManager at hadoop1/10.10.100.51:8032
17/02/10 13:54:20 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
17/02/10 13:54:20 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/02/10 13:54:20 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
17/02/10 13:54:20 INFO yarn.Client: Setting up container launch context for our AM
17/02/10 13:54:20 INFO yarn.Client: Setting up the launch environment for our AM container
17/02/10 13:54:21 INFO yarn.Client: Credentials file set to: credentials-79afe260-414b-4df7-8242-3cd1a279dbc7
17/02/10 13:54:21 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: hdfs://hadoop2:8020/user/kadmin/.sparkStaging/application_1486705141135_0002
17/02/10 13:54:21 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 44 for kadmin on 10.10.100.52:8020
17/02/10 13:54:21 INFO yarn.Client: Renewal Interval set to 86400061
17/02/10 13:54:21 INFO yarn.Client: Preparing resources for our AM container
17/02/10 13:54:21 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: hdfs://hadoop2:8020/user/kadmin/.sparkStaging/application_1486705141135_0002
17/02/10 13:54:21 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 45 for kadmin on 10.10.100.52:8020
17/02/10 13:54:22 INFO hive.metastore: Trying to connect to metastore with URI thrift://hadoop1:9083
17/02/10 13:54:22 INFO hive.metastore: Opened a connection to metastore, current connections: 1
17/02/10 13:54:22 INFO hive.metastore: Connected to metastore.
17/02/10 13:54:22 INFO hive.metastore: Closed a connection to metastore, current connections: 0
17/02/10 13:54:23 INFO yarn.Client: To enable the AM to login from keytab, credentials are being copied over to the AM via the YARN Secure Distributed Cache.
17/02/10 13:54:23 INFO yarn.Client: Uploading resource file:/home/test/sparktest/princpal/sparkjob.keytab -> hdfs://hadoop2:8020/user/kadmin/.sparkStaging/application_1486705141135_0002/sparkjob.keytab
17/02/10 13:54:23 INFO yarn.Client: Uploading resource file:/tmp/spark-79d08367-6f8d-4cb3-813e-d450e90a3128/__spark_conf__4615276915023723512.zip -> hdfs://hadoop2:8020/user/kadmin/.sparkStaging/application_1486705141135_0002/__spark_conf__4615276915023723512.zip
17/02/10 13:54:23 INFO spark.SecurityManager: Changing view acls to: root,kadmin
17/02/10 13:54:23 INFO spark.SecurityManager: Changing modify acls to: root,kadmin
17/02/10 13:54:23 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root, kadmin); users with modify permissions: Set(root, kadmin)
17/02/10 13:54:23 INFO yarn.Client: Submitting application 2 to ResourceManager
17/02/10 13:54:23 INFO impl.YarnClientImpl: Submitted application application_1486705141135_0002
17/02/10 13:54:24 INFO yarn.Client: Application report for application_1486705141135_0002 (state: FAILED)
17/02/10 13:54:24 INFO yarn.Client:
    client token: N/A
    diagnostics: Application application_1486705141135_0002 failed 2 times due to AM Container for appattempt_1486705141135_0002_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://hadoop1:8088/proxy/application_1486705141135_0002/Then, click on links to logs of each attempt.
Diagnostics: Application application_1486705141135_0002 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is kadmin
main : requested yarn user is kadmin
User kadmin not found
Failing this attempt. Failing the application.
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: root.users.kadmin
    start time: 1486706063635
    final status: FAILED
    tracking URL: http://hadoop1:8088/cluster/app/application_1486705141135_0002
    user: kadmin
17/02/10 13:54:24 INFO yarn.Client: Deleting staging directory .sparkStaging/application_1486705141135_0002
17/02/10 13:54:24 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:29)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/02/10 13:54:25 INFO ui.SparkUI: Stopped Spark web UI at http://10.10.100.51:4040
17/02/10 13:54:25 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
17/02/10 13:54:25 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
17/02/10 13:54:25 INFO cluster.YarnClientSchedulerBackend: Stopped
17/02/10 13:54:25 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/02/10 13:54:25 ERROR util.Utils: Uncaught exception in thread main
java.lang.NullPointerException
    at org.apache.spark.network.shuffle.ExternalShuffleClient.close(ExternalShuffleClient.java:152)
    at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1231)
    at org.apache.spark.SparkEnv.stop(SparkEnv.scala:96)
    at org.apache.spark.SparkContext$$anonfun$stop$12.apply$mcV$sp(SparkContext.scala:1767)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1230)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1766)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:613)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:29)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/02/10 13:54:25 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:124)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:64)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:29)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/02/10 13:54:25 INFO storage.DiskBlockManager: Shutdown hook called
17/02/10 13:54:25 INFO util.ShutdownHookManager: Shutdown hook called
17/02/10 13:54:25 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-79d08367-6f8d-4cb3-813e-d450e90a3128/userFiles-58912a50-d060-42ec-8665-7a74c1be9a7b
17/02/10 13:54:25 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-79d08367-6f8d-4cb3-813e-d450e90a3128
Thanks
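A note on the log above: the principal kadmin/admin@NGAA.COM ends up as the OS user kadmin ("run as user is kadmin"), and that is the account that has to exist on the NodeManager hosts. The real mapping is decided by the cluster's hadoop.security.auth_to_local rules. The sketch below only approximates the common default of keeping the first component of the principal, so treat it as an assumption rather than what your cluster necessarily does:

```shell
#!/bin/sh
# Approximates the usual principal -> short name mapping by keeping
# everything before the first "/" or "@". Custom auth_to_local rules
# on a real cluster can produce a different result.
short_name() {
    printf '%s\n' "${1%%[/@]*}"
}

# Example:
# short_name "kadmin/admin@NGAA.COM"
```

If I remember the Hadoop secure mode documentation correctly, running `hadoop org.apache.hadoop.security.HadoopKerberosName <principal>` on a cluster node prints the mapping that is actually applied.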
Created 11-13-2017 03:53 AM
This problem has two possible causes:
(1) The Linux user ### was not created on every node and added to the yarn user group.
(2) The NodeManager container directory permissions are inconsistent, because the machines are not partitioned uniformly.
To solve it, execute the following on each machine:
useradd -M ###
usermod -a -G supergroup ###
Finally, check that the NodeManager (nm) directory permissions are the same on every node!
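The last step, comparing the NodeManager directory permissions across nodes, can be sketched roughly as follows. The directory path and host names are placeholders I am assuming for illustration (the real path is whatever yarn.nodemanager.local-dirs points to on your cluster), and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# Prints "owner:group mode" for a directory, e.g. "yarn:hadoop 755".
dir_perms() {
    stat -c '%U:%G %a' "$1"
}

# Prints the permissions of the same directory on each host so that
# differences stand out. Assumes passwordless ssh; names are placeholders.
compare_dir_perms() {
    dir=$1
    shift
    for host in "$@"; do
        echo "$host: $(ssh -o BatchMode=yes "$host" stat -c '%U:%G\ %a' "$dir" 2>/dev/null)"
    done
}

# Example call (hypothetical path and hosts):
# compare_dir_perms /hadoop/yarn/local worker1.example.com worker2.example.com
```

Any host whose line differs from the others, or shows nothing after the colon, is the one to look at first.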
Created 11-13-2017 03:55 AM
@rbalam Please refer to my approach.