Member since
08-02-2018
14
Posts
2
Kudos Received
0
Solutions
08-27-2018
10:14 PM
2 Kudos
Hi @Jay Kumar SenSharma,
These instructions didn't work for me when I tried them on my cluster with HDP 3.0 and Zeppelin 0.8.0. I noticed that in my initial "Advanced zeppelin-shiro-ini", the passwords are encrypted strings, like
user3 = $shiro1$SHA-256$500000$nf0GzH10GbYVoxa7DOlOSw==$ov/IA5W8mRWPwvAoBjNYxg3udJK0EmrVMvFCwcr9eAs=, role2
Then if I add a new user like this
newuser = newuserpassword
Or like this
newuser = newuserpassword, newrole
Neither of them worked. Am I missing something in the settings? To clarify, my purpose is to add a new Zeppelin user named `newuser`. Thanks!
=== Update ===
I found that the `[main]` section of my "Advanced zeppelin-shiro-ini" contains these lines:
## To be commented out when not using [user] block / paintext
passwordMatcher = org.apache.shiro.authc.credential.PasswordMatcher
iniRealm.credentialsMatcher = $passwordMatcher
Per the Apache Shiro Configuration documentation, a string starting with `$shiro` is a hash of the password. I commented out the two lines shown above, and passwords stored in plain text in "Advanced zeppelin-shiro-ini" work now.
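To make the difference between the two kinds of entries concrete, here is a minimal Python sketch that splits a users line into its parts. It assumes the hashed form follows the layout `$shiro1$<algorithm>$<iterations>$<base64 salt>$<base64 digest>`, which is my reading of the example above; `ShiroUser` and `parse_users_line` are hypothetical names for illustration, not part of Shiro or Zeppelin.

```python
import base64
from typing import List, NamedTuple, Optional


class ShiroUser(NamedTuple):
    name: str
    roles: List[str]
    algorithm: Optional[str]    # None when the password is plain text
    iterations: Optional[int]
    salt: Optional[bytes]
    digest: Optional[bytes]


def parse_users_line(line: str) -> ShiroUser:
    """Parse one 'name = password, role1, role2' entry (hypothetical helper)."""
    # Split on the first '=' only: the base64 fields may end with '=' padding.
    name, _, rest = line.partition("=")
    fields = [f.strip() for f in rest.split(",")]
    password, roles = fields[0], fields[1:]
    if password.startswith("$shiro1$"):
        # Assumed layout: $shiro1$algorithm$iterations$salt$digest
        _, _, algorithm, iterations, salt_b64, digest_b64 = password.split("$")
        return ShiroUser(name.strip(), roles, algorithm, int(iterations),
                         base64.b64decode(salt_b64), base64.b64decode(digest_b64))
    return ShiroUser(name.strip(), roles, None, None, None, None)


hashed = parse_users_line(
    "user3 = $shiro1$SHA-256$500000$nf0GzH10GbYVoxa7DOlOSw=="
    "$ov/IA5W8mRWPwvAoBjNYxg3udJK0EmrVMvFCwcr9eAs=, role2"
)
plain = parse_users_line("newuser = newuserpassword, newrole")
print(hashed.algorithm, hashed.iterations, plain.algorithm)
```

With the `PasswordMatcher` lines active, a plain-text value like `newuserpassword` is presumably compared as if it were a hash, which would explain why the plain-text entries were rejected until those lines were commented out.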
08-23-2018
05:16 PM
Hey thanks Felix. I figured out it was actually neither Spark nor the firewall. It was due to an extra network adapter created by VirtualBox.
08-22-2018
06:36 PM
I figured out what went wrong. It actually had nothing to do with Spark or Windows Firewall, but with VirtualBox. My Windows machine has VirtualBox installed and hosts a guest VM. VirtualBox creates a network adapter called something like "VirtualBox Host-Only Network", which has a different IP address than the actual network adapter. In my case, the actual network adapter is on a LAN with IP address 10.100.1.61, and the VirtualBox Host-Only Network has the IP address 192.168.56.1. I solved the issue by disabling the VirtualBox Host-Only Network in Control Panel >> Network and Internet >> Network Connections.
I found this by first running `pyspark` in PowerShell, then running `netstat -an | Select-String 50000`, which showed a listener on 192.168.56.1:50000 instead of on the LAN address:
PS > netstat -an | sls 50000
TCP 192.168.56.1:50000 0.0.0.0:0 LISTENING
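For anyone repeating this check, a small Python sketch that does the same filtering on captured `netstat -an` text is below. It assumes the Windows column order (protocol, local address, foreign address, state) seen in the output above; `listeners_on_port` is a hypothetical helper name.

```python
def listeners_on_port(netstat_output: str, port: int) -> list:
    """Return the local addresses that are LISTENING on the given TCP port."""
    addresses = []
    for line in netstat_output.splitlines():
        parts = line.split()
        # Assumed row layout: Proto, Local Address, Foreign Address, State
        if len(parts) >= 4 and parts[0] == "TCP" and parts[3] == "LISTENING":
            host, _, local_port = parts[1].rpartition(":")
            if local_port == str(port):
                addresses.append(host)
    return addresses


sample = "TCP 192.168.56.1:50000 0.0.0.0:0 LISTENING"
print(listeners_on_port(sample, 50000))
```

If the address printed is not the one the cluster nodes resolve for the client, the ApplicationMaster cannot reach the driver. Setting `spark.driver.host` to the LAN address might be an alternative to disabling the adapter, but I have not tried that here.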
08-22-2018
02:39 AM
I have an HDP cluster of version HDP 3.0.0.0. All machines in the cluster run Ubuntu 16.04.
I want to make a Windows machine able to connect and run Spark on the cluster.
So far I've managed to make Spark submit jobs to the cluster via `spark-submit --deploy-mode cluster --master yarn`.
I'm having trouble running `pyspark` interactive shell with `--deploy-mode client`, which, to my understanding, will create a driver process running on the Windows machine. Right now when I run `pyspark` in a Windows command line console (specifically, I use PowerShell), it always fails with the following outputs:
PS > pyspark --name pysparkTest8
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:19:22) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2018-08-21 18:27:10 WARN DomainSocketFactory:117 - The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
2018-08-21 18:40:48 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
2018-08-21 18:40:48 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:66 - Attempted to request executors before the AM has registered!
2018-08-21 18:40:48 WARN MetricsSystem:66 - Stopping a MetricsSystem that is not running
2018-08-21 18:40:48 WARN SparkContext:66 - Another SparkContext is being constructed (or threw an exception in its constructor).
This may indicate an error, since only one SparkContext may be running in this JVM (see SPARK-2243). The other SparkContext was created at:
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:423)
py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
py4j.Gateway.invoke(Gateway.java:238)
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
py4j.GatewayConnection.run(GatewayConnection.java:238)
java.lang.Thread.run(Thread.java:748)
2018-08-21 18:54:07 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
2018-08-21 18:54:07 WARN YarnSchedulerBackend$YarnSchedulerEndpoint:66 - Attempted to request executors before the AM has registered!
2018-08-21 18:54:07 WARN MetricsSystem:66 - Stopping a MetricsSystem that is not running
Traceback (most recent call last):
File "C:\\python\pyspark\shell.py", line 54, in <module>
spark = SparkSession.builder.getOrCreate()
File "C:\\python\pyspark\sql\session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "C:\\python\pyspark\context.py", line 343, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "C:\\python\pyspark\context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "C:\\python\pyspark\context.py", line 180, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "C:\\python\pyspark\context.py", line 282, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "C:\\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1525, in __call__
File "C:\\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:89)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
When I look at the YARN application logs, there's something worth noting in stderr:
Log Type: stderr
Log Upload Time: Tue Aug 21 18:50:14 -0700 2018
Log Length: 3774
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/11/spark2-hdp-yarn-archive.tar.gz/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.0.0.0-1634/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/08/21 18:36:41 INFO util.SignalUtils: Registered signal handler for TERM
18/08/21 18:36:41 INFO util.SignalUtils: Registered signal handler for HUP
18/08/21 18:36:41 INFO util.SignalUtils: Registered signal handler for INT
18/08/21 18:36:41 INFO spark.SecurityManager: Changing view acls to: yarn,myusername
18/08/21 18:36:41 INFO spark.SecurityManager: Changing modify acls to: yarn,myusername
18/08/21 18:36:41 INFO spark.SecurityManager: Changing view acls groups to:
18/08/21 18:36:41 INFO spark.SecurityManager: Changing modify acls groups to:
18/08/21 18:36:41 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, myusername); groups with view permissions: Set(); users with modify permissions: Set(yarn, myusername); groups with modify permissions: Set()
18/08/21 18:36:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/08/21 18:36:42 INFO yarn.ApplicationMaster: Preparing Local resources
18/08/21 18:36:43 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
18/08/21 18:36:43 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1534303777268_0044_000001
18/08/21 18:36:44 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
18/08/21 18:38:51 ERROR yarn.ApplicationMaster: Failed to connect to driver at Windows-client-hostname:50000, retrying ...
18/08/21 18:38:51 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Failed to connect to driver!
at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:672)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:532)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:347)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:869)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
18/08/21 18:38:51 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
18/08/21 18:38:51 INFO util.ShutdownHookManager: Shutdown hook called
My suspicion is that the Windows client machine's firewall is blocking port 50000, because if I run telnet from one of the Ubuntu machines, I get "Connection timed out":
telnet windows-client-hostname 50000
Trying 10.100.1.61...
telnet: Unable to connect to remote host: Connection timed out
But I have specifically allowed ports 1025-65535 in Inbound Rules in Windows Firewall with Advanced Security (my Windows is Windows Server 2012 R2).
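The same reachability test can be scripted when telnet is not installed on a node. This is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, and the host name and port in the comment are the ones from this post.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and name resolution failures.
        return False


# Example call from a cluster node (not executed here):
# can_connect("windows-client-hostname", 50000)
```

A `False` result only says the connection could not be made. It does not tell a firewall drop apart from the driver listening on a different network interface.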
I have configured `spark.port.maxRetries` as suggested in this post, but it didn't change anything. My `spark-defaults.conf` on the Windows client machine looks like this:
spark.master yarn
spark.yarn.am.memory 4g
spark.executor.memory 5g
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.maxResultSize 10g
spark.driver.memory 5g
spark.yarn.archive hdfs:///hdp/apps/3.0.0.0-1634/spark2/spark2-hdp-yarn-archive.tar.gz
spark.port.maxRetries 100
spark.driver.port 50000
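As I understand the Spark configuration docs, when a fixed port is set, each bind retry increments the port by 1, so `spark.driver.port` together with `spark.port.maxRetries` describes a port range that the firewall has to allow. The sketch below computes that range from the config text; `parse_spark_conf` and `driver_port_range` are hypothetical helper names, and the default of 16 retries is taken from the Spark documentation.

```python
def parse_spark_conf(text: str) -> dict:
    """Parse whitespace-separated 'key value' lines, skipping blanks and comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def driver_port_range(conf: dict) -> tuple:
    """Return (first, last) port the driver may try to bind."""
    start = int(conf["spark.driver.port"])
    retries = int(conf.get("spark.port.maxRetries", 16))
    return start, start + retries


conf_text = """
spark.master yarn
spark.port.maxRetries 100
spark.driver.port 50000
"""
print(driver_port_range(parse_spark_conf(conf_text)))
```

For the settings above this gives ports 50000 through 50100, which sits inside the 1025-65535 range already allowed in the inbound rules.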
At this point I am totally confused. Can someone give some hints on how to tackle this?
Thank you very much!
Labels: Apache Spark
08-07-2018
05:06 AM
Thanks @Akhil S Naik.
Unfortunately I don't see an `errors-2610.txt` or `output-2610` file on the Ambari server machine in the `/var/lib/ambari-agent/data` directory. There are many other errors-xxxx.txt files but not `-2610`...
But `/var/log/ambari-server/ambari-server.log` has something related to 2610:
cat ambari-server.log | grep -A 5 2610
2018-08-06 15:48:25,876 ERROR [ambari-action-scheduler] ActionScheduler:817 - Execution command has no timeout parameter{"clusterName":"citilabs_test_cluster","requestId":192,"stageId":-1,"taskId":2610,"commandId":"192--1","hostname":"_internal_ambari","role":"AMBARI_SERVER_ACTION","hostLevelParams":{},"roleParams":{"ACTION_USER_NAME":"ambari","ACTION_NAME":"org.apache.ambari.server.serveraction.users.PostUserCreationHookServerAction"},"roleCommand":"EXECUTE","clusterHostInfo":{},"configurations":{},"configurationAttributes":{},"configurationTags":{},"forceRefreshConfigTagsBeforeExecution":false,"commandParams":{"cmd-hdfs-principal":"NA","cmd-input-file":"/var/lib/ambari-server/data/tmp/user_hook_input_1533595705841.csv","cluster-security-type":"NONE","cmd-hdfs-user":"hdfs","cmd-payload":"{\"guozhen\":[]}","cmd-hdfs-keytab":"NA","hook-script":"/var/lib/ambari-server/resources/sripts/post-user-creation-hook.sh","cluster-name":"citilabs_test_cluster","cluster-id":"2"},"serviceName":"","kerberosCommandParams":[],"localComponents":[],"availableServices":{},"componentVersionMap":{"HIVE":{"HIVE_SERVER":"3.0.0.0-1634","HIVE_SERVER_INTERACTIVE":"3.0.0.0-1634","HIVE_METASTORE":"3.0.0.0-1634","HIVE_CLIENT":"3.0.0.0-1634"},"ZEPPELIN":{"ZEPPELIN_MASTER":"3.0.0.0-1634"},"SQOOP":{"SQOOP":"3.0.0.0-1634"},"HDFS":{"SECONDARY_NAMENODE":"3.0.0.0-1634","HDFS_CLIENT":"3.0.0.0-1634","ZKFC":"3.0.0.0-1634","NFS_GATEWAY":"3.0.0.0-1634","DATANODE":"3.0.0.0-1634","JOURNALNODE":"3.0.0.0-1634","NAMENODE":"3.0.0.0-1634"},"MAPREDUCE2":{"MAPREDUCE2_CLIENT":"3.0.0.0-1634","HISTORYSERVER":"3.0.0.0-1634"},"OOZIE":{"OOZIE_CLIENT":"3.0.0.0-1634","OOZIE_SERVER":"3.0.0.0-1634"},"TEZ":{"TEZ_CLIENT":"3.0.0.0-1634"},"ZOOKEEPER":{"ZOOKEEPER_SERVER":"3.0.0.0-1634","ZOOKEEPER_CLIENT":"3.0.0.0-1634"},"SPARK2":{"SPARK2_CLIENT":"3.0.0.0-1634","SPARK2_THRIFTSERVER":"3.0.0.0-1634","LIVY2_SERVER":"3.0.0.0-1634","SPARK2_JOBHISTORYSERVER":"3.0.0.0-1634"},"YARN":{"TIMELINE_READER":"3.0.0.0-1634","NODEMANAGER":"3.0.0.0-1634","YARN_CLIENT":"3.0.0.0-1634","APP_TIMELINE_SERVER":"3.0.0.0-1634","YARN_REGISTRY_DNS":"3.0.0.0-1634","RESOURCEMANAGER":"3.0.0.0-1634"}},"commandType":"EXECUTION_COMMAND"}
2018-08-06 15:48:25,917 INFO [Server Action Executor Worker 2610] PostUserCreationHookServerAction:131 - Validating command parameters ...
2018-08-06 15:48:25,917 INFO [Server Action Executor Worker 2610] PostUserCreationHookServerAction:158 - Command parameter validation passed.
2018-08-06 15:48:25,919 INFO [Server Action Executor Worker 2610] CsvFilePersisterService:106 - Persisting map data to csv file
2018-08-06 15:48:25,919 INFO [Server Action Executor Worker 2610] CsvFilePersisterService:82 - Persisting collection to csv file
2018-08-06 15:48:25,919 INFO [Server Action Executor Worker 2610] CsvFilePersisterService:86 - Collection successfully persisted to csv file.
2018-08-06 15:48:25,919 INFO [Server Action Executor Worker 2610] ShellCommandUtilityWrapper:48 - Running command: /var/lib/ambari-server/resources/sripts/post-user-creation-hook.sh
2018-08-06 15:48:25,923 ERROR [Server Action Executor Worker 2610] PostUserCreationHookServerAction:93 - Server action is about to quit due to an exception.
java.io.IOException: Cannot run program "/var/lib/ambari-server/resources/sripts/post-user-creation-hook.sh": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.ambari.server.utils.ShellCommandUtil.runCommand(ShellCommandUtil.java:457)
at org.apache.ambari.server.utils.ShellCommandUtil.runCommand(ShellCommandUtil.java:513)
at org.apache.ambari.server.utils.ShellCommandUtil.runCommand(ShellCommandUtil.java:526)
--
2018-08-06 15:48:25,924 WARN [Server Action Executor Worker 2610] ServerActionExecutor:471 - Task #2610 failed to complete execution due to thrown exception: org.apache.ambari.server.AmbariException:Server action execution failed to complete!
org.apache.ambari.server.AmbariException: Server action execution failed to complete!
at org.apache.ambari.server.serveraction.users.PostUserCreationHookServerAction.execute(PostUserCreationHookServerAction.java:94)
at org.apache.ambari.server.serveraction.ServerActionExecutor$Worker.execute(ServerActionExecutor.java:550)
at org.apache.ambari.server.serveraction.ServerActionExecutor$Worker.run(ServerActionExecutor.java:466)
at java.lang.Thread.run(Thread.java:745)
It's quite obvious now... the `ERROR` line says that `/var/lib/ambari-server/resources/sripts/post-user-creation-hook.sh` file doesn't exist. I missed a 'c' in 'scripts' in the path. I corrected it and user home directory creation worked!
Thanks for helping me out!
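In hindsight, checking the configured hook path before creating a user would have caught the typo right away. Here is a minimal Python sketch of such a check; `check_script` is a hypothetical helper, and the two paths in the comment are the misspelled and corrected ones from the log above.

```python
import os


def check_script(path: str) -> str:
    """Classify a script path as 'missing', 'not a file', 'not executable' or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


# Example calls on the Ambari server host (not executed here):
# check_script("/var/lib/ambari-server/resources/sripts/post-user-creation-hook.sh")
# check_script("/var/lib/ambari-server/resources/scripts/post-user-creation-hook.sh")
```

Run it as the same OS user that runs ambari-server, since the executable check depends on who is asking.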
My HDFS config file says these:
hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=vm-097
where `vm-097` is the Ambari server hostname. Should I be worried about this?
Finally, would you mind giving some advice on where to look for logs when problems occur? I would have had no idea where to find the error message if it weren't for your help. (Thanks again!)
08-06-2018
11:19 PM
The issue:
I followed Administering Ambari: Enable user home directory creation to enable creating a home directory in HDFS for a user added via Ambari. However, every time I create a user, Ambari's task list shows a failure log titled "Post user creation hook for [1] users", and the content says:
stderr: errors-2610.txt
Server action execution failed to complete!
stdout: output-2610.txt
Server action failed
My understanding is that this is because the `admin` user has no permission to modify content of HDFS directory `/user`.
`hdfs dfs -ls` commands show the following:
$ hdfs dfs -ls /
Found 13 items
drwxrwxrwt - yarn hadoop 0 2018-08-06 15:04 /app-logs
drwxr-xr-x - hdfs hdfs 0 2018-08-05 21:05 /apps
drwxr-xr-x - yarn hadoop 0 2018-08-02 15:34 /ats
drwxr-xr-x - hdfs hdfs 0 2018-08-02 15:34 /atsv2
drwxr-xr-x - hdfs hdfs 0 2018-08-02 15:34 /hdp
drwx------ - livy hdfs 0 2018-08-02 15:50 /livy2-recovery
drwxr-xr-x - mapred hdfs 0 2018-08-02 15:34 /mapred
drwxrwxrwx - mapred hadoop 0 2018-08-02 15:35 /mr-history
drwxr-xr-x - hdfs hdfs 0 2018-08-02 15:34 /services
drwxrwxrwx - spark hadoop 0 2018-08-06 16:07 /spark2-history
drwxrwxrwx - hdfs hdfs 0 2018-08-05 20:53 /tmp
drwxr-xr-x - hdfs hdfs 0 2018-08-06 15:46 /user
drwxr-xr-x - hdfs hdfs 0 2018-08-03 00:25 /warehouse
$ hdfs dfs -ls /user
Found 8 items
drwxr-xr-x - admin hdfs 0 2018-08-06 15:46 /user/admin
drwxrwx--- - ambari-qa hdfs 0 2018-08-05 21:00 /user/ambari-qa
drwxr-xr-x - hive hdfs 0 2018-08-03 19:51 /user/hive
drwxrwxr-x - livy hdfs 0 2018-08-02 15:50 /user/livy
drwxrwxr-x - oozie hdfs 0 2018-08-05 20:54 /user/oozie
drwxrwxr-x - spark hdfs 0 2018-08-02 15:50 /user/spark
drwxrwx--- - yarn-ats hadoop 0 2018-08-03 19:46 /user/yarn-ats
drwxr-xr-x - zeppelin hdfs 0 2018-08-05 21:06 /user/zeppelin
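To read the listing above more mechanically, here is a rough Python sketch that decides from one `hdfs dfs -ls` row whether a given user has the write bit on that path. It is a simplification that looks only at the POSIX-style mode string and ignores the HDFS superuser and any ACLs; `can_write` is a hypothetical helper name.

```python
def can_write(ls_line: str, user: str, groups=()) -> bool:
    """Check the write bit that applies to `user` in one `hdfs dfs -ls` row."""
    fields = ls_line.split()
    # Row layout: mode, replication, owner, group, size, date, time, path
    mode, owner, group = fields[0], fields[2], fields[3]
    if user == owner:
        return mode[2] == "w"   # owner write bit
    if group in groups:
        return mode[5] == "w"   # group write bit
    return mode[8] == "w"       # other write bit


user_dir = "drwxr-xr-x - hdfs hdfs 0 2018-08-06 15:46 /user"
tmp_dir = "drwxrwxrwx - hdfs hdfs 0 2018-08-05 20:53 /tmp"
print(can_write(user_dir, "admin"), can_write(user_dir, "hdfs"), can_write(tmp_dir, "admin"))
```

By this reading only `hdfs` can create entries under `/user`, so the permission question comes down to which account the hook script uses for its HDFS commands.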
My questions are:
Is it due to the permission problem that Ambari failed to run `post-user-creation-hook.sh`?
If yes, how do I give enough permission to the `admin` user?
If not, what might be causing the failure?
Thanks!
Labels: Apache Ambari, Apache Hadoop
08-04-2018
05:02 AM
To answer my own question: I resolved this issue by enabling ResourceManager High Availability (HA). The steps are described here: https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.0.0/managing-high-availability/content/amb_enable_resourcemanager_high_availability.html I do not understand why it works, though. Hope someone can explain more.
08-03-2018
04:49 PM
I did put the hive database on a different host. My ambari-server is running on `vm-097`, and the hive database is on `vm-100`. I am using Postgres (version 10) for the hive database, though. I also created the database and granted all privileges in Postgres. The "test connection" in Ambari says this connection is ok. The hive database looks like this:
hive@vm-100:~$ psql
psql (10.4 (Ubuntu 10.4-2.pgdg16.04+1))
Type "help" for help.
hive=> \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
hive | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres +
| | | | | postgres=CTc/postgres+
| | | | | hive=CTc/postgres
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 | postgres=CTc/postgres+
| | | | | =c/postgres
(4 rows)
hive=> \c hive
You are now connected to database "hive" as user "hive".
hive=> \dt *
List of relations
Schema | Name | Type | Owner
------------+-------------------------+-------+----------
pg_catalog | pg_aggregate | table | postgres
pg_catalog | pg_am | table | postgres
pg_catalog | pg_amop | table | postgres
pg_catalog | pg_amproc | table | postgres
pg_catalog | pg_attrdef | table | postgres
pg_catalog | pg_attribute | table | postgres
pg_catalog | pg_auth_members | table | postgres
pg_catalog | pg_authid | table | postgres
pg_catalog | pg_cast | table | postgres
pg_catalog | pg_class | table | postgres
pg_catalog | pg_collation | table | postgres
pg_catalog | pg_constraint | table | postgres
pg_catalog | pg_conversion | table | postgres
pg_catalog | pg_database | table | postgres
pg_catalog | pg_db_role_setting | table | postgres
pg_catalog | pg_default_acl | table | postgres
pg_catalog | pg_depend | table | postgres
pg_catalog | pg_description | table | postgres
pg_catalog | pg_enum | table | postgres
pg_catalog | pg_event_trigger | table | postgres
pg_catalog | pg_extension | table | postgres
pg_catalog | pg_foreign_data_wrapper | table | postgres
pg_catalog | pg_foreign_server | table | postgres
pg_catalog | pg_foreign_table | table | postgres
pg_catalog | pg_index | table | postgres
pg_catalog | pg_inherits | table | postgres
pg_catalog | pg_init_privs | table | postgres
pg_catalog | pg_language | table | postgres
pg_catalog | pg_largeobject | table | postgres
pg_catalog | pg_largeobject_metadata | table | postgres
pg_catalog | pg_namespace | table | postgres
pg_catalog | pg_opclass | table | postgres
pg_catalog | pg_operator | table | postgres
pg_catalog | pg_opfamily | table | postgres
pg_catalog | pg_partitioned_table | table | postgres
pg_catalog | pg_pltemplate | table | postgres
pg_catalog | pg_policy | table | postgres
pg_catalog | pg_proc | table | postgres
pg_catalog | pg_publication | table | postgres
pg_catalog | pg_publication_rel | table | postgres
pg_catalog | pg_range | table | postgres
pg_catalog | pg_replication_origin | table | postgres
pg_catalog | pg_rewrite | table | postgres
pg_catalog | pg_seclabel | table | postgres
pg_catalog | pg_sequence | table | postgres
pg_catalog | pg_shdepend | table | postgres
pg_catalog | pg_shdescription | table | postgres
pg_catalog | pg_shseclabel | table | postgres
pg_catalog | pg_statistic | table | postgres
pg_catalog | pg_statistic_ext | table | postgres
pg_catalog | pg_subscription | table | postgres
pg_catalog | pg_subscription_rel | table | postgres
pg_catalog | pg_tablespace | table | postgres
pg_catalog | pg_transform | table | postgres
pg_catalog | pg_trigger | table | postgres
pg_catalog | pg_ts_config | table | postgres
pg_catalog | pg_ts_config_map | table | postgres
pg_catalog | pg_ts_dict | table | postgres
pg_catalog | pg_ts_parser | table | postgres
pg_catalog | pg_ts_template | table | postgres
pg_catalog | pg_type | table | postgres
pg_catalog | pg_user_mapping | table | postgres
(62 rows)
08-03-2018
07:07 AM
And here is a blueprint.json
08-03-2018
07:03 AM
My HDP version is HDP-3.0.0.0 (3.0.0.0-1634)
Ambari Version is 2.7.0.0
I don't have rpm on Ubuntu, but the `dpkg` outputs are:
yarn@vm-097:~$ dpkg -l | grep -i ambari
ii ambari-agent 2.7.0.0-897 amd64 Ambari Agent
ii ambari-infra-solr 2.7.0.0-897 amd64 [[description]]
ii ambari-infra-solr-client 2.7.0.0-897 amd64 [[description]]
ii ambari-metrics-assembly 2.7.0.0-897 amd64 Ambari Metrics Assembly
ii ambari-server 2.7.0.0-897 amd64 Ambari Server
yarn@vm-097:~$ dpkg -l | grep -i hdfs
ii hadoop-3-0-0-0-1634-hdfs 3.1.0.3.0.0.0-1634 all The Hadoop Distributed File System
ii hadoop-3-0-0-0-1634-hdfs-datanode 3.1.0.3.0.0.0-1634 all Hadoop Data Node
ii hadoop-3-0-0-0-1634-hdfs-journalnode 3.1.0.3.0.0.0-1634 all Hadoop HDFS JournalNode
ii hadoop-3-0-0-0-1634-hdfs-namenode 3.1.0.3.0.0.0-1634 all The Hadoop namenode manages the block locations of HDFS files
ii hadoop-3-0-0-0-1634-hdfs-secondarynamenode 3.1.0.3.0.0.0-1634 all Hadoop Secondary namenode
ii hadoop-3-0-0-0-1634-hdfs-zkfc 3.1.0.3.0.0.0-1634 all Hadoop HDFS failover controller
ii libhdfs0-3-0-0-0-1634 3.1.0.3.0.0.0-1634 amd64 Hadoop Filesystem Library
ii ranger-3-0-0-0-1634-hdfs-plugin 1.1.0.3.0.0.0-1634 all Ranger HDFS plugin component runs within namenode to provoide enterprise security using ranger framework
ii sqoop-3-0-0-0-1634 1.4.7.3.0.0.0-1634 all Sqoop allows easy imports and exports of data sets between databases and the Hadoop Distributed File System (HDFS).
yarn@vm-097:~$ dpkg -l | grep -i yarn
ii atlas-metadata-3-0-0-0-1634 1.0.0.3.0.0.0-1634 all Atlas is an application framework which allows for a complex directed-acyclic-graph of tasks for processing data and is built atop Apache Hadoop YARN.
ii hadoop-3-0-0-0-1634-yarn 3.1.0.3.0.0.0-1634 all The Hadoop NextGen MapReduce (YARN)
ii livy2-3-0-0-0-1634 0.5.0.3.0.0.0-1634 all Livy is an open source REST interface for interacting with Spark2 from anywhere. It supports executing snippets of code or programs in a Spark2 context that runs locally or in YARN.
ii ranger-3-0-0-0-1634-yarn-plugin 1.1.0.3.0.0.0-1634 all Ranger yarn plugin component runs within namenode to provide enterprise security using ranger framework
ii spark2-3-0-0-0-1634-yarn-shuffle 2.3.1.3.0.0.0-1634 all Spark Yarn Shuffle jar
yarn@vm-097:~$ dpkg -l | grep -i hive
ii atlas-metadata-3-0-0-0-1634-hive-plugin 1.0.0.3.0.0.0-1634 all Atlas Hive plugin component runs with hive using HIVE_AUX_JARS_PATH=/hook/hive
ii cpio 2.11+dfsg-5ubuntu1 amd64 GNU cpio -- a program to manage archives of files
ii hive-3-0-0-0-1634 3.1.0.3.0.0.0-1634 all Hive is a data warehouse infrastructure built on top of Hadoop
ii hive-3-0-0-0-1634-hcatalog 3.1.0.3.0.0.0-1634 all Apache Hcatalog is a data warehouse infrastructure built on top of Hadoop
ii hive-3-0-0-0-1634-jdbc 3.1.0.3.0.0.0-1634 all Provides libraries necessary to connect to Apache Hive via JDBC
ii hive-warehouse-connector-3-0-0-0-1634 1.0.0.3.0.0.0-1634 all A library to load data into Apache Spark™ SQL DataFrames from
ii oozie-3-0-0-0-1634-sharelib-hive 4.3.1.3.0.0.0-1634 all hive shared libraries for oozie workflow engine
ii oozie-3-0-0-0-1634-sharelib-hive2 4.3.1.3.0.0.0-1634 all hive2 shared libraries for oozie workflow engine
ii ranger-3-0-0-0-1634-hive-plugin 1.1.0.3.0.0.0-1634 all Ranger Hive plugin component runs within hiveserver2 to provoide enterprise security using ranger framework
ii ubuntu-keyring 2012.05.19 all GnuPG keys of the Ubuntu archive
ii unzip 6.0-20ubuntu1 amd64 De-archiver for .zip files
ii zip 3.0-11 amd64 Archiver for .zip files
08-03-2018
03:48 AM
Thanks @Ravi for your replies.
Here are the outputs from machine 10.100.1.161:
yarn@vm-097:~$ netstat -plan | grep 8141
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:8141 0.0.0.0:* LISTEN 28581/java
yarn@vm-097:~$ ps -ef | grep resourcemanager
yarn 24639 24510 0 20:37 pts/0 00:00:00 grep --color=auto resourcemanager
yarn 28581 1 1 17:19 ? 00:02:07 /usr/jdk64/jdk1.8.0_112/bin/java -Dproc_resourcemanager -Dhdp.version=3.0.0.0-1634 -Djava.net.preferIPv4Stack=true -Dhdp.version=3.0.0.0-1634 -Dyarn.id.str= -Dyarn.policy.file=hadoop-policy.xml -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir -Dservice.libdir=/usr/hdp/3.0.0.0-1634/hadoop-yarn/./,/usr/hdp/3.0.0.0-1634/hadoop-yarn/lib,/usr/hdp/3.0.0.0-1634/hadoop-hdfs/./,/usr/hdp/3.0.0.0-1634/hadoop-hdfs/lib,/usr/hdp/3.0.0.0-1634/hadoop/./,/usr/hdp/3.0.0.0-1634/hadoop/lib -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY -Drm.audit.logger=INFO,RMAUDIT -Dyarn.log.dir=/var/log/hadoop-yarn/yarn -Dyarn.log.file=hadoop-yarn-resourcemanager-vm-097.log -Dyarn.home.dir=/usr/hdp/3.0.0.0-1634/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=:/usr/hdp/3.0.0.0-1634/hadoop/lib/native/Linux-amd64-64:/usr/hdp/3.0.0.0-1634/hadoop/lib/native/Linux-amd64-64:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/3.0.0.0-1634/hadoop/lib/native -Xmx1024m -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn -Dhadoop.log.file=hadoop-yarn-resourcemanager-vm-097.log -Dhadoop.home.dir=/usr/hdp/3.0.0.0-1634/hadoop -Dhadoop.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
yarn@vm-097:~$ telnet 10.100.1.161 8141
Trying 10.100.1.161...
Connected to 10.100.1.161.
Escape character is '^]'.
I did not specify anything about HA during installation. According to Ambari's YARN configs web UI, `yarn.resourcemanager.ha.enabled` has value `false`. Could this be the reason?
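For anyone who wants to repeat the telnet check above from a script, here is a minimal Python sketch. The function name `port_is_open` is my own, purely for illustration; it only uses the standard `socket` module.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of Hadoop or Ambari. It does the same
    thing as `telnet host port`: it only proves that something is listening.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open("10.100.1.161", 8141)` from `vm-100` would mirror the telnet test. Keep in mind that a successful connection only shows the admin port is listening. The `StandbyException` in the task log is thrown by the ResourceManager's `AdminService`, so it happens after the connection has already succeeded.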
08-03-2018
12:42 AM
I have installed these components using Ambari install:
HDFS, YARN, MapReduce2, Tez, Hive, ZooKeeper, Spark2
When I let Ambari start all services, HDFS, MapReduce2, ZooKeeper, and YARN started successfully, but the procedure gets stuck at "Start Hive Metastore". The full task log is attached at the end, but I think the critical lines are:
2018-08-02 16:36:00,100 - Execute['yarn rmadmin -refreshSuperUserGroupsConfiguration'] {'user': 'yarn'}
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_metastore.py", line 200, in <module>
HiveMetastore().execute()
(omitted many lines)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'yarn rmadmin -refreshSuperUserGroupsConfiguration' returned 255. 18/08/02 16:36:01 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
In my setup, `vm-097` runs the ResourceManager, and the Hive server is on another machine, `vm-100`. Both are virtual machines running Ubuntu 16.04 on Windows hosts. I logged in to `vm-100` and ran `yarn rmadmin -refreshSuperUserGroupsConfiguration` myself, and it shows similar errors:
yarn@vm-100:~$ yarn rmadmin -refreshSuperUserGroupsConfiguration
18/08/02 17:04:56 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
18/08/02 17:04:56 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
(many lines omitted)
I googled for `yarn rmadmin` and did not find much helpful info. Hope someone here could help.
Thanks!
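The attached log below is long because the same stack trace repeats on every retry. In case it helps with reading it, here is a rough Python sketch that condenses the retry lines into two numbers. The names `RETRY_RE` and `summarize_retries` are made up for this post, and the pattern assumes the retry lines look exactly like the ones in my log.

```python
import re

# Matches the tail of each retry entry in the log, for example:
# "... over null after 3 failover attempts. Trying to failover after sleeping for 16001ms."
RETRY_RE = re.compile(
    r"after (\d+) failover attempts\. "
    r"Trying to failover after sleeping for (\d+)ms"
)


def summarize_retries(log_text):
    """Return (highest_attempt_number, total_sleep_ms) for the retry lines.

    Hypothetical helper for illustration only. Returns (0, 0) when the
    text contains no retry lines.
    """
    matches = RETRY_RE.findall(log_text)
    if not matches:
        return 0, 0
    attempts = max(int(count) for count, _ in matches)
    total_sleep_ms = sum(int(ms) for _, ms in matches)
    return attempts, total_sleep_ms
```

To use it, paste the stderr section into a file, read it into a string, and pass that string to `summarize_retries`. It only counts what is in the text; it does not say anything about why the ResourceManager reports itself as not Active.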
Attachment 1: Ambari "Start Hive Metastore" task log
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_metastore.py", line 200, in <module>
HiveMetastore().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive_metastore.py", line 55, in start
refresh_yarn()
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/scripts/hive.py", line 401, in refresh_yarn
Execute("yarn rmadmin -refreshSuperUserGroupsConfiguration", user = params.yarn_user)
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
returns=self.resource.returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'yarn rmadmin -refreshSuperUserGroupsConfiguration' returned 255. 18/08/02 16:36:01 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
18/08/02 16:36:02 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 1 failover attempts. Trying to failover after sleeping for 20085ms.
18/08/02 16:36:22 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 2 failover attempts. Trying to failover after sleeping for 25094ms.
18/08/02 16:36:47 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 3 failover attempts. Trying to failover after sleeping for 16001ms.
18/08/02 16:37:03 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 4 failover attempts. Trying to failover after sleeping for 20361ms.
18/08/02 16:37:23 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 5 failover attempts. Trying to failover after sleeping for 31694ms.
18/08/02 16:37:55 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 6 failover attempts. Trying to failover after sleeping for 32062ms.
18/08/02 16:38:27 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 7 failover attempts. Trying to failover after sleeping for 15377ms.
18/08/02 16:38:42 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 8 failover attempts. Trying to failover after sleeping for 26500ms.
18/08/02 16:39:09 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 9 failover attempts. Trying to failover after sleeping for 26405ms.
18/08/02 16:39:35 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 10 failover attempts. Trying to failover after sleeping for 15172ms.
18/08/02 16:39:51 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 11 failover attempts. Trying to failover after sleeping for 27700ms.
18/08/02 16:40:18 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 12 failover attempts. Trying to failover after sleeping for 39587ms.
18/08/02 16:40:58 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 13 failover attempts. Trying to failover after sleeping for 19571ms.
18/08/02 16:41:18 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 14 failover attempts. Trying to failover after sleeping for 17980ms.
18/08/02 16:41:35 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 15 failover attempts. Trying to failover after sleeping for 25732ms.
18/08/02 16:42:01 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 16 failover attempts. Trying to failover after sleeping for 28892ms.
18/08/02 16:42:30 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 17 failover attempts. Trying to failover after sleeping for 32208ms.
18/08/02 16:43:02 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 18 failover attempts. Trying to failover after sleeping for 31339ms.
18/08/02 16:43:34 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 19 failover attempts. Trying to failover after sleeping for 17716ms.
18/08/02 16:43:51 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 20 failover attempts. Trying to failover after sleeping for 39465ms.
18/08/02 16:44:31 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 21 failover attempts. Trying to failover after sleeping for 27786ms.
18/08/02 16:44:59 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 22 failover attempts. Trying to failover after sleeping for 39978ms.
18/08/02 16:45:39 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 23 failover attempts. Trying to failover after sleeping for 17613ms.
18/08/02 16:45:56 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 24 failover attempts. Trying to failover after sleeping for 23792ms.
18/08/02 16:46:20 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 25 failover attempts. Trying to failover after sleeping for 16330ms.
18/08/02 16:46:36 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 26 failover attempts. Trying to failover after sleeping for 44810ms.
18/08/02 16:47:21 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 27 failover attempts. Trying to failover after sleeping for 33074ms.
18/08/02 16:47:54 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 28 failover attempts. Trying to failover after sleeping for 21553ms.
18/08/02 16:48:16 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 29 failover attempts. Trying to failover after sleeping for 41820ms.
refreshSuperUserGroupsConfiguration: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
stdout:
2018-08-02 16:35:57,445 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.0.0-1634 -> 3.0.0.0-1634
2018-08-02 16:35:57,472 - Using hadoop conf dir: /usr/hdp/3.0.0.0-1634/hadoop/conf
2018-08-02 16:35:57,817 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.0.0-1634 -> 3.0.0.0-1634
2018-08-02 16:35:57,826 - Using hadoop conf dir: /usr/hdp/3.0.0.0-1634/hadoop/conf
2018-08-02 16:35:57,828 - Group['livy'] {}
2018-08-02 16:35:57,829 - Group['spark'] {}
2018-08-02 16:35:57,830 - Group['hdfs'] {}
2018-08-02 16:35:57,830 - Group['hadoop'] {}
2018-08-02 16:35:57,830 - Group['users'] {}
2018-08-02 16:35:57,831 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-02 16:35:57,832 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-02 16:35:57,834 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['livy', 'hadoop'], 'uid': None}
2018-08-02 16:35:57,835 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-02 16:35:57,836 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['spark', 'hadoop'], 'uid': None}
2018-08-02 16:35:57,837 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2018-08-02 16:35:57,838 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'users'], 'uid': None}
2018-08-02 16:35:57,839 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop'], 'uid': None}
2018-08-02 16:35:57,841 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-02 16:35:57,842 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2018-08-02 16:35:57,843 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2018-08-02 16:35:57,845 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2018-08-02 16:35:57,854 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa 0'] due to not_if
2018-08-02 16:35:57,855 - Group['hdfs'] {}
2018-08-02 16:35:57,855 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hdfs', 'hadoop', u'hdfs']}
2018-08-02 16:35:57,856 - FS Type: HDFS
2018-08-02 16:35:57,856 - Directory['/etc/hadoop'] {'mode': 0755}
2018-08-02 16:35:57,883 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-08-02 16:35:57,884 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2018-08-02 16:35:57,913 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2018-08-02 16:35:57,927 - Skipping Execute[('setenforce', '0')] due to not_if
2018-08-02 16:35:57,928 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
2018-08-02 16:35:57,931 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
2018-08-02 16:35:57,932 - Changing owner for /var/run/hadoop from 1017 to root
2018-08-02 16:35:57,932 - Changing group for /var/run/hadoop from 1007 to root
2018-08-02 16:35:57,933 - Directory['/var/run/hadoop/hdfs'] {'owner': 'hdfs', 'cd_access': 'a'}
2018-08-02 16:35:57,934 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
2018-08-02 16:35:57,941 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2018-08-02 16:35:57,945 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2018-08-02 16:35:57,957 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/log4j.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2018-08-02 16:35:57,977 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/hadoop-metrics2.properties'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2018-08-02 16:35:57,978 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2018-08-02 16:35:57,979 - File['/usr/hdp/3.0.0.0-1634/hadoop/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2018-08-02 16:35:57,987 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop', 'mode': 0644}
2018-08-02 16:35:57,993 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2018-08-02 16:35:57,999 - Skipping unlimited key JCE policy check and setup since it is not required
2018-08-02 16:35:58,507 - Using hadoop conf dir: /usr/hdp/3.0.0.0-1634/hadoop/conf
2018-08-02 16:35:58,526 - call['ambari-python-wrap /usr/bin/hdp-select status hive-server2'] {'timeout': 20}
2018-08-02 16:35:58,567 - call returned (0, 'hive-server2 - 3.0.0.0-1634')
2018-08-02 16:35:58,569 - Stack Feature Version Info: Cluster Stack=3.0, Command Stack=None, Command Version=3.0.0.0-1634 -> 3.0.0.0-1634
2018-08-02 16:35:58,609 - File['/var/lib/ambari-agent/cred/lib/CredentialUtil.jar'] {'content': DownloadSource('http://vm-097:8080/resources/CredentialUtil.jar'), 'mode': 0755}
2018-08-02 16:35:58,611 - Not downloading the file from http://vm-097:8080/resources/CredentialUtil.jar, because /var/lib/ambari-agent/tmp/CredentialUtil.jar already exists
2018-08-02 16:36:00,100 - Execute['yarn rmadmin -refreshSuperUserGroupsConfiguration'] {'user': 'yarn'}
Command failed after 1 tries
Attachment 2: `yarn rmadmin` output on `vm-100`
yarn@vm-100:~$ yarn rmadmin -refreshSuperUserGroupsConfiguration
18/08/02 17:38:04 INFO client.RMProxy: Connecting to ResourceManager at vm-097/10.100.1.161:8141
18/08/02 17:38:05 INFO retry.RetryInvocationHandler: org.apache.hadoop.ipc.StandbyException: ResourceManager null is not Active!
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.throwStandbyException(AdminService.java:274)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkRMStatus(AdminService.java:904)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshSuperUserGroupsConfiguration(AdminService.java:485)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshSuperUserGroupsConfiguration(ResourceManagerAdministrationProtocolPBServiceImpl.java:163)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:275)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
, while invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshSuperUserGroupsConfiguration over null after 1 failover attempts. Trying to failover after sleeping for 19136ms.
^C
Attachment 3: `/etc/hosts` content on `vm-100`
yarn@vm-100:~$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 vm-100
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.100.1.161 vm-097
10.100.1.162 vm-100
10.100.1.163 vm-136
10.100.1.164 vm-137
10.100.1.165 vm-138
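One detail visible in the `/etc/hosts` listing above is that `vm-100` is mapped twice, once to `127.0.1.1` and once to `10.100.1.162`. Whether that has anything to do with the `StandbyException` is not established in this thread. The following is only a minimal sketch for spotting such double mappings; the function names `parse_hosts` and `mixed_loopback_names` are made up for illustration, and the hosts content is copied from Attachment 3.

```python
import ipaddress
from collections import defaultdict

# Content of /etc/hosts on vm-100, copied from Attachment 3.
HOSTS_TEXT = """\
127.0.0.1 localhost
127.0.1.1 vm-100
# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.100.1.161 vm-097
10.100.1.162 vm-100
10.100.1.163 vm-136
10.100.1.164 vm-137
10.100.1.165 vm-138
"""


def parse_hosts(text):
    """Map each hostname to the list of addresses it is assigned (hypothetical helper)."""
    names = defaultdict(list)
    for line in text.splitlines():
        # Drop comments and surrounding whitespace, skip empty lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *hostnames = line.split()
        for name in hostnames:
            names[name].append(ipaddress.ip_address(address))
    return names


def mixed_loopback_names(names):
    """Return hostnames that have both a loopback and a non-loopback address."""
    flagged = {}
    for name, addrs in names.items():
        has_loopback = any(a.is_loopback for a in addrs)
        has_other = any(not a.is_loopback for a in addrs)
        if has_loopback and has_other:
            flagged[name] = [str(a) for a in addrs]
    return flagged


if __name__ == "__main__":
    for name, addrs in mixed_loopback_names(parse_hosts(HOSTS_TEXT)).items():
        print(name, addrs)
```

To check a live node instead of the pasted text, the same functions could be fed the result of `open("/etc/hosts").read()`.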
Tags:
- Hadoop Core
- Hive
- YARN
Labels:
- Apache Hive
- Apache YARN
08-02-2018
05:03 PM
Hi @Nigel Jones, have you found a solution to this problem yet? I am experiencing exactly the same problem when starting up HDP 3.0.0.