
Ambari Service Start not resolving $USER

Explorer

Hi,

We are running HDP 2.5.0.0 on RHEL 7. The cluster had been up for about 6 months. We had to reboot it a couple of days ago, and since then Ambari does not reflect the service status correctly. If I run ps -ef on the box, I can see that the process is running. On digging further, I found that the $USER variable in the scripts is not being resolved. As a result, the pid files are created in the wrong directory with an incorrect name, e.g. /var/run/hadoop-mapreduce/mapred--historyserver.pid instead of /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid. Any pointers for troubleshooting this? TIA

resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su mapred -l -s /bin/bash -c 'ls /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid && ps -p `cat /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid`'' returned 2. ls: cannot access /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid: No such file or directory
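The failing command in the trace is a two-step check: the pid file must exist, and the pid inside it must belong to a live process. A minimal sketch of that kind of check follows. The function name check_pid_file is hypothetical, and kill -0 is used here in place of the ps -p call from the trace, so it only reports "running" for processes the caller is allowed to signal (run it as root or as the service account).

```shell
#!/usr/bin/env bash
# Hypothetical helper: report on a pid file the way a status check might.
# Returns 0 if the process is alive, 1 if the pid file is stale,
# and 2 if the pid file is missing.
check_pid_file() {
  local pid_file="$1"
  if [ ! -f "$pid_file" ]; then
    echo "missing pid file: $pid_file"
    return 2
  fi
  local pid
  pid="$(cat "$pid_file")"
  # kill -0 sends no signal; it only tests whether the process can be signalled.
  if kill -0 "$pid" 2>/dev/null; then
    echo "running (pid $pid)"
    return 0
  fi
  echo "stale pid file (pid $pid)"
  return 1
}
```

Calling check_pid_file with the path from the error message would return 2 in the situation described above, even though the process itself is running, because the pid file was written under a different name.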


Guru

@Vijay Lakshman, how did you restart the MapReduce history server after the machine reboot? Did you start it through Ambari?

Can you also check the value of the HADOOP_MAPRED_PID_DIR variable in hadoop-env.sh? Ideally it should be as below.

export HADOOP_MAPRED_PID_DIR=/var/run/hadoop-mapreduce/$USER
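To see why an empty $USER produces the file name reported in the question, here is a small sketch. It assumes the pid file name is assembled as mapred-<user>-<command>.pid inside the directory above, which matches the two paths quoted in this thread; pid_path is a hypothetical helper, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the pid file path from a user name and a command,
# assuming the layout <pid dir>/mapred-<user>-<command>.pid.
pid_path() {
  local user="$1" command="$2"
  local pid_dir="/var/run/hadoop-mapreduce/$user"
  echo "$pid_dir/mapred-$user-$command.pid"
}

# USER resolved to "mapred":
pid_path mapred historyserver
# prints /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid

# USER empty:
pid_path "" historyserver
# prints /var/run/hadoop-mapreduce//mapred--historyserver.pid
```

The doubled slash in the second path is harmless to the filesystem, so the file ends up directly in /var/run/hadoop-mapreduce with the "mapred--" prefix.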

To fix this issue, stop the MapReduce history server and make sure that both /var/run/hadoop-mapreduce/mapred--historyserver.pid and /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid are deleted. Then start the history server through Ambari.
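The cleanup step can be sketched as a small helper that removes whichever of the given pid files exist and reports what it did. The function name remove_pid_files is hypothetical; only run something like this after the service has been stopped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete each pid file passed as an argument, if present.
remove_pid_files() {
  local pid_file
  for pid_file in "$@"; do
    if [ -e "$pid_file" ]; then
      rm -f -- "$pid_file" && echo "removed $pid_file"
    else
      echo "not present: $pid_file"
    fi
  done
}

# Example invocation (left commented out so nothing is deleted by accident):
# remove_pid_files \
#   /var/run/hadoop-mapreduce/mapred--historyserver.pid \
#   /var/run/hadoop-mapreduce/mapred/mapred-mapred-historyserver.pid
```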

Explorer

Thanks. I am starting all the services by clicking on Start All in Ambari. I am facing the same issue with all the services.

Yes, the environment variable is using the default and is not overridden.

e.g. for the timeline server:

drwxr-xr-x 2 yarn hadoop 40 Feb 13 18:45 yarn
-rw-r--r-- 1 yarn hadoop 6 Feb 13 18:48 yarn--resourcemanager.pid
-rw-r--r-- 1 yarn hadoop 5 Feb 13 18:47 yarn--timelineserver.pid
$ ls yarn
$
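Pid files written with an empty user name have a doubled hyphen in their name and sit one level above the per-user directory. A rough way to list them is sketched below. The helper name find_misnamed_pid_files is hypothetical, and the "--" pattern is only a heuristic based on the file names shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list pid files directly inside a run directory whose
# names contain "--", which suggests the user part of the name was empty.
find_misnamed_pid_files() {
  local run_dir="$1"
  find "$run_dir" -maxdepth 1 -type f -name '*--*.pid' | sort
}
```

Pass it the directory the listing was taken from. Files in the per-user subdirectory are not reported because the search does not descend into it.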

Guru

@Vijay Lakshman, it looks like the USER environment variable has gone missing on the machines. Can you please check on your hosts whether $USER is set for all the service users (such as hdfs, yarn, mapred, etc.)? You can also use "printenv" to print all environment variables.

[root@xxx]# sudo su hdfs
bash-4.2$ echo $USER
hdfs
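The same check can be repeated for several service accounts in one pass. The sketch below is an assumption-laden example: the account list is a guess to adapt to your hosts, describe_user_var is a hypothetical helper, and the su invocation mirrors the form Ambari uses in the error trace from the question. The loop does nothing unless it runs as root and the account exists.

```shell
#!/usr/bin/env bash
# Hypothetical helper: describe the USER value seen for a service account.
describe_user_var() {
  local account="$1" value="$2"
  if [ -z "$value" ]; then
    echo "$account: USER is empty or unset"
  elif [ "$value" != "$account" ]; then
    echo "$account: USER is '$value' (expected '$account')"
  else
    echo "$account: USER is set correctly"
  fi
}

# Assumed account list; adjust to the service accounts on your cluster.
for account in hdfs yarn mapred; do
  # Only attempt the check as root and for accounts that exist on this host.
  if [ "$(id -u)" -eq 0 ] && id "$account" >/dev/null 2>&1; then
    # Start a login shell as the account and print the USER value it sees.
    value="$(su "$account" -l -s /bin/bash -c 'printf %s "$USER"')" || value=""
    describe_user_var "$account" "$value"
  fi
done
```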

Explorer

Thanks, that's helpful. We upgraded to RHEL 7.3. It looks like this is not supported by HDP 2.5.3 yet.

Guru

Glad to know I was able to help.

Explorer

It helped us troubleshoot the issue, but we could not figure out why this is happening. We went back and installed a new cluster on RHEL 7.2, and that is working fine. We still need to figure out why RHEL 7.3 causes this issue. Will post an update if we ever figure it out.