Member since 03-14-2016
4721 Posts
1111 Kudos Received
874 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2729 | 04-27-2020 03:48 AM |
| | 5287 | 04-26-2020 06:18 PM |
| | 4456 | 04-26-2020 06:05 PM |
| | 3583 | 04-13-2020 08:53 PM |
| | 5383 | 03-31-2020 02:10 AM |
04-01-2018
12:12 AM
1 Kudo
@Sergey Sheypak Your issue looks very similar to the one described in this SupportKB article: https://community.hortonworks.com/content/supportkb/178800/errorclass-orgapachehadoopyarnclientrequesthedging.html
Can you please try one of the following and then see if it works:
1. In the client-side configuration, update yarn-site.xml to use the "org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider" class instead of "org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider":
<property>
<name>yarn.client.failover-proxy-provider</name>
<value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
2. Alternatively, explicitly download the jar that contains the class "org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider" and add it to the client classpath.
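To confirm which failover proxy provider a client's yarn-site.xml actually configures, a small check along these lines may help. This is only a sketch: the helper name `get_property` and the inline sample configuration are made up for illustration, and in practice you would read the real file from your client's configuration directory.

```python
import xml.etree.ElementTree as ET

# Illustrative yarn-site.xml content (an assumption, not a real cluster config).
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


provider = get_property(SAMPLE_YARN_SITE, "yarn.client.failover-proxy-provider")
print(provider)
```

If the printed class is still the RequestHedging one, the client is most likely picking up a different yarn-site.xml than the one you edited.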
03-28-2018
05:40 AM
1 Kudo
@Saravana V Since we do not have the complete stack trace of the error, I am assuming that the last lines of the error are:
File "/var/lib/ambari-agent/cache/common-services/SPARK2/2.0.0/package/scripts/spark_service.py", line 44, in make_tarfile
os.chmod(parent_dir, 0711)
This indicates that when Ambari tries to start the Spark2 History Server, it fails while changing the permission of the directory "/tmp/spark2" to 0711. So I suggest that you first check whether that directory exists on the Spark2 History Server host. Please try the following two approaches to see if either works:
1. Create the directory yourself, as follows:
# mkdir /tmp/spark2
# chmod -R 711 /tmp/spark2
# ls -ld /tmp/spark2
drwx--x--x. 2 root root 43 Mar 28 23:15 /tmp/spark2
2. If the directory already exists, remove it and then start the component.
Also, are you by any chance running the Ambari agent on that host as a non-root user? If yes, have you followed the steps described in https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.1.5/bk_ambari-security/content/how_to_configure_an_ambari_agent_for_non-root.html
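For reference, the create-and-chmod step can be reproduced in a few lines of Python. This is a sketch under assumptions: `ensure_dir_with_mode` is a hypothetical helper, not the code Ambari runs, and the demo uses a temporary directory instead of the real /tmp/spark2 so it is safe to try. Note that the `0711` literal in the stack trace is Python 2 syntax; Python 3 spells the same octal value `0o711`.

```python
import os
import stat
import tempfile


def ensure_dir_with_mode(path, mode=0o711):
    """Create the directory if it is missing, set its mode, and return the resulting mode bits."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demo against a throwaway location (stand-in for /tmp/spark2).
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "spark2")
    print(oct(ensure_dir_with_mode(target)))
```

If the chmod raises a permission error when run as the agent user, that points to the non-root agent setup rather than a missing directory.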
03-26-2018
12:42 AM
@Cathy Liu If this resolved your query, then please mark this HCC thread as answered by clicking the "Accept" link on the correct answer. That way it will help other HCC users quickly find the answers.
03-25-2018
11:19 PM
2 Kudos
@Cathy Liu Do you have enough free memory on your sandbox/VirtualBox?
# free -m
If not, then you might want to stop some of the services that you are not currently using, such as Zeppelin, Oozie, or HBase, and then see if it works without hanging.
03-25-2018
11:53 AM
1 Kudo
@Michael Bronson Looks like the problem is with your hostname, "master01.bx_hhtyr8". Can you please try a hostname that does not contain an underscore? Please see the "Restrictions on valid hostnames" section of https://en.wikipedia.org/wiki/Hostname, which says: "While a hostname may not contain other characters, such as the underscore character (_), other DNS names may contain the underscore."
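A quick way to spot offending labels before installing is a check like the one below. It is a simplified sketch of the usual hostname rules (letters, digits, and hyphens only; no leading or trailing hyphen; labels of 1 to 63 characters), and the function names are hypothetical.

```python
import re

# One hostname label: letters, digits, hyphens; must not start or end with a hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def invalid_labels(hostname):
    """Return the dot-separated labels that break the hostname rules."""
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if not _LABEL.fullmatch(label)]


def is_valid_hostname(hostname):
    """True when the whole name fits in 253 characters and every label is valid."""
    return len(hostname) <= 253 and not invalid_labels(hostname)


print(invalid_labels("master01.bx_hhtyr8"))
```

For the hostname in question, the label containing the underscore is the one reported as invalid; replacing the underscore with a hyphen would pass this check.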
03-22-2018
08:48 AM
1 Kudo
@Saravana V Can you please elaborate on which changes are not getting reflected? How are you verifying that the changes are not getting reflected? Do you see a "Restart Required" icon (a small yellow refresh icon) in the Ambari UI on the components whose configuration you changed? Have you restarted those components so that the config changes are pushed to the hosts on which the components are running?
03-21-2018
11:15 PM
@Dharmesh Jain Thanks for sharing the findings. It was really a very keen observation. Wonderful !!! I have incorporated the changes as you suggested above.
03-20-2018
11:41 AM
@Vinay K So if your NameNode shows that the LastCheckpoint time is around "Mon Mar 19 12:02:33 UTC 2018", then Ambari might be showing the right alert: "Last Checkpoint: [22 hours, 19 minutes, 45507 transactions]". So you should check on the NameNode side whether checkpointing is happening at regular intervals. Also, please check the following property value and the NameNode log for any checkpointing-related warnings or errors:
dfs.namenode.checkpoint.period
Specifies the number of seconds between two periodic checkpoints.
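To see how an elapsed time such as "22 hours, 19 minutes" relates to the checkpoint period, the arithmetic can be sketched as follows. This is a simplified illustration, not Ambari's actual alert logic; the function names, the "now" timestamp, and the period values in the demo are assumptions chosen to match the numbers in this thread.

```python
def elapsed_since_checkpoint(last_checkpoint_ms, now_ms):
    """Return (hours, minutes) elapsed between two epoch-millisecond timestamps."""
    elapsed_seconds = (now_ms - last_checkpoint_ms) // 1000
    hours, remainder = divmod(elapsed_seconds, 3600)
    return hours, remainder // 60


def is_checkpoint_overdue(last_checkpoint_ms, now_ms, period_seconds):
    """True when more than one checkpoint period has passed since the last checkpoint."""
    return (now_ms - last_checkpoint_ms) / 1000 > period_seconds


# Last checkpoint at Mon Mar 19 12:02:33 UTC 2018, expressed in milliseconds.
last_ms = 1521460953 * 1000
# Hypothetical "now": 22 hours and 19 minutes later.
now_ms = last_ms + (22 * 3600 + 19 * 60) * 1000

print(elapsed_since_checkpoint(last_ms, now_ms))
# Assumed period of 21600 seconds (6 hours); use your cluster's configured value.
print(is_checkpoint_overdue(last_ms, now_ms, period_seconds=21600))
```

If the elapsed time is several times the configured period, the checkpoint is genuinely late and the NameNode log is the place to look.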
03-20-2018
11:38 AM
@Vinay K In your epoch time command, please remove the last 3 digits to get an accurate date. The value is in milliseconds, while "date -d @..." expects seconds:
# date -d @1521460953
Mon Mar 19 12:02:33 UTC 2018
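The same conversion can be done in Python, which makes the milliseconds-to-seconds step explicit. This is a minimal sketch; the helper names are hypothetical, and the trailing "000" in the sample value is a placeholder for the three digits that were dropped above.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(epoch_millis):
    """Convert an epoch timestamp in milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_millis // 1000, tz=timezone.utc)


def format_like_date(dt):
    """Format a UTC datetime in the style printed by the `date` command above."""
    return dt.strftime("%a %b %d %H:%M:%S UTC %Y")


# Placeholder millisecond value: the seconds from the command above plus "000".
print(format_like_date(epoch_millis_to_utc(1521460953000)))
```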
03-20-2018
11:04 AM
@Vinay K Ambari basically relies on a NameNode JMX call to find the "LastCheckpointTime", something like this: https://github.com/apache/ambari/blob/trunk/ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_checkpoint_time.py#L172
# curl "http://hdfcluster1.example.com:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem" | grep 'LastCheckpointTime'
For example, if the above JMX call returns the epoch time '1521523640579', then convert it to human-readable time (dropping the last three digits, which are milliseconds) to find out exactly when the last checkpoint happened on the NameNode:
# date -d @1521523640
NOTE-1: If your Ambari cluster hosts are not time-synchronized, the last checkpoint computation might go wrong.
NOTE-2: Every cluster node (including the Ambari Server host) should be able to resolve the NameNode JMX URL. Otherwise, the host on which the alert is executed might not be able to make the JMX call to the NameNode, and the alert might give unknown results.
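If you prefer to script the check instead of using curl and grep, the JMX response can be parsed directly. The sketch below is not Ambari's implementation: `last_checkpoint_time` is a hypothetical helper, and the sample payload is a trimmed illustration of the JSON that the /jmx endpoint returns (a "beans" list), containing only the field of interest.

```python
import json
from datetime import datetime, timezone

# Trimmed, illustrative JMX response (an assumption); a real one has many more fields.
SAMPLE_JMX_RESPONSE = """{
  "beans": [
    {
      "name": "Hadoop:service=NameNode,name=FSNamesystem",
      "LastCheckpointTime": 1521523640579
    }
  ]
}"""


def last_checkpoint_time(jmx_json_text):
    """Return LastCheckpointTime as a UTC datetime, or None if the field is missing."""
    payload = json.loads(jmx_json_text)
    for bean in payload.get("beans", []):
        if "LastCheckpointTime" in bean:
            # The value is in epoch milliseconds; drop the millisecond part.
            seconds = int(bean["LastCheckpointTime"]) // 1000
            return datetime.fromtimestamp(seconds, tz=timezone.utc)
    return None


print(last_checkpoint_time(SAMPLE_JMX_RESPONSE))
```

In practice you would feed the function the body returned by the NameNode JMX URL; running it from the host where the alert executes also verifies NOTE-2, since a host that cannot resolve the URL will fail before any parsing happens.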