Member since
03-14-2016
4721
Posts
1111
Kudos Received
874
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2829 | 04-27-2020 03:48 AM |
| | 5503 | 04-26-2020 06:18 PM |
| | 4682 | 04-26-2020 06:05 PM |
| | 3712 | 04-13-2020 08:53 PM |
| | 5622 | 03-31-2020 02:10 AM |
08-21-2019
10:58 PM
1 Kudo
@Manoj690 Normally the following command should not take more than 5 seconds, unless the host where the Ambari agent runs it is seriously slow.
# /usr/jdk64/jdk1.8.0_112/bin/java -jar /var/lib/ambari-agent/tools/jcepolicyinfo.jar -tu
# time /usr/jdk64/jdk1.8.0_112/bin/java -jar /var/lib/ambari-agent/tools/jcepolicyinfo.jar -tu
Can you please run the same command as the user who runs the Ambari agent and capture how long it takes to execute? The second command also prints the elapsed time. If it also responds very slowly, then please collect the "top" and "free -m" output, which might give us some idea.
# top
# free -m
It might have been temporary slowness if a heavy job was running on the host at that time, so please retry starting HST to see whether the timeout still occurs. If your host is really slow and you cannot fix that, the alternative is the following:
1. Edit the following file and increase the timeout from 5 seconds to 10-15 seconds (the default value is 5 seconds).
# grep timeout /usr/lib/ambari-agent/lib/resource_management/core/resources/jcepolicyinfo.py
timeout = 5,
2. After increasing the timeout to 15 seconds, remove the pyc and pyo files on the host where it is timing out, and then retry starting HST.
# rm -f /usr/lib/ambari-agent/lib/resource_management/core/resources/jcepolicyinfo.pyc
# rm -f /usr/lib/ambari-agent/lib/resource_management/core/resources/jcepolicyinfo.pyo
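The timing check described above can be sketched as a small shell helper. This is only an illustration: `run_timed` is a hypothetical function name (not part of Ambari), it assumes a `date` that supports `+%s`, and the 5-second limit simply mirrors the default timeout mentioned in the post.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ambari): run a command, measure the
# elapsed wall-clock time in whole seconds, and compare it with a limit.
# The command's own output and exit status are ignored; only time matters.
run_timed() {
    limit="$1"
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    elapsed=$((end - start))
    if [ "$elapsed" -gt "$limit" ]; then
        echo "SLOW: ${elapsed}s (limit ${limit}s)"
        return 1
    fi
    echo "OK: ${elapsed}s (limit ${limit}s)"
}

# Example, using the paths from the post and run as the Ambari agent user:
# run_timed 5 /usr/jdk64/jdk1.8.0_112/bin/java \
#     -jar /var/lib/ambari-agent/tools/jcepolicyinfo.jar -tu
```

If this reports SLOW repeatedly while the host is otherwise idle, the slowness is probably not temporary, and raising the timeout as in step 1 may be the practical option.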
08-21-2019
05:59 PM
@harry_li So is your Ambari Server running fine now? Please share the Ambari server log if you still see the same MySQL error. For the tested and certified versions of MySQL that can be used with Ambari, please refer to the following link: https://supportmatrix.hortonworks.com/ If your question is answered, please make sure to mark the answer as the accepted solution. If you find a reply useful, say thanks by clicking on the thumbs up button.
08-21-2019
05:16 PM
@harry_li No, posting such a link will not be an issue. As @denloe mentioned, you should not face any issue from now on, so it should be all good. It looks like you are able to post new queries now, as I see your latest post here: Ambari-server start error
08-21-2019
05:06 PM
@harry_li It looks like Ambari is not able to connect to MySQL for some reason, and hence the connection pool is not getting populated. Most probably it is an issue with MySQL itself, a network issue (firewall, etc.), or a misconfiguration such as an incorrect MySQL host/port in the Ambari configs. So can you please test the connection by running this simple command from the Ambari server host to see if you get a successful connection response?
# /usr/jdk64/jdk1.8.0_112/bin/java -cp /var/lib/ambari-server/resources/DBConnectionVerification.jar:/usr/share/java/mysql-connector-java.jar org.apache.ambari.server.DBConnectionVerification "jdbc:mysql://xxxxxxxxxxx:3306/ambari" "ambari" "bigdata" com.mysql.jdbc.Driver
Please use your own MySQL username and password in the above command. The example uses "ambari" as the DB user name and the default password "bigdata". If it fails to connect to MySQL, then please check whether port access to the MySQL server is allowed at the network level from the Ambari server host:
# telnet $MYSQL_FQDN 3306
(OR)
# nc -v $MYSQL_FQDN 3306
If possible, please also try restarting your MySQL server once and then retry.
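As a rough sketch of the network-level check above, the host and port can be pulled out of the JDBC URL and probed with `nc`. The function names `jdbc_host_port` and `check_mysql_port` are hypothetical, and the sketch assumes an `nc` build that supports `-z` (probe without sending data) and `-w` (timeout in seconds), which not every netcat variant does.

```shell
#!/bin/sh
# Hypothetical helper: split "jdbc:mysql://HOST:PORT/DB" into "HOST PORT".
# Prints nothing when the URL does not have that exact shape.
jdbc_host_port() {
    printf '%s\n' "$1" |
        sed -n 's|^jdbc:mysql://\([^:/]*\):\([0-9][0-9]*\)/.*$|\1 \2|p'
}

# Hypothetical helper: probe the MySQL port named in a JDBC URL.
# Assumes nc supports -z and -w; adjust for your netcat variant.
check_mysql_port() {
    hp=$(jdbc_host_port "$1")
    if [ -z "$hp" ]; then
        echo "Could not parse host and port from: $1"
        return 2
    fi
    host=${hp% *}
    port=${hp#* }
    if nc -z -w 5 "$host" "$port"; then
        echo "Port $port on $host is reachable"
    else
        echo "Port $port on $host is NOT reachable"
        return 1
    fi
}

# Example (replace the URL with the one from your Ambari configuration):
# check_mysql_port "jdbc:mysql://$MYSQL_FQDN:3306/ambari"
```

A reachable port only rules out firewall and host/port problems; wrong credentials or a missing database would still need the DBConnectionVerification command to surface.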
08-21-2019
04:53 PM
@harry_li You just posted a community thread successfully. So are you able to post new topics now without any issue? Are you posting content with external links or special characters? Can you try posting a test topic, and if you get any error, please share it here?
08-21-2019
04:35 PM
@nanibigdata1 Are you talking about the Azure scale-down approach, in which Azure deletes unwanted hosts from the cluster when they are not needed? Ideally Azure should take care of deleting the unwanted hosts from Ambari as well. If that is not happening, can you please help us understand what is happening?
1. How was the node deleted from Azure? Using the scale-down approach, or was the node deleted manually?
2. Before a node is deleted from an Ambari cluster, the components running on it (DataNode, NodeManager, etc.) must first be decommissioned. If the node has master components, they need to be moved to another node first. Then the components are stopped, and only then can the components/host be deleted safely. Is that not happening at your end?
3. Even after the steps in point 2, do you see the host still listed in the "hosts" table of the Ambari database, or in the Ambari API response?
# curl -u admin:admin -H "X-Requested-By:ambari" -X GET http://$AMBARI_FQDN:8080/api/v1/clusters/$CLUSTER_NAME/hosts
(OR) from the Ambari database by running the following query:
# SELECT * FROM hosts;
Something similar, but not for an Azure environment, so please do not follow it as-is: https://community.cloudera.com/t5/Support-Questions/How-can-I-delete-the-host-from-ambari-server/m-p/206396
*NOTE:* If Azure has scaled down the Ambari cluster (that is, removed some nodes from it) but a removed host is still running the ambari-agent, then the agent might keep sending registration requests and heartbeats to the Ambari server. So please check whether you still see the following kind of messages in your ambari-server.log even after the node is removed from the cluster.
This is how the logging usually appears in ambari-server.log when a node is deleted from the cluster:
Decommissioning DATANODE on example.host1.com
Decommissioning NODEMANAGER example.host1.com
Received Delete request for host example.host1.com from cluster ExampleCluster.
Removing hosts [[example.host1.com]] from available hosts on hosts removed event.
If, after the above messages, you still see "Agent is still heartbeating" messages like the ones below in ambari-server.log, it indicates that the Ambari agent is still running on the removed node and keeps sending registration/heartbeat requests to the Ambari server, so you might see an entry for that host in the "hosts" table in the Ambari DB. In this case, ideally Azure (or whatever tooling removed the host) should have stopped the ambari-agent on the node immediately after the host deletion.
HeartBeatHandler:185 - Host: example.host1.com not found. Agent is still heartbeating.
Received host registration, host=[hostname=example.host1.com,.............,agentVersion=2.x.y.x
TopologyManager.onHostRegistered: Entering
So if Ambari keeps showing the old host even after the Azure node deletion, it might be because the agent kept running for some time after the deletion.
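To spot hosts whose agent keeps heartbeating after removal, the log check above can be sketched as a filter over ambari-server.log. `stale_heartbeat_hosts` is a hypothetical helper, it assumes the log lines look exactly like the sample in this post, and the log path in the example is an assumption about your installation.

```shell
#!/bin/sh
# Hypothetical helper: list each host that appears in an
# "Agent is still heartbeating" message, one host per line, de-duplicated.
# Reads the files given as arguments, or stdin when there are none.
stale_heartbeat_hosts() {
    sed -n 's/.*Host: \([^ ]*\) not found\. Agent is still heartbeating.*/\1/p' "$@" |
        sort -u
}

# Example (the log location is an assumption; adjust to your setup):
# stale_heartbeat_hosts /var/log/ambari-server/ambari-server.log
```

Any host printed by this filter is a candidate for stopping the ambari-agent on that node before cleaning up the stale entry.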
08-21-2019
05:31 AM
@Manoj690 Your output is not the same as mine. Please notice that in your case the file is owned by the "root" user, not the "hdfs" user. So please try this:
# chown hdfs:hadoop /hadoop/hdfs/namenode/current/VERSION
(OR recursively)
# chown -R hdfs:hadoop /hadoop/hdfs/
08-21-2019
05:14 AM
@Manoj690 We see the following error:
java.io.FileNotFoundException: /hadoop/hdfs/namenode/current/VERSION (Permission denied)
That is what causes the NameNode startup failure. Please check and fix the permissions on the mentioned file so that the owner of the NameNode process has read/write access to it. Normally it is owned by the hdfs user. Example:
# ls -l /hadoop/hdfs/namenode/current/VERSION
-rw-r--r--. 1 hdfs hadoop 206 Aug 13 23:10 /hadoop/hdfs/namenode/current/VERSION
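The ownership check above can be scripted roughly as follows. `owner_of` and `check_owner` are hypothetical helper names; the sketch reads the owner from the third field of `ls -ld`, as in the sample output, and checks ownership only, not the permission bits.

```shell
#!/bin/sh
# Hypothetical helper: print the owner of a path (third field of `ls -ld`).
owner_of() {
    ls -ld "$1" | awk '{ print $3 }'
}

# Hypothetical helper: compare the actual owner with the expected one.
# Returns 0 on match, 1 on mismatch, 2 when the path does not exist.
check_owner() {
    if [ ! -e "$1" ]; then
        echo "MISSING: $1"
        return 2
    fi
    actual=$(owner_of "$1")
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 is owned by $actual"
    else
        echo "MISMATCH: $1 is owned by $actual, expected $2"
        return 1
    fi
}

# Example, using the path and user from the post:
# check_owner /hadoop/hdfs/namenode/current/VERSION hdfs
```

A MISMATCH result for the VERSION file would match the "Permission denied" symptom when the NameNode runs as the hdfs user.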
08-21-2019
05:00 AM
@Manoj690 You can find the NameNode logs here:
# less /var/log/hadoop/hdfs/hadoop-hdfs-namenode-gaian-lap386.com.log
# less /var/log/hadoop/hdfs/hadoop-hdfs-namenode-gaian-lap386.com.out
Similarly, you can find the DataNode logs at a path like "/var/log/hadoop/hdfs/hadoop-hdfs-datanode-xxxxxxx.log". Can you please attach those logs here, or share the errors from the mentioned files? (Attaching the files is preferred.)
08-21-2019
04:53 AM
@Manoj690 It looks like the NameNode is not starting successfully. Can you please check and share the NameNode logs? It looks like you have also opened a similar thread here: https://community.cloudera.com/t5/Support-Questions/In-ambari-all-my-servies-are-down/m-p/268512/highlight/false#M206243
Please use the same troubleshooting steps to collect the data. For example, check whether the NameNode process is running:
# ps -ef | grep -i NameNode
And whether the NameNode port 50070 is listening:
# netstat -tnlpa | grep 50070
If not, then please check and share the NameNode log. If possible, try to attach the full NameNode logs.
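A plain `grep 50070` also matches established connections and ports such as 150070, so the port check above can be tightened with a small filter. `is_listening` is a hypothetical helper; it assumes the usual Linux `netstat -tn` column layout, with the local address in column 4 and the state in column 6.

```shell
#!/bin/sh
# Hypothetical helper: read netstat-style lines on stdin and succeed only
# if some socket is in LISTEN state on exactly the given local port.
# Assumes column 4 is the local address and column 6 is the state.
is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example, using the NameNode port from the post:
# netstat -tnl | is_listening 50070 && echo "Port 50070 is listening"
```

If the process shows up in `ps` but this check fails, the NameNode has probably not finished starting (or failed during startup), and its log is the next place to look.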