Support Questions

mike_bronson7 · ‎02-27-2018

We are currently performing a rolling upgrade to HDP 2.6.4.

During the upgrade, the process stopped on a service check issue.

We see the same issue on all services:

Python script has been killed due to timeout after waiting 300 secs

Is there a way to increase the timeout, or some other way to avoid this problem?

Michael-Bronson

jsensharma · ‎02-27-2018

@Michael Bronson

The default timeout for Python-based service checks is defined as 300 seconds, so one option is to increase it to a higher value such as 450 or 600 and see if that works:

# grep -A3 -B2 'service_check.py' /var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/metainfo.xml 
      <commandScript>
        <script>scripts/service_check.py</script>
        <scriptType>PYTHON</scriptType>
        <timeout>300</timeout>
      </commandScript>

However, the problem here looks like it may be on the YARN side, because a service check normally does not take the full 300 seconds. So it would be better to check the health of HDFS and YARN by looking at their logs for any reported errors.
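
If you do want to try a higher value, here is a minimal sketch of that edit. It assumes the metainfo.xml has the same layout as the grep output above (the <timeout> element sits inside the same commandScript block as scripts/service_check.py). The function name set_service_check_timeout is hypothetical and not part of Ambari, and the restart step in the comment is my assumption about how the change gets picked up.

```shell
# Hypothetical helper (not part of Ambari): raise the <timeout> of the
# service_check.py commandScript in a given metainfo.xml.
set_service_check_timeout() {
  file="$1"          # path to the service's metainfo.xml
  new_timeout="$2"   # new timeout in seconds, e.g. 600

  # Refuse anything that is not a plain non-negative integer.
  case "$new_timeout" in
    ''|*[!0-9]*) echo "timeout must be a number of seconds" >&2; return 1 ;;
  esac

  # Keep a backup, then rewrite only the <timeout> that sits between the
  # service_check.py <script> line and its closing </commandScript> tag.
  cp -p "$file" "$file.bak" || return 1
  sed '/<script>scripts\/service_check\.py<\/script>/,/<\/commandScript>/ s#<timeout>[0-9]*</timeout>#<timeout>'"$new_timeout"'</timeout>#' \
    "$file.bak" > "$file"
}

# Example (path taken from the grep above; run on the Ambari server host):
#   set_service_check_timeout /var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/metainfo.xml 600
#   ambari-server restart   # assumption: needed so Ambari rereads the stack definition
```

Other commandScript blocks in the same file are left untouched, and the original is kept next to it as metainfo.xml.bak so the change is easy to revert.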

mike_bronson7 · ‎02-27-2018

The YARN service check failed, but the HDFS one did not. Also, when we restart YARN it restarts successfully, so I guess a service check can fail even though the service restart completed.

Michael-Bronson

jsensharma · ‎02-27-2018

@Michael Bronson

Yes, service checks can still fail even after a successful restart of the YARN service. This is because the YARN service check runs jobs like the following, which might fail due to memory issues (even though the RM and NMs might be running fine):

Example:

# yarn org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls -num_containers 1 -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar -timeout 300000 --queue default

So if the service check is failing, we should check the logs to find out why it failed. We might see errors in the YARN logs indicating a memory issue, container creation issues, or something else.
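
As a rough starting point for that log check, something like the sketch below can help. The function name scan_yarn_logs is hypothetical, the search patterns are only examples of messages that commonly point to memory or container problems, and the log directory in the comment is the usual HDP default, which may differ on your cluster.

```shell
# Hypothetical helper: print (with file name and line number when several
# files are given) the lines in YARN log files that look like errors or
# memory-related container failures.
scan_yarn_logs() {
  grep -n -E 'ERROR|FATAL|OutOfMemoryError|beyond (physical|virtual) memory limits' "$@"
}

# Example usage (paths and the application id are placeholders):
#   scan_yarn_logs /var/log/hadoop-yarn/yarn/*.log     # assumed default HDP log directory
#   yarn application -list -appStates FAILED,KILLED    # find the failed service check job
#   yarn logs -applicationId <application_id>          # needs log aggregation to be enabled
```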

Cloudera Community

Rolling Upgrade process stops on service check