Unable to connect to resourcemanager at 0.0.0.0:8030 when running query on beeline with tez

Explorer

Hi,

I have a Hadoop cluster with the NameNode and ResourceManager on one server, a DataNode on another server, and Hive and Tez on a third server.

I get an error when running a query in Beeline. Below are the YARN logs; the application keeps retrying the connection:

2024-10-31 15:57:49,806 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |rm.TaskSchedulerManager|: Creating TaskScheduler: Local TaskScheduler with clusterIdentifier=111101111
2024-10-31 15:57:49,813 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |rm.YarnTaskSchedulerService|: YarnTaskScheduler initialized with configuration: maxRMHeartbeatInterval: 1000, containerReuseEnabled: true, reuseRackLocal: true, reuseNonLocal: false, localitySchedulingDelay: 250, preemptionPercentage: 10, preemptionMaxWaitTime: 60000, numHeartbeatsBetweenPreemptions: 3, idleContainerMinTimeout: 5000, idleContainerMaxTimeout: 10000, sessionMinHeldContainers: 0
2024-10-31 15:57:49,817 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |client.RMProxy|: Connecting to ResourceManager at /0.0.0.0:8030
2024-10-31 15:57:50,834 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |ipc.Client|: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-10-31 15:57:51,836 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |ipc.Client|: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-10-31 15:57:52,837 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |ipc.Client|: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. 

 

Troubleshooting I have done so far:

I checked the yarn-site.xml file on all instances. The hostname and all three ResourceManager addresses are set:

 

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>node1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node1:8031</value>
  </property>
  <property>
    <name>yarn.nodemanager.address</name>
    <value>node1:59392</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.auxservices.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>124491</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>125</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>50115</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>54</value>
  </property>

</configuration>

 

I also checked connectivity:

telnet node1 8030

This works.

ping node1

This also works.

I checked /etc/hosts, and it also looks fine.
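
The telnet check can be scripted so it is easy to repeat from each host involved (the Hive/Tez host and the hosts where YARN containers run). Below is a minimal sketch in Python using only the standard library. The function name `can_connect` is made up for this example; `node1` and `8030` are the values from the yarn-site.xml above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example call, using the scheduler address from the configuration above:
# print("node1:8030 reachable:", can_connect("node1", 8030))
```

A successful result only shows that the port is reachable from that host. It does not show which ResourceManager address a running process actually loaded from its configuration.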

 

 

 

 

1 ACCEPTED SOLUTION

Explorer

Thanks for the suggestion. The issue has been resolved. We had added a new DataNode and then restarted the NameNode, ResourceManager, DataNode, and NodeManager, but not HiveServer2. As a result, Hive had not loaded the updated configuration. After restarting HiveServer2, it started working.


3 REPLIES

Community Manager

@rsurti, Welcome to our community! To help you get the best possible answer, I have tagged our experts @asish @udeshmukh who may be able to assist you further.

Please feel free to provide any additional information or details about your query. We hope that you will find a satisfactory solution to your question.



Regards,

Vidya Sargur,
Community Manager



Contributor

@rsurti 

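The ApplicationMaster is trying to connect to the ResourceManager at 0.0.0.0:8030, that is, on its own host, and it fails because no ResourceManager is listening there. 0.0.0.0 is the default value of yarn.resourcemanager.hostname, so this address usually means the process did not pick up the configured scheduler address.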

 

2024-10-31 15:57:52,837 [INFO] [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerManager] |ipc.Client|: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. 

The above suggests a misconfiguration: the YARN configuration files are missing or do not have the proper contents on those hosts.

 

Have you run CM > YARN > Actions > Deploy Client Configuration? If not, could you try it?
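
One way to see what a given host's yarn-site.xml would resolve to is sketched below. This is only an illustration in Python, not a Hadoop tool. The function names are hypothetical, only the two properties shown are handled, and the stock defaults from yarn-default.xml are assumed (`yarn.resourcemanager.hostname` = `0.0.0.0` and `yarn.resourcemanager.scheduler.address` = `${yarn.resourcemanager.hostname}:8030`). Under those assumptions, an empty or missing file resolves to `0.0.0.0:8030`, which matches the address in the log.

```python
import xml.etree.ElementTree as ET

# Stock defaults from yarn-default.xml (assumed here, not read from a cluster).
DEFAULT_HOSTNAME = "0.0.0.0"
DEFAULT_SCHEDULER = "${yarn.resourcemanager.hostname}:8030"


def load_properties(xml_text: str) -> dict:
    """Parse Hadoop-style <configuration> XML into a name -> value dict."""
    props = {}
    if not xml_text.strip():
        return props  # treat an empty file like a missing one
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def effective_scheduler_address(xml_text: str) -> str:
    """Resolve the scheduler address, falling back to the assumed defaults."""
    props = load_properties(xml_text)
    hostname = props.get("yarn.resourcemanager.hostname", DEFAULT_HOSTNAME)
    address = props.get("yarn.resourcemanager.scheduler.address", DEFAULT_SCHEDULER)
    # Expand the one variable reference used by the default value.
    return address.replace("${yarn.resourcemanager.hostname}", hostname)


# Usage (the path is a placeholder; it depends on the installation):
# with open("/path/to/yarn-site.xml") as f:
#     print(effective_scheduler_address(f.read()))
```

Note that this reads the file on disk. A long-running service that was started before the file changed may still hold the old values until it is restarted.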

 

@VidyaSargur We might need YARN experts on this.
