Zeppelin YARN Connection

Expert Contributor

I have realized from the following logs that Zeppelin cannot connect to the YARN ResourceManager. I checked the YARN configs in Ambari:

yarn.resourcemanager.address=hadooptest.datalonga.com:8050

I also checked with telnet, and I couldn't connect to hadooptest.datalonga.com:8050 either.
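
The telnet check can also be scripted so it is easy to repeat from the Zeppelin host. What follows is a minimal sketch, not part of any Zeppelin or Hadoop tooling; the function name `can_connect` and the 3-second timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect("hadooptest.datalonga.com", 8050)` from the Zeppelin node would be the equivalent of the telnet test: a False result only says the port is unreachable from that node, not why.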

Zeppelin seems to be running, but I don't understand why the Zeppelin log directory is full of INFO logs like these.

The logs are taken from /var/log/zeppelin/zeppelin-interpreter-spark2-sirin_erkan-spark-zeppelin-hadooptest01

INFO [2018-07-06 14:45:58,298] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:45:59,299] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:00,300] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:01,302] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:02,303] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:03,304] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:04,305] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:05,306] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:06,308] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:07,309] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:08,311] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 10 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:09,312] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 11 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:10,313] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 12 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:11,315] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 13 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:12,316] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 14 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:13,317] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 15 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:14,318] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 16 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:15,319] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 17 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:16,321] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 18 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:17,322] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 19 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:18,323] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 20 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:19,324] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 21 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:20,325] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 22 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:21,326] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 23 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:22,328] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 24 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:23,329] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 25 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:24,331] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 26 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:25,332] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 27 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 INFO [2018-07-06 14:46:26,333] ({pool-2-thread-2} Client.java[handleConnectionFailure]:906) - Retrying connect to server: hadooptest.datalonga.com/10.07.07.3:8050. Already tried 28 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)

@Erkan ŞİRİN

The Zeppelin UI has no dependency on YARN, so it is expected to start and work fine even if YARN is down.

However, some interpreters, such as spark/spark2, launch applications on YARN. The log above is from the spark2 interpreter:

/var/log/zeppelin/zeppelin-interpreter-spark2-sirin_erkan-spark-zeppelin-hadooptest01

When the spark2 application is submitted, it is normal to see these messages if there is a connectivity issue with YARN.

HTH


Expert Contributor

Thank you for your reply, @Felix Albani. But these are INFO logs, not ERROR. The problem is that unnecessary INFO logs are generated because the ResourceManager cannot be reached. On my Sandbox the ResourceManager address is sandbox-hdp.hortonworks.com:8032, and when I test it with telnet it works fine, which is not the case in my prod environment.

[root@sandbox-hdp starter]# telnet sandbox-hdp.hortonworks.com 8032
Trying 172.17.0.2...
Connected to sandbox-hdp.hortonworks.com.
Escape character is '^]'.

@Erkan ŞİRİN Yes, it is logged at INFO while retrying; it will then fail with an error once the number of retries is exhausted. Check the value of yarn.resourcemanager.address in yarn-site.xml, and make sure there is no firewall or network issue between the Zeppelin node and YARN.
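
To illustrate why the messages stay at INFO until the retries run out, here is a minimal Python sketch of a retry-up-to-a-maximum-count policy with a fixed sleep. It is not Hadoop's actual implementation; the function name, the logger name, and the injectable `sleep` parameter are assumptions made for this example, and only the defaults (50 retries, 1 second) mirror the policy shown in the log.

```python
import logging
import time

log = logging.getLogger("retry_sketch")


def call_with_fixed_sleep_retry(action, max_retries=50, sleep_seconds=1.0,
                                sleep=time.sleep):
    """Call action() until it succeeds or max_retries retries have failed."""
    attempt = 0
    while True:
        try:
            return action()
        except ConnectionError as exc:
            if attempt >= max_retries:
                # Retries exhausted: only now does the failure become an error.
                log.error("Giving up after %d retries: %s", attempt, exc)
                raise
            # Each failed attempt before that is reported at INFO level.
            log.info("Retrying connect. Already tried %d time(s)", attempt)
            attempt += 1
            sleep(sleep_seconds)
```

With the defaults, an unreachable server produces 50 INFO lines at roughly one per second before the exception surfaces, which matches the pattern in the interpreter log above.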

Expert Contributor

Hi @Felix Albani. There are no network or firewall issues. How did I find that out? I ran the following command on the ResourceManager host:

netstat -an | grep 8050

Nothing.

So no service is running on port 8050, and nothing is listening there.

@Erkan ŞİRİN In the description above you mentioned that yarn.resourcemanager.address=hadooptest.datalonga.com:8050, so if nothing is listening on that port, perhaps the RM is not starting correctly? Also, do you have RM HA?

HTH

Expert Contributor

Hi @Felix Albani, I use YARN HA. All YARN services, including the ResourceManager, are up and running.

79486-resource-manager-up-and-running.png

@Erkan ŞİRİN The property yarn.resourcemanager.address is redundant when using YARN RM HA.

Please review this link https://community.hortonworks.com/questions/39671/what-is-the-default-yarn-resource-manager-port-is....

Based on this, you may want to check whether the yarn-site.xml on the Zeppelin host contains valid settings. Also look under the /etc/spark/conf directory for a copy of core-site.xml/yarn-site.xml that might be outdated.
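
One way to compare copies of yarn-site.xml is to print the ResourceManager addresses each copy actually defines. The following is a minimal Python sketch under some assumptions: the helper names are made up for this example, and it only reports values written explicitly in the file (it does not apply Hadoop's default ports or any other defaults). The property names yarn.resourcemanager.ha.enabled, yarn.resourcemanager.ha.rm-ids, yarn.resourcemanager.address.<rm-id> and yarn.resourcemanager.hostname.<rm-id> are the standard YARN HA settings.

```python
import xml.etree.ElementTree as ET


def load_hadoop_conf(xml_data):
    """Parse Hadoop-style <configuration> XML (str or bytes) into a dict."""
    root = ET.fromstring(xml_data)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def resource_manager_addresses(conf):
    """Return the RM addresses explicitly configured, keyed by RM id."""
    if conf.get("yarn.resourcemanager.ha.enabled", "false").lower() == "true":
        raw_ids = conf.get("yarn.resourcemanager.ha.rm-ids", "")
        rm_ids = [i.strip() for i in raw_ids.split(",") if i.strip()]
        # With HA, each RM has its own suffixed address (or at least hostname).
        return {
            rm_id: conf.get(f"yarn.resourcemanager.address.{rm_id}")
            or conf.get(f"yarn.resourcemanager.hostname.{rm_id}")
            for rm_id in rm_ids
        }
    # Without HA, the single unsuffixed property is the one that matters.
    return {"default": conf.get("yarn.resourcemanager.address")}
```

For example, `resource_manager_addresses(load_hadoop_conf(open(path, "rb").read()))` could be run once per copy of yarn-site.xml found on the Zeppelin host (the exact paths depend on the installation) and the results compared; a copy that still points at a single address on port 8050 while the cluster uses RM HA would be a likely candidate for the outdated file.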

HTH
