About prasannasaraf18

prasannasaraf18 · ‎07-05-2018

Short answer: Turn off scatter-gather.

Long version: Data transfer between the container and the shuffle service happens through RPC calls (ChunkFetchRequest, ChunkFetchSuccess and ChunkFetchFailure). On further debugging with trace-level logs, we found that RPC calls were indeed happening between the container and the shuffle service, but after some time they stopped abruptly: no more RPC calls were logged by either the shuffle service or the container.

Looking into the kernel and system activity logs, we found the following:

xen_netfront: xennet: skb rides the rocket: 19 slots

That means our EC2 machines were experiencing network packet loss. More info on this log message can be found in the following thread: http://www.brendangregg.com/blog/2014-09-11/perf-kernel-line-tracing.html

So we tried turning off scatter-gather using the following command:

sudo ethtool -K eth0 sg off

The error was gone after that.
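For anyone retracing these steps, here is a minimal shell sketch of the check and the workaround described in the post. The function names count_rocket_events and disable_scatter_gather are hypothetical helpers introduced only for illustration, and eth0 is simply the interface named in the post; yours may differ.

```shell
#!/bin/sh
# Count kernel log lines carrying the xen_netfront warning quoted in the post.
# Reads log text on stdin, so it can be fed from dmesg or a saved log file.
count_rocket_events() {
    # grep -c prints the match count; "|| true" keeps a zero count from
    # being treated as a failure by the calling script.
    grep -c 'skb rides the rocket' || true
}

# Hypothetical helper: disable scatter-gather on one interface (needs root).
# Defaults to eth0, the interface used in the post.
disable_scatter_gather() {
    iface="${1:-eth0}"
    sudo ethtool -K "$iface" sg off
    # Print the scatter-gather offload lines to confirm the new state.
    ethtool -k "$iface" | grep 'scatter-gather'
}

# Example usage (nothing below runs automatically):
#   dmesg | count_rocket_events
#   disable_scatter_gather eth0
```

Note that a change made with ethtool -K is normally lost on reboot, so it would have to be reapplied at boot if the workaround is still needed.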

TarunParimi · ‎05-03-2018

I didn't notice that you were only setting YARN_RESOURCEMANAGER_OPTS. This environment variable is used only by the ResourceManager daemon. To specify the options for all hadoop and yarn client commands, you can use HADOOP_CLIENT_OPTS in hadoop-env.sh:

export HADOOP_CLIENT_OPTS="-Dyarn.resourcemanager.hostname=192.168.33.33"

But I am not sure why you would need to do this when you can just set it in yarn-site.xml, which is the recommended approach.
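If hadoop-env.sh already sets other client options, a plain export would overwrite them. The sketch below is one possible variation that appends instead; RM_HOST is a hypothetical variable introduced here for illustration, and the address is the one from the post.

```shell
# Sketch for hadoop-env.sh: append the ResourceManager hostname to any
# client options that are already set, rather than replacing them.
# RM_HOST is a hypothetical variable, not a Hadoop setting.
RM_HOST="192.168.33.33"

# ${VAR:+...} expands to the existing value plus a space only when
# HADOOP_CLIENT_OPTS is already non-empty, avoiding a leading space.
export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:+$HADOOP_CLIENT_OPTS }-Dyarn.resourcemanager.hostname=${RM_HOST}"
```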

Last Visited	‎07-07-2018 05:18 AM

Member Since	‎04-30-2018 05:51 AM
Posts	4
Kudos received	1

Cloudera Community

Re: External Shuffle service connection idle for m...

Re: Yarn commands not reading resource manager hos...