Member since: 07-20-2014
Posts: 39
Kudos Received: 4
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2645 | 08-10-2015 05:07 PM |
| | 3636 | 02-25-2015 03:10 AM |
10-11-2017
03:49 PM
Hi Rabk, The log4j WARN message you provided shows a task that's failing with a FetchFailedException because a shuffle file (shuffle_0_2_0.index) can't be found; it does not show what the job ultimately fails with, or what else transpires during the job run. But let's assume the job fails because a stage has failed 4 times due to the fetch failures. One possible cause of the FetchFailedException is that you are running out of space on the NodeManager local-dirs (where shuffle files are stored), so look at the NodeManager logs (on datanode 2), from the timeframe when the job ran, for bad disk/local-dirs messages. When that happens, YARN sends a SIGTERM to the containers/executors, as you have observed. Are you able to run the query on just a fraction of the current data, and does it succeed when no other jobs are running on the cluster?
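A quick way to check the disk-space hypothesis on the affected host is to inspect the local dirs and the NodeManager log directly. The paths below are assumptions for illustration; the real values come from yarn.nodemanager.local-dirs in yarn-site.xml and from your cluster's log directory layout.

```shell
# Check free space on the NodeManager local dir.
# "/yarn/nm" is an assumed example path; substitute the value of
# yarn.nodemanager.local-dirs from your yarn-site.xml.
df -h /yarn/nm

# Scan the NodeManager log on the affected host (datanode 2) for
# disk-health warnings; the log path/name varies by distribution.
grep -i "local-dirs" /var/log/hadoop-yarn/*.log*
```

By default YARN marks a local dir bad once it passes 90% utilization (yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage), so a nearly full disk in the df output lines up with the SIGTERM behavior described above.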
08-10-2015
05:07 PM
Hi,
As described in the sort-based shuffle design doc (https://issues.apache.org/jira/secure/attachment/12655884/Sort-basedshuffledesign.pdf), each map task should generate one shuffle data file and one index file.
Regarding your second question, the property to specify the buffer for shuffle data is "spark.shuffle.memoryFraction". This is discussed in more detail in the following Cloudera blog:
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
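For reference, the property can be set per job on the spark-submit command line. This is an illustrative sketch: the 0.3 value, class name, and jar are made-up examples, not recommendations, and note that in Spark 1.6+ this property was superseded by the unified memory management settings (spark.memory.fraction).

```shell
# Illustrative only: raise the shuffle buffer fraction for a single job
# (Spark 1.x property; default is 0.2).
# com.example.MyJob and my-job.jar are hypothetical placeholders.
spark-submit \
  --conf spark.shuffle.memoryFraction=0.3 \
  --class com.example.MyJob \
  my-job.jar
```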
Regards,
Bjorn
02-25-2015
03:10 AM
Hi, The stack trace reported here is identical to MAPREDUCE-5799. It's a classpath issue that can be resolved by adding the following property to your client configuration:

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
</property>
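If you want to confirm that the native libraries at that path are actually loadable on the client, Hadoop ships a checknative utility that reports which native components (zlib, snappy, etc.) were found; this is a general diagnostic suggestion, not part of the original fix.

```shell
# List the native libraries Hadoop can load on this host;
# -a exits non-zero if any expected library is missing.
hadoop checknative -a
```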