About bjorn.jonsson

rabk · ‎10-18-2017

Hi there, Thank you for following up. We have identified the cause and resolved it: our DNS was set up incorrectly. This issue can be closed. Thank you, Vishal

bjorn.jonsson · ‎08-10-2015

Hi, As described in the sort-based shuffle design doc (https://issues.apache.org/jira/secure/attachment/12655884/Sort-basedshuffledesign.pdf), each map task should generate 1 shuffle data file and 1 index file. Regarding your second question, the property that specifies the buffer for shuffle data is "spark.shuffle.memoryFraction". This is discussed in more detail in the following Cloudera blog: http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/ Regards, Bjorn
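To make the file count concrete, here is a minimal Python sketch of the arithmetic. The function names are hypothetical, and the comparison with hash-based shuffle assumes one file per (map, reduce) pair with no file consolidation, which is not something the post above states.

```python
def sort_shuffle_files(num_map_tasks: int) -> int:
    """Sort-based shuffle: each map task writes 1 data file and 1 index file."""
    return 2 * num_map_tasks


def hash_shuffle_files(num_map_tasks: int, num_reduce_tasks: int) -> int:
    """Assumption: hash-based shuffle without consolidation writes
    one file per (map task, reduce task) pair."""
    return num_map_tasks * num_reduce_tasks


if __name__ == "__main__":
    # Hypothetical job size, chosen only for illustration.
    maps, reduces = 1000, 200
    print("sort-based:", sort_shuffle_files(maps))
    print("hash-based:", hash_shuffle_files(maps, reduces))
```

Under these assumptions the sort-based count depends only on the number of map tasks, not on the number of reducers.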

bjorn.jonsson · ‎02-25-2015

Hi, The stack trace reported here is identical to the one in MAPREDUCE-5799. It's a classpath issue that can be resolved by adding the following property to your client configurations:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
</property>
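One way to confirm that the property actually made it into a client configuration is to parse the file and look it up. The following is a rough Python sketch, not a Cloudera tool: the helper name get_property is hypothetical, and the embedded sample assumes the usual Hadoop layout where properties sit under a <configuration> root element.

```python
import xml.etree.ElementTree as ET

# Sample client configuration (assumed layout: <configuration> root element).
CONFIG = """<configuration>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native</value>
  </property>
</configuration>"""


def get_property(xml_text, name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    print(get_property(CONFIG, "yarn.app.mapreduce.am.env"))
```

To check a real file, read its contents and pass them to get_property in place of CONFIG.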

Last Visited	‎02-14-2020 09:04 AM

Member Since	‎07-20-2014 05:32 PM
Posts	39
Kudos received	4

Cloudera Community

Re: Number of intermediate files with Sort shuffle...

Re: Ubertasks fails finding Snappy

Re: Spark job running from a spark-shell fails wit...
