Reducer tasks take a long time
Labels: Apache Hadoop
Created 02-17-2016 11:56 AM
Hi, what can I do to improve the reducer time? I have 107 mappers and just 1 reducer, so which parameters could I change? Maybe this one?
mapreduce.job.counters.max
Thanks
Created 02-17-2016 12:41 PM
I would look at setting intermediate compression on the map output and output compression on the reduce output. You can also look at using a combiner class. A driver sketch wiring these up is below.
For map output compression:
mapreduce.map.output.compress
mapreduce.map.output.compress.codec
and for output compression:
mapreduce.output.fileoutputformat.compress
mapreduce.output.fileoutputformat.compress.type
mapreduce.output.fileoutputformat.compress.codec
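Here is a minimal driver sketch showing where those properties and the combiner go (MyMapper and MyReducer are hypothetical placeholder classes for a word-count-style job, and Snappy is just one codec choice):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.SnappyCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CompressedJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Compress intermediate map output to cut shuffle traffic to the single reducer.
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec",
                SnappyCodec.class, CompressionCodec.class);
        // Compress the final reduce output as well.
        conf.setBoolean("mapreduce.output.fileoutputformat.compress", true);
        conf.set("mapreduce.output.fileoutputformat.compress.type", "BLOCK");
        conf.setClass("mapreduce.output.fileoutputformat.compress.codec",
                SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "compressed job");
        job.setJarByClass(CompressedJob.class);
        job.setMapperClass(MyMapper.class);     // hypothetical mapper class
        job.setCombinerClass(MyReducer.class);  // combiner pre-aggregates map output before the shuffle
        job.setReducerClass(MyReducer.class);   // hypothetical reducer class
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note that mapreduce.output.fileoutputformat.compress.type only takes effect for SequenceFile output, and a combiner is only valid when the reduce logic is commutative and associative (sums, counts, max, etc.).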
Created 02-17-2016 12:01 PM
Do you have a support contract? Please install SmartSense for better utilization of your cluster.
Created 02-17-2016 12:09 PM
Hi, we don't have one yet. Is SmartSense free?
Thanks
Created 02-17-2016 12:20 PM
@Roberto Sancho It's part of HDP and available to supported customers. http://hortonworks.com/blog/introducing-hortonworks-smartsense/
Created 02-17-2016 12:00 PM
It's mapred.reduce.tasks. If you run a MapReduce program from the Hadoop client, you would set it like this:
-Dmapred.reduce.tasks=x
Pig and Hive have their own ways of estimating the number of reducers. See the sketch below.
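Keep in mind that -D generic options are only picked up when the driver runs through ToolRunner. A minimal sketch (the class and job names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        // getConf() already contains anything passed as -Dkey=value on the command line.
        Job job = Job.getInstance(getConf(), "my job");
        job.setJarByClass(MyDriver.class);
        job.setNumReduceTasks(4); // programmatic equivalent of -Dmapred.reduce.tasks=4
        // ... set mapper, reducer, input and output paths as usual ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
}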
Created 02-17-2016 12:49 PM
This -Dmapred.reduce.tasks=x is for MapReduce 1. I am using MapReduce 2 with YARN and I don't know how to change this parameter.
Any suggestions?
Thanks
Created 02-17-2016 12:59 PM
It still works on YARN. The official new property is mapreduce.job.reduces, but I have always used the one above and Hadoop still accepts it.
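A quick sanity-check sketch (assuming Hadoop 2, where deprecated keys are translated to their new names once the MapReduce config classes load):

import org.apache.hadoop.mapred.JobConf;

public class DeprecatedKeyCheck {
    public static void main(String[] args) {
        JobConf conf = new JobConf();
        conf.set("mapred.reduce.tasks", "4");                  // deprecated MR1 name
        System.out.println(conf.get("mapreduce.job.reduces")); // should print 4 under YARN
    }
}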
Created 02-17-2016 12:59 PM
@Roberto Sancho here's a list of all the deprecated mapred properties and their replacements:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
The property you're looking for is mapreduce.job.reduces.
