Question about map output

I'm looking for clarification on something I've read in Hadoop: The Definitive Guide.  It states that map output is local and not in HDFS.  But the map task runs on the node where the data resides (usually) and that is HDFS, correct?  Or is it the case that the I/O done by the map is standard Java I/O and not something like hadoop fs -put?

 

Thanks in advance to all who reply.

Kevin
1 ACCEPTED SOLUTION

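The map task's intermediate output is not stored in HDFS. It is written to
temporary directories on the local filesystem of the node running the task
(see the property mapreduce.cluster.local.dir), using standard file I/O.
The task reads its input from HDFS, but what it writes stays on local disk.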

https://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-de...
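
To make "standard file I/O" concrete, here is a minimal sketch in plain Java. It is not Hadoop's actual code: the class LocalSpillSketch and its methods are hypothetical names, and the scratch directory only stands in for one of the directories listed in mapreduce.cluster.local.dir. The point is that the write goes through ordinary java.nio calls to the local disk, with no HDFS client involved (nothing like hadoop fs -put).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hypothetical stand-in for how a task could write
// intermediate records to node-local disk. Not taken from Hadoop's source.
class LocalSpillSketch {

    // Creates a scratch directory on the local filesystem. Assumption: it plays
    // the role of one directory listed in mapreduce.cluster.local.dir.
    static Path newScratchDir() {
        try {
            return Files.createTempDirectory("mapred-local-sketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the records, one per line, to a new file inside localDir using
    // ordinary Java file I/O. No HDFS API is called.
    static Path writeSpill(Path localDir, List<String> records) {
        try {
            Path spill = Files.createTempFile(localDir, "spill", ".out");
            Files.write(spill, records, StandardCharsets.UTF_8);
            return spill;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the records back from the local file.
    static List<String> readSpill(Path spill) {
        try {
            return Files.readAllLines(spill, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = newScratchDir();
        Path spill = writeSpill(dir, Arrays.asList("apple\t1", "pear\t2"));
        System.out.println("Wrote " + readSpill(spill).size() + " records to " + spill);
    }
}
```

The file lands in a directory on the machine that ran the code and is invisible to HDFS commands, which is the distinction the book is drawing.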

Regards,
Gautam Gopalakrishnan

