HDFS - MapReduce -> Basic Questions
- Labels: Apache Hadoop, Cloudera DataFlow (CDF)
Created ‎10-10-2016 09:27 PM
- Is placing a data file on HDFS done through MapReduce?
- Are all transactions in HDFS carried out using MapReduce jobs?

Does anyone know the answer? Many thanks!
Created ‎10-10-2016 09:31 PM
HDFS is used for storage: it provides data redundancy along with support for parallelism at read and write time. MapReduce is a computation framework that lets you process and generate large data sets with a parallel, distributed algorithm on a cluster. Other frameworks, such as Spark and Tez, can do the same.

For your specific questions:

1. Is placing a data file on HDFS done through MapReduce? You are not limited to writing to HDFS through MapReduce; you can take advantage of other frameworks, or the FileSystem API directly, to read and write (see the sketch below).
2. Are all transactions in HDFS carried out using MapReduce jobs? No.
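For example, here is a minimal sketch of writing a file to HDFS directly through the Java FileSystem API, with no MapReduce job involved. The NameNode URI and the target path are placeholders; adjust them for your cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode URI -- replace with your cluster's address.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        FileSystem fs = FileSystem.get(conf);

        // Create and write a file purely through the client-side
        // FileSystem API; no MapReduce job is submitted anywhere.
        try (FSDataOutputStream out = fs.create(new Path("/user/demo/hello.txt"))) {
            out.writeUTF("Hello, HDFS!");
        }
        fs.close();
    }
}
```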
Created ‎10-10-2016 09:34 PM
But when you put a file into HDFS, are you using a MapReduce job (even if you don't see it)?
Created ‎10-10-2016 09:37 PM
You mean putting a file with `hadoop fs -put`? If so, then no: it uses the FileSystem API to write to HDFS, and no MapReduce job is launched.
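As a rough sketch of what `-put` does under the hood, the client simply copies the local file through the FileSystem API. The file names here are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Roughly equivalent to `hadoop fs -put data.csv /user/demo/`:
        // a plain client-side copy via the FileSystem API, no MapReduce.
        fs.copyFromLocalFile(new Path("data.csv"),
                             new Path("/user/demo/data.csv"));
        fs.close();
    }
}
```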
