How are read and write done in parallel in HDFS?
Labels: Apache Hadoop
Created 03-27-2018 09:43 AM
Created 03-27-2018 10:09 AM
HDFS follows a write-once-read-many model: only one client can write to a file at a time, and multiple clients cannot write to the same HDFS file simultaneously. When the NameNode grants a client permission to write, it takes a lease (effectively a write lock) on that file, and the lease is held until the write operation completes. If another client requests to write to the same file, it is not permitted to do so and must wait until the lease is released; write requests are queued and served one client at a time. Reads, however, can happen in parallel: any number of clients may read the same file concurrently, each from any replica of its blocks.
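To make the "read many" side concrete, here is a minimal sketch using the Hadoop FileSystem Java API (the file path is a placeholder, and the cluster configuration is assumed to come from core-site.xml on the classpath). Two threads open and read the same file at the same time, which HDFS permits even though only a single writer is allowed:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ParallelReadDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();      // picks up core-site.xml
        Path file = new Path("/tmp/demo.txt");         // hypothetical existing file

        Runnable reader = () -> {
            // Each thread gets its own FileSystem instance and input stream,
            // and may read from any replica of the file's blocks.
            try (FileSystem fs = FileSystem.newInstance(conf);
                 FSDataInputStream in = fs.open(file)) {
                byte[] buf = new byte[4096];
                int n = in.read(buf);
                System.out.println(Thread.currentThread().getName()
                        + " read " + n + " bytes");
            } catch (Exception e) {
                e.printStackTrace();
            }
        };

        // Concurrent readers are fine under write-once-read-many.
        Thread t1 = new Thread(reader, "reader-1");
        Thread t2 = new Thread(reader, "reader-2");
        t1.start(); t2.start();
        t1.join();  t2.join();
    }
}
```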
These links give good overviews of HDFS read and write operations; please refer to them:
https://data-flair.training/blogs/hadoop-hdfs-data-read-and-write-operations/
http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html#Replication+Pipelining
https://data-flair.training/blogs/hdfs-data-write-operation/
Created 03-27-2018 10:11 AM
Could you please share more information, such as the source of the data and the type of data?
Do you want to process the data before storing it in HDFS?
There are many options, such as MapReduce jobs, Spark, and others; see the sketch below.
-Shubham
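For illustration, here is a minimal sketch of the Spark option (Java API). The input and output paths and the application name are placeholders, not anything from this thread; it reads raw text, applies a trivial transformation, and stores the result in HDFS:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ProcessAndStore {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("process-and-store");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Read raw text from HDFS (path is hypothetical).
            JavaRDD<String> raw = sc.textFile("hdfs:///data/raw/input.txt");
            // Example processing step: drop empty lines.
            JavaRDD<String> clean = raw.filter(line -> !line.isEmpty());
            // Store the processed result back into HDFS.
            clean.saveAsTextFile("hdfs:///data/clean/output");
        }
    }
}
```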
Created 03-27-2018 09:18 PM
@TAMILMARAN When you say read and write in parallel, do you mean reading data that is still in the process of being written to HDFS?
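If that is the case, a writer can make in-progress data visible to readers by calling hflush() on the output stream. A minimal sketch (the file path is a placeholder) using the Hadoop FileSystem API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadWhileWriting {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/tmp/inprogress.txt"); // hypothetical path

        FSDataOutputStream out = fs.create(file, true);
        out.writeBytes("first batch\n");
        out.hflush(); // flushed bytes become visible to new readers,
                      // even though the file is still open for writing

        try (FSDataInputStream in = fs.open(file)) {
            byte[] buf = new byte[64];
            int n = in.read(buf);
            System.out.println("reader saw " + n + " bytes while the writer is open");
        }

        out.close(); // lease released; the file is finalized
    }
}
```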
