
How does HDFS react if a file is updated?



I have a file on HDFS and I have done some updation in it. I know that HDFS internally makes sure the replicated data is updated as well, but I want to know how HDFS does it (I mean the workflow).


For example:

I have a CSV file of 500 MB, the DFS block size is 128 MB, and the replication factor is 2. The file will be split into 4 blocks and replicated across the nodes. Now, if I update some values in the file, what does HDFS do to update all the respective replicated data?
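To make the numbers above concrete, here is a small sketch of the arithmetic. It assumes the usual layout in which a file is cut into full-size blocks plus one shorter final block, and that a replication factor of 2 means each block is stored on two DataNodes (not on every node). The helper name `block_layout` is made up for illustration.

```python
MB = 1024 * 1024

file_size = 500 * MB     # size of the CSV file
block_size = 128 * MB    # DFS block size
replication = 2          # replication factor


def block_layout(file_size, block_size):
    """Return the size of each block: full blocks plus a shorter last one."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


sizes = block_layout(file_size, block_size)
num_blocks = len(sizes)                     # 3 full blocks + 1 partial block
last_block_mb = sizes[-1] // MB             # size of the final, shorter block
total_replicas = num_blocks * replication   # block copies stored cluster-wide

print(num_blocks, last_block_mb, total_replicas)
```

So the file occupies 4 blocks (three of 128 MB and one of 116 MB), and the cluster holds 8 block copies in total.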


Please help. 


Re: How does HDFS react if a file is updated?

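HDFS is an append-only filesystem. You cannot alter an already-written file
in place; you can only append to it or replace it as a whole (i.e. delete or
truncate, and then rewrite the whole). Random writes at arbitrary offsets,
such as are possible in most Linux filesystems, are not possible in HDFS.
Replicas therefore never need to be patched in place: the rewritten file's
blocks are written and replicated like those of any new file.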

I'm not sure what you mean by 'updation' (no such noun BTW), but if you're
talking about Hue's Edit File feature, which lets you edit small files, it
basically replaces the whole file by swapping the original with a copy.
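
The "rewrite a copy, then swap" pattern can be sketched roughly as below. This
is only an illustration that uses the local filesystem as a stand-in for HDFS;
it is not Hue's actual code, and the names `rewrite_csv` and `transform` are
made up for the example. On a real cluster the copy would be written through
an HDFS client and then renamed over the original.

```python
import csv
import os
import tempfile


def rewrite_csv(path, transform):
    """Write a transformed copy next to `path`, then swap it over the original.

    The original file is never modified in place: every row is re-written
    into a new temporary file, which then replaces the old one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as dst, open(path, newline="") as src:
            writer = csv.writer(dst)
            for row in csv.reader(src):
                writer.writerow(transform(row))
        # Swap the finished copy over the original file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial copy and leave the original untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

For example, `rewrite_csv("data.csv", lambda row: [row[0], row[1].upper()])`
would upper-case the second column by rewriting the whole file, which is the
only way to "update" values when in-place writes are not available.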
Backline Customer Operations Engineer