HDFS follows a write-once, read-many model: once a file is written and closed, it is not modified in place (appending is supported), and it can then be read many times by many clients. HDFS also allows only one writer per file at a time, so multiple clients cannot write to the same HDFS file concurrently. When the NameNode grants a client permission to write a file, that client holds a lease on the file until the write completes and the file is closed. The lock applies to the file and is tracked by the NameNode, not to an individual block on a DataNode. If another client tries to write to the same file while the lease is held, it is not permitted to do so. As far as I know the request is rejected with an error rather than placed in a queue, so the second client has to retry after the first writer closes the file or its lease expires.
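To make the single-writer rule concrete, here is a minimal toy model in Python. It is not the HDFS API: `ToyNameNode`, `LeaseHeldError`, the client names and the file path are all hypothetical names made up for illustration. It also assumes that a second writer is rejected and has to retry, which is my understanding of how HDFS behaves.

```python
class LeaseHeldError(Exception):
    """Raised when a client asks to write a file that another client is writing."""


class ToyNameNode:
    """Toy model of one-writer-per-file leases (hypothetical, not real HDFS code)."""

    def __init__(self):
        # Maps a file path to the client that currently holds its write lease.
        self._leases = {}

    def open_for_write(self, path, client):
        holder = self._leases.get(path)
        if holder is not None and holder != client:
            # Assumption: the second writer is rejected, not queued.
            raise LeaseHeldError(f"{path} is already being written by {holder}")
        self._leases[path] = client

    def close(self, path, client):
        if self._leases.get(path) != client:
            raise LeaseHeldError(f"{client} does not hold the lease on {path}")
        # Closing the file releases the lease so another client can write.
        del self._leases[path]

    def holder(self, path):
        return self._leases.get(path)


namenode = ToyNameNode()
namenode.open_for_write("/data/events.log", "client-A")

try:
    namenode.open_for_write("/data/events.log", "client-B")
except LeaseHeldError as err:
    print("rejected:", err)

namenode.close("/data/events.log", "client-A")
namenode.open_for_write("/data/events.log", "client-B")  # allowed after A closes
```

The point of the sketch is only that the write permission is per file and is held by one client until that client closes the file. Real HDFS also expires leases of clients that stop responding, which this model leaves out.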
These are very good links on HDFS read and write operations; could you please refer to them?
Could you please share more information, such as the source of the data and the type of data?
Do you want to process the data and then store it in HDFS?
There are a lot of options, such as a MapReduce job, Spark, and others.