Member since
09-11-2017
8
Posts
4
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
20034 | 09-27-2017 03:59 PM |
09-27-2017
04:01 PM
Please find the reply above for the question “what should I do in my custom hdfs file name format so that it can also work…”
09-27-2017
03:59 PM
Hi @parag dharmadhikari
Answer to your question “what should I do in my custom hdfs file name format so that it can also work…”
DefaultFileNameFormat#getName() uses the supplied 'timestamp' to build the file name, so each call with a different timestamp returns a new file name, and different threads end up with different names. Your CustomFileNameFormat, on the other hand, does not use the 'timestamp' (or anything else thread-specific) to make the name unique. When the same CustomFileNameFormat is used from multiple threads, it returns the same file name to each of them, which results in the collision above. This is probably why your topology succeeds with DefaultFileNameFormat. Please include something unique per thread in the name that CustomFileNameFormat returns.
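A minimal sketch of that idea, assuming nothing about your actual class: the `UniqueFileNameSketch` class, its fields, and the name layout are all hypothetical, and it deliberately does not import storm-hdfs so that it runs on its own. In a real topology the same logic would sit in your `FileNameFormat` implementation's `getName(long rotation, long timeStamp)`, with the task id presumably taken from the `TopologyContext` handed to `prepare()`.

```java
// Hypothetical stand-in for a custom file name format. It only illustrates
// how to build a name that differs per task and per call.
public class UniqueFileNameSketch {
    private final String prefix;     // e.g. "events"
    private final String extension;  // e.g. ".txt"
    private final int taskId;        // unique per bolt task (assumed to be supplied by the caller)

    public UniqueFileNameSketch(String prefix, String extension, int taskId) {
        this.prefix = prefix;
        this.extension = extension;
        this.taskId = taskId;
    }

    // Combines task id, rotation counter and timestamp, so two tasks never
    // produce the same name even when they are called at the same instant.
    public String getName(long rotation, long timeStamp) {
        return prefix + "-" + taskId + "-" + rotation + "-" + timeStamp + extension;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        UniqueFileNameSketch task1 = new UniqueFileNameSketch("events", ".txt", 1);
        UniqueFileNameSketch task2 = new UniqueFileNameSketch("events", ".txt", 2);
        // Same rotation and timestamp, different task ids: the names differ.
        System.out.println(task1.getName(0, now));
        System.out.println(task2.getName(0, now));
    }
}
```

Using the task id rather than only the timestamp matters because two threads can easily call `getName()` within the same millisecond.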
09-27-2017
03:56 PM
Hi @parag dharmadhikari
Answer to your question “how could I set overwrite flag.”
Setting overwrite=false requires a change to HdfsBolt itself. HdfsBolt creates the file with the simple API FileSystem#create(Path), which overwrites by default (overwrite=true). HdfsBolt would have to be changed to call FileSystem#create(Path, boolean overwrite) with overwrite=false instead.
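To show what a no-overwrite create buys you, here is a small local-filesystem analogy. Note that this is not the HdfsBolt change and does not use the Hadoop API: it uses `java.nio`'s `CREATE_NEW` option as a stand-in for `overwrite=false`, and the class and method names are made up for illustration. The point is only that the second writer gets an explicit failure instead of silently replacing the first writer's file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Local-filesystem analogy for creating a file with overwrite=false.
public class NoOverwriteSketch {

    // Returns true if this call created the file, false if it already existed.
    public static boolean createWithoutOverwrite(Path file) {
        try {
            // CREATE_NEW fails if the file exists, instead of truncating it.
            Files.write(file, new byte[0],
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.io.tmpdir"),
                "no-overwrite-sketch-" + System.nanoTime() + ".txt");
        System.out.println(createWithoutOverwrite(file)); // first writer: true
        System.out.println(createWithoutOverwrite(file)); // second writer: false
    }
}
```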
09-25-2017
06:44 AM
1 Kudo
Hi @oula.alshiekh@gmail.com alshiekh,
From the stack traces above, it looks like the socket timeouts are set to a very low value, 300 ms. Hadoop's defaults are ReadTimeout=60000 ms and WriteTimeout=8*60000 ms (480000 ms).
Please check the following properties in the DataNode configuration:
"dfs.client.socket-timeout"
"dfs.datanode.socket.write.timeout"
If they are set to 300 ms, please increase them and restart the DataNodes.
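For reference, this is roughly what the two properties look like in `hdfs-site.xml` when set back to the default values quoted above. This assumes the cluster is configured through `hdfs-site.xml` directly; if you manage it through a UI such as Ambari, set the same property names there instead. Both values are in milliseconds.

```xml
<!-- hdfs-site.xml fragment: socket timeouts restored to the Hadoop defaults -->
<property>
  <name>dfs.client.socket-timeout</name>
  <value>60000</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>480000</value>
</property>
```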
09-25-2017
06:41 AM
1 Kudo
Hi @oula.alshiekh@gmail.com alshiekh,
The error does not look related to the number of threads running in the DataNode. It looks like a connection problem. It would be really helpful if you could provide a more detailed stack trace.
My GUESS is that the next DataNode in the pipeline (as given by the NameNode) is down, so the first DataNode cannot connect to it and throws the exception mentioned above. Since you have 6 DataNodes, writes can still succeed using the remaining nodes in the cluster.
09-24-2017
02:53 PM
Hi @parag dharmadhikari
This could be caused by multiple threads trying to write the same file when the overwrite flag is set to true at file creation. If the file created by the first thread is overwritten by the second thread, the first thread will hit the exception above.
Solution 1: If multiple threads are involved in your case, setting the ‘overwrite’ flag to false will resolve the issue.
Solution 2: If your case does not involve creating files from multiple threads, please check whether some other client is deleting the file or its parent directory.