Can I delete multiple files from hdfs?
Labels: Apache Hadoop
Created 03-22-2019 01:56 AM
Our HDFS (DFS) directory is filling up, and we need to remove data faster than we currently can.
The cluster has 12 DataNodes, 4 master nodes, and 1 edge node. Can I delete files from HDFS from the master nodes and the edge node at the same time? I have written a script on the edge node that deletes the HDFS files, but it is really slow.
How can I delete multiple files at a time? Can I place that script on multiple servers and delete the files from each of them?
Created 03-23-2019 12:55 AM
Hello, Madhura Mhatre!
You can use tHDFSList to iterate over each file that you want to delete.
For example:
tHDFSList--iterate--tHDFSDelete
tHDFSList exposes a global variable that holds the path of the file currently being iterated:
(String)globalMap.get("tHDFSList_1_CURRENT_FILEPATH")
Set the file path of tHDFSDelete to this variable.
Also, please check this article on StackOverflow and this video.
I hope it helps!
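For readers who are not using Talend (tHDFSList and tHDFSDelete are Talend components), the same idea can be sketched with the standard `hdfs dfs -rm` command, which accepts several paths in a single call. The snippet below is only an illustration, not the method from the answer above: the helper name `build_rm_command` and the example paths are hypothetical. It only builds the argument list; passing that list to `subprocess.run` on a node with the Hadoop client installed would execute it.

```python
from typing import List, Sequence


def build_rm_command(paths: Sequence[str], skip_trash: bool = False) -> List[str]:
    """Build a single `hdfs dfs -rm` argument list covering every path given.

    `build_rm_command` is a hypothetical helper for this sketch; only the
    `hdfs dfs -rm` command and its `-skipTrash` option come from Hadoop.
    """
    if not paths:
        raise ValueError("at least one path is required")
    cmd = ["hdfs", "dfs", "-rm"]
    if skip_trash:
        # With trash enabled, removed files are first moved to the trash
        # directory; -skipTrash deletes them immediately instead.
        cmd.append("-skipTrash")
    cmd.extend(paths)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = build_rm_command(["/tmp/example/a.log", "/tmp/example/b.log"])
    print(" ".join(example))
```

If trash is enabled on the cluster, space is typically not reclaimed until the trash is emptied, so `skip_trash=True` may matter when the goal is to free space quickly. It also makes the delete unrecoverable, so use it with care.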
Created 02-12-2020 01:18 PM
Hi,
I have a similar scenario to yours. Did you run the script on all nodes to delete files in parallel, with different parameters on each?
I have created a script that first collects all the file names into an array and then loops through it, deleting the files one by one. Is there any way to delete them all at once?
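One possible way to speed up a collect-then-loop script like the one described here is to pass many paths to each `hdfs dfs -rm` call and run a few calls concurrently, since every separate `hdfs dfs` invocation starts its own JVM. The following is a minimal sketch under that assumption, not a tested solution from this thread: `chunked`, `run_hdfs_rm`, `delete_in_batches`, the batch size, and the worker count are all hypothetical names and values chosen for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_hdfs_rm(batch: List[str]) -> int:
    """Delete one batch with a single `hdfs dfs -rm` call; return its exit code."""
    return subprocess.run(["hdfs", "dfs", "-rm", *batch]).returncode


def delete_in_batches(
    paths: Sequence[str],
    batch_size: int = 500,   # assumed value; keep the command line within OS limits
    workers: int = 4,        # assumed value; tune to what the cluster tolerates
    runner: Callable[[List[str]], int] = run_hdfs_rm,
) -> List[int]:
    """Run `runner` over batches of `paths` in a thread pool.

    Returns one exit code per batch, in batch order.
    """
    batches = list(chunked(paths, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(runner, batches))


if __name__ == "__main__":
    # Dry run with a stand-in runner, so nothing is deleted.
    demo_paths = [f"/tmp/example/file_{i}" for i in range(5)]
    codes = delete_in_batches(demo_paths, batch_size=2, workers=2,
                              runner=lambda batch: 0)
    print(codes)
```

The `runner` parameter exists so the batching logic can be tried without touching HDFS; replacing the stand-in with the default `run_hdfs_rm` performs real deletes. Whether running the script from several servers helps further depends on where the bottleneck is, which this sketch does not measure.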
Created 02-13-2020 03:37 PM
Hi @ppatel,
As this thread is older and was marked 'Solved' in March of 2019, you would have a better chance of receiving a resolution by starting a new thread. A new thread would also give you the opportunity to share details about your script, which could help others provide a more relevant answer to your question.
