Created 06-06-2017 10:12 PM
How do I search for a filename or directory name recursively in a given path in WebHDFS?
For example: If I search for "hive" from the root directory "/", I expect a response with directory names or file names matching "hive" under "/" such as
/hdp/apps/2.6.0.1-89/hive /hdp/apps/2.6.0.1-89/hive/hive.tar.gz
Created 06-07-2017 10:37 PM
I'm afraid this is currently not possible. The only WebHDFS REST api that implements "recursive" query parameter is delete. So if you use recursive with delete, that would work. The ListStatus API here has not implemented any recursive logic.
For your use-case, you'll manually need to call curl with op=LISTSTATUS multiple times for each directory after parsing the output. Similar to what "ls -R" Shell command does currently.
Hope this helps.
Created 06-07-2017 08:48 AM
Can you describe this question more clear?I can't understand this.
Created 06-07-2017 06:03 PM
I have updated my question with an example. Please let me know if you still can't understand my question.
Created 06-07-2017 09:02 AM
Did you try the "recursive=true" option as following:
Example:
curl -X GET "http://erie1.example.com:50070/webhdfs/v1/user/admin?op=LISTSTATUS&recursive=true"
It will search recursively the "/user/admin" and will list all the files/directories details. Then we can use "grep" to find the particular File or Directory from the output JSON response.
.
Created 06-07-2017 06:06 PM
Hi @Jay SenSharma,
I tried the "recursive=true" parameter, but it doesn't give any different response. It is identical with the response from API without "recursive=true" and my question was to find a file or directory within a given directory. I have updated my question to give you more clarity. Hope it helps. Thanks!
Created 06-07-2017 10:37 PM
I'm afraid this is currently not possible. The only WebHDFS REST api that implements "recursive" query parameter is delete. So if you use recursive with delete, that would work. The ListStatus API here has not implemented any recursive logic.
For your use-case, you'll manually need to call curl with op=LISTSTATUS multiple times for each directory after parsing the output. Similar to what "ls -R" Shell command does currently.
Hope this helps.