WebHDFS REST API - Search for a filename or directory

Contributor

How do I search for a filename or directory name recursively in a given path in WebHDFS?

For example, if I search for "hive" from the root directory "/", I expect a response listing the directory and file names under "/" that match "hive", such as:

/hdp/apps/2.6.0.1-89/hive
/hdp/apps/2.6.0.1-89/hive/hive.tar.gz
1 ACCEPTED SOLUTION

Guru

@vsubramanian,

I'm afraid this is currently not possible. The only WebHDFS REST API operation that implements the "recursive" query parameter is DELETE, so "recursive" works there. The LISTSTATUS operation has not implemented any recursive logic.

For your use case, you'll need to call curl with op=LISTSTATUS once per directory, parsing each response to find the subdirectories to list next, similar to what the "ls -R" shell command does.
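
As a rough illustration of that manual recursion, here is a minimal Python sketch using only the standard library. It assumes an unsecured cluster (no Kerberos, no user.name parameter), that the root path is a directory, and that the response follows the documented FileStatuses/FileStatus JSON layout with "pathSuffix" and "type" fields. The function names list_status and find_paths are hypothetical, made up for this example.

```python
import json
import urllib.parse
import urllib.request


def list_status(base_url, path):
    """Return the FileStatus entries for one directory via op=LISTSTATUS."""
    url = f"{base_url}/webhdfs/v1{urllib.parse.quote(path)}?op=LISTSTATUS"
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    return body["FileStatuses"]["FileStatus"]


def find_paths(base_url, root, needle, lister=list_status):
    """Issue one LISTSTATUS call per directory under `root` and
    return the sorted paths whose last component contains `needle`."""
    matches = []
    pending = [root]  # directories still to be listed
    while pending:
        current = pending.pop()
        for entry in lister(base_url, current):
            name = entry["pathSuffix"]
            if not name:
                # An empty pathSuffix means `current` was a file, not a directory.
                continue
            full = current.rstrip("/") + "/" + name
            if needle in name:
                matches.append(full)
            if entry["type"] == "DIRECTORY":
                pending.append(full)
    return sorted(matches)
```

A call such as find_paths("http://erie1.example.com:50070", "/", "hive") would then make one HTTP request per directory. On a large namespace that can be a lot of requests, so starting from the narrowest root you can is probably wise.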

Hope this helps.


5 REPLIES

Contributor

Can you describe this question more clearly? I can't understand it.

Contributor

I have updated my question with an example. Please let me know if you still can't understand my question.

Master Mentor

@vsubramanian

Did you try the "recursive=true" option, as follows?

Example:

curl -X GET "http://erie1.example.com:50070/webhdfs/v1/user/admin?op=LISTSTATUS&recursive=true" 

It will recursively search "/user/admin" and list the details of all files and directories. We can then use "grep" to find the particular file or directory in the JSON response.


Contributor

Hi @Jay SenSharma,

I tried the "recursive=true" parameter, but it doesn't change the response: it is identical to the response from the API without "recursive=true". My question was how to find a file or directory within a given directory. I have updated my question to make it clearer. Hope it helps. Thanks!
