Support Questions

Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Data Analytics (CDA) group hub.

Webhdfs REST API - Search for a filename or directory

How do I search for a filename or directory name recursively in a given path in WebHDFS?

For example: If I search for "hive" from the root directory "/", I expect a response with directory names or file names matching "hive" under "/" such as

/hdp/apps/2.6.0.1-89/hive
/hdp/apps/2.6.0.1-89/hive/hive.tar.gz
1 ACCEPTED SOLUTION

Guru

@vsubramanian,

I'm afraid this is currently not possible. The only WebHDFS REST api that implements "recursive" query parameter is delete. So if you use recursive with delete, that would work. The ListStatus API here has not implemented any recursive logic.

For your use-case, you'll manually need to call curl with op=LISTSTATUS multiple times for each directory after parsing the output. Similar to what "ls -R" Shell command does currently.

Hope this helps.

View solution in original post

5 REPLIES 5

Contributor

Can you describe this question more clear?I can't understand this.

I have updated my question with an example. Please let me know if you still can't understand my question.

Super Mentor

@vsubramanian

Did you try the "recursive=true" option as following:

Example:

curl -X GET "http://erie1.example.com:50070/webhdfs/v1/user/admin?op=LISTSTATUS&recursive=true" 

It will search recursively the "/user/admin" and will list all the files/directories details. Then we can use "grep" to find the particular File or Directory from the output JSON response.

.

Hi @Jay SenSharma,

I tried the "recursive=true" parameter, but it doesn't give any different response. It is identical with the response from API without "recursive=true" and my question was to find a file or directory within a given directory. I have updated my question to give you more clarity. Hope it helps. Thanks!

Guru

@vsubramanian,

I'm afraid this is currently not possible. The only WebHDFS REST api that implements "recursive" query parameter is delete. So if you use recursive with delete, that would work. The ListStatus API here has not implemented any recursive logic.

For your use-case, you'll manually need to call curl with op=LISTSTATUS multiple times for each directory after parsing the output. Similar to what "ls -R" Shell command does currently.

Hope this helps.

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.