Support Questions

Find answers, ask questions, and share your expertise

HDFS command find argument type

Contributor

Hello,

For an application, I need to determine the maximum depth of an HDFS directory tree. I know how to do this in a local shell; we can run:
 find /tmp -type d -printf '%d\n' | sort -rn | head -1
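For comparison, the same number can be computed without GNU find's -printf by counting the "/" separators in each path. A local sketch using a throwaway directory tree (the a/b/c names are just for illustration):

```shell
# Build a small throwaway tree to demonstrate
base=$(mktemp -d)
mkdir -p "$base/a/b/c"

# Absolute depth = number of "/" separators in each path
base_depth=$(printf '%s\n' "$base" | awk -F/ '{print NF-1}')
max_depth=$(find "$base" -type d | awk -F/ '{print NF-1}' | sort -rn | head -1)

# Depth of the deepest directory relative to $base
depth=$((max_depth - base_depth))
echo "$depth"   # prints 3

rm -rf "$base"
```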
So I wanted to do the same with the find command of HDFS:
hdfs dfs -find /tmp -type d
but the -type argument does not exist in hdfs dfs -find (in Hadoop 2.x, find only supports the -name/-iname expressions and the -print action); here is the error:
find: Unexpected argument: -type
Does anyone have a solution or advice for this problem?

PS: my Hadoop version is Hadoop 2.6.0-cdh5.13.

Regards,

Thanks in advance.

1 ACCEPTED SOLUTION

Master Collaborator

Hi @Ellyly ,

Here is an example.

(1). First, use ls -R and grep "^d" to list all the subdirectories under your path:

# sudo -u hdfs hdfs dfs -ls -R /folder1/ | grep "^d"
drwxr-xr-x - hdfs supergroup 0 2021-09-15 14:48 /folder1/folder2
drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3
drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3/folder4
drwxr-xr-x - hdfs supergroup 0 2021-09-11 05:09 /folder1/subfolder1
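The grep "^d" filter works because, as in ls -l, directory entries in the listing start with a d permission flag. A local sketch with sample listing lines (illustrative text, not real cluster output):

```shell
# Two sample listing lines: one directory, one plain file
listing='drwxr-xr-x   - hdfs supergroup          0 2021-09-15 14:48 /folder1/folder2
-rw-r--r--   3 hdfs supergroup       1024 2021-09-15 14:50 /folder1/file.txt'

# Keep only lines whose permission string starts with "d" (directories)
printf '%s\n' "$listing" | grep "^d"
```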

(2). Then, use awk -F\/ '{print NF-1}' to calculate each directory's depth; in effect, we count the number of fields separated by /.

Note: after -F there is a backslash followed by a forward slash, with no space in between; it is not the letter "V". 🙂

# sudo -u hdfs hdfs dfs -ls -R /folder1/ | grep "^d" | awk -F\/ '{print NF-1}'
2
3
4
2

(3). Finally, sort numerically and take the largest value with head:

# sudo -u hdfs hdfs dfs -ls -R /folder1/ | grep "^d" | awk -F\/ '{print NF-1}' | sort -rn | head -1
4
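The whole pipeline can be sanity-checked without a cluster by feeding it the path column of a sample listing (the folder names below mirror the example above):

```shell
# Sample directory paths, as they appear in the last column of the listing
paths='/folder1/folder2
/folder1/folder2/folder3
/folder1/folder2/folder3/folder4
/folder1/subfolder1'

# Count "/" separators per path, then keep the largest count
max_depth=$(printf '%s\n' "$paths" | awk -F/ '{print NF-1}' | sort -rn | head -1)
echo "$max_depth"   # prints 4
```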

 

Regards,

Will

If the answer helps, please accept it as the solution and click thumbs up.


2 REPLIES 2


Contributor

Thanks @willx, this solved my problem, it works perfectly!