HDFS command find argument type

Contributor

Hello,

For an application, I need to find the maximum depth of an HDFS directory tree. I know how to do this on a local filesystem in the shell, where we can run:

find /tmp -type d -printf '%d\n' | sort -rn | head -1

So I wanted to do the same with the HDFS find command:

hdfs dfs -find /tmp -type d

But the -type argument does not exist in hdfs dfs -find. Here is the error:

find: Unexpected argument: -type

Does anyone have a solution or advice for this problem?

PS: my Hadoop version is Hadoop 2.6.0-cdh5.13.

 

regards,

thanks in advance

1 ACCEPTED SOLUTION

Master Collaborator

Hi @Ellyly ,

Here is an example.

(1). First, run -ls -R and grep "^d" to show only the subdirectories under your path:

# sudo -u hdfs hdfs dfs -ls -R /folder1/ | grep "^d"
drwxr-xr-x - hdfs supergroup 0 2021-09-15 14:48 /folder1/folder2
drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3
drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3/folder4
drwxr-xr-x - hdfs supergroup 0 2021-09-11 05:09 /folder1/subfolder1

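(2). Then use awk -F\/ '{print NF-1}' to calculate each directory's depth. With / as the field separator, NF-1 is the number of / characters in the line, which is the depth of the path counted from the root /, not from /folder1.

After -F come the two characters \ and / with no space in between; it is not the letter "V" 🙂. The backslash only escapes the / for the shell, so awk -F/ is equivalent.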

# sudo -u hdfs hdfs dfs -ls -R /folder1/ | grep "^d" | awk -F\/ '{print NF-1}'
2
3
4
2
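
To see why NF-1 gives the depth, here is a minimal sketch that prints the fields awk produces for a single listing line. The function name show_fields is made up for this illustration, and the sample line is copied from the listing in step (1); no HDFS access is needed to try it.

```shell
# show_fields LINE: split one `hdfs dfs -ls -R` line on "/" and print each field.
# (show_fields is a hypothetical helper, not part of Hadoop.)
show_fields() {
    printf '%s\n' "$1" | awk -F/ '{
        for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i   # one line per field
        print "depth=" (NF - 1)                                # number of "/" in the line
    }'
}

show_fields 'drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3'
```

Field 1 is everything before the first / (permissions, owner, date and so on), and every path component adds one more field, so NF-1 counts the path components.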

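(3). Finally, sort the depths in descending numeric order with sort -rn and keep the first line with head -1, which is the maximum: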

# sudo -u hdfs hdfs dfs -ls -R /folder1/ | grep "^d" | awk -F\/ '{print NF-1}' | sort -rn | head -1
4
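
Note that this pipeline counts depth from the root /, while the %d of find counts from the starting directory (the starting directory itself is depth 0). If you need the depth relative to the start path, a minimal sketch could look like the following. The function name max_rel_depth is hypothetical, and the sketch assumes that no column before the path (owner, group) contains a / character.

```shell
# max_rel_depth BASE: read `hdfs dfs -ls -R BASE` output on stdin and print the
# maximum directory depth relative to BASE (BASE itself is depth 0).
# max_rel_depth is a hypothetical helper name, not a Hadoop command.
max_rel_depth() {
    awk -v base="$1" '
        BEGIN {
            sub(/\/+$/, "", base)                 # "/folder1/" -> "/folder1", "/" -> ""
            base_slashes = gsub(/\//, "/", base)  # number of "/" in the base path
            max = 0
        }
        /^d/ {                                    # directory entries only
            path = substr($0, index($0, "/"))     # assumes the first "/" starts the path
            depth = gsub(/\//, "/", path) - base_slashes
            if (depth > max) max = depth
        }
        END { print max }
    '
}

# Sample listing copied from step (1), so no cluster is needed to try it.
# Prints 3: folder4 is three levels below /folder1.
max_rel_depth /folder1/ <<'EOF'
drwxr-xr-x - hdfs supergroup 0 2021-09-15 14:48 /folder1/folder2
drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3
drwxr-xr-x - hdfs supergroup 0 2021-09-15 15:01 /folder1/folder2/folder3/folder4
drwxr-xr-x - hdfs supergroup 0 2021-09-11 05:09 /folder1/subfolder1
EOF
```

On a cluster, the same function would be fed from the real listing: hdfs dfs -ls -R /folder1 | max_rel_depth /folder1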

 

Regards,

Will

If the answer helps, please accept as solution and click thumbs up.


2 REPLIES

Contributor

Thanks @willx, this solved my problem. It works perfectly!