Is there any way we can list only directory and subdirectory names in HDFS (excluding other details)?



As you can see in the ls documentation page, the command returns the following information for a file:

permissions number_of_replicas userid groupid filesize modification_date modification_time filename

and the following information for a directory:

permissions userid groupid modification_date modification_time dirname

There's no option in HDFS to limit the output to file or directory names only. However, you can use sed and cut to manipulate the output and keep only the names (example taken from here):

hadoop fs -ls /tmp | sed '1d;s/  */ /g'| cut -d\  -f8
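To see what each stage of that pipeline does, here is a minimal sketch that runs the same sed and cut stages over canned text instead of a live cluster. The function name names_only and the sample listing are made up for illustration; real listings will differ, but they should have the same column layout (directories show "-" in the replica column).

```shell
# Hypothetical helper: the same sed/cut stages as the pipeline above,
# reading listing text on stdin.
names_only() {
  # 1d         drop the "Found N items" header line
  # s/  */ /g  collapse each run of spaces into a single space
  # cut        keep the 8th space-separated field (the path)
  sed '1d;s/  */ /g' | cut -d' ' -f8
}

# Made-up sample in the shape of `hadoop fs -ls` output
sample='Found 2 items
drwxr-xr-x   - hdfs supergroup          0 2016-03-01 10:15 /tmp/logs
-rw-r--r--   3 hdfs supergroup       1024 2016-03-01 10:20 /tmp/data.csv'

printf '%s\n' "$sample" | names_only
```

Note that cut keeps only field 8, so a path containing a space would be cut off at the first space.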


Thank you for the reply. This command works excellently, and I truly appreciate your response. Is there any way I can list only directories? This command returns files along with the directories.


I am trying to use the command below:

hadoop fs -ls /tmp | sed '1d;s/  */ /g' | grep drwx | cut -d' ' -f8

I added grep drwx to your suggestion. It seems to be working fine, but I am not sure whether it covers all cases...

Thank you very much.


My pleasure. This won't work when the directory permissions are something other than drwx (for example dr-x). To get only directories with your approach, you need to grep for all lines starting with 'd'. You can do it like this:

hadoop fs -ls /tmp | sed '1d;s/  */ /g'| grep '^d' | cut -d\  -f8
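
As an alternative sketch, the filtering and the field selection can be done in a single awk step. This swaps awk in for the sed/grep/cut chain and is not what the answer above used. The function name dirs_only and the sample listing are hypothetical, chosen to include a directory whose permissions do not start with drwx.

```shell
# Hypothetical helper: print the path of every directory line found in
# `hadoop fs -ls` style text on stdin.
dirs_only() {
  # $1 is the permissions column; a leading "d" marks a directory.
  # The "Found N items" header starts with "F", so it is skipped
  # without a separate sed step. awk splits on runs of whitespace,
  # so no space-collapsing is needed either.
  awk '$1 ~ /^d/ { print $8 }'
}

# Made-up sample in the shape of `hadoop fs -ls` output
sample='Found 3 items
drwxr-xr-x   - hdfs supergroup          0 2016-03-01 10:15 /tmp/logs
dr-x------   - hive hadoop              0 2016-03-02 09:00 /tmp/private
-rw-r--r--   3 hdfs supergroup       1024 2016-03-01 10:20 /tmp/data.csv'

printf '%s\n' "$sample" | dirs_only
```

On a cluster this would be used as hadoop fs -ls /tmp | dirs_only, or with hadoop fs -ls -R /tmp to include subdirectories. As with cut, a path containing spaces would be truncated at the first space.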