New Contributor
Posts: 3
Registered: ‎04-16-2018

Recursively find record count for files in S3

Hi Gurus,

We have multiple directories and files in an S3 bucket. We would like to list the files and their corresponding record counts.

I can get the file name and size with the command below:

hdfs dfs -ls -R /bucket_name/Directory/* | awk '{system("hdfs dfs -count " $8) }' | awk '{print $4,$3;}'

The output format is as follows, where the second column is the file size in bytes:

/bucket_name/Directory/File_name.txt 44998

Is there a way we can get the filename and record count in a similar format?

Thanks in advance!

Regards,

Surendran

Posts: 1,639
Kudos: 314
Solutions: 254
Registered: ‎07-31-2013

Re: Recursively find record count for files in S3
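
One possible approach, as a minimal sketch rather than a definitive answer: list the files recursively, then stream each one through a line counter. It assumes one record per line, uncompressed text files, and paths without whitespace. The function name count_records is made up for illustration. Every file is read in full, so this can be slow and costly on large buckets.

```shell
# Hypothetical helper: print "<path> <record count>" for each file under $1.
# Assumes one record per line and paths that contain no whitespace.
count_records() {
    hdfs dfs -ls -R "$1" |
        # Keep regular files only (permission string starts with "-");
        # the path is the 8th column of the listing.
        awk '$1 ~ /^-/ && NF >= 8 { print $8 }' |
        while read -r path; do
            # Stream the file and count lines; nothing is stored locally.
            # stdin is redirected so the command cannot consume the file list.
            records=$(hdfs dfs -cat "$path" < /dev/null | awk 'END { print NR }')
            echo "$path $records"
        done
}
```

Calling count_records /bucket_name/Directory would then print lines in the same "path value" shape as the size output in the question, with the line count in place of the size.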
