Created 04-16-2018 06:53 AM
Hi Gurus,
We have an S3 location with multiple directories and files. We would like to get each filename along with its record count. So far we have been able to get the filename and file size using the command below:
hdfs dfs -ls -R /bucket_name/* | awk '{system("hdfs dfs -count " $8) }' | awk '{print $4,$3;}'
Output:
/bucket_name/Directory/File_name.txt 44998 --filesize
Thanks in advance!
Regards,
Surendran
Created 04-16-2018 07:09 AM
Created 04-16-2018 07:35 AM
Created 04-16-2018 07:47 AM
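One possible approach, offered as a rough sketch rather than a confirmed answer from this thread: `hdfs dfs -count` reports directory count, file count, content size and path, so it cannot give a record count. If a record is assumed to be one newline-terminated line, each file can be streamed through `wc -l` instead. The function names `list_files` and `count_records` below are hypothetical helpers, and the path field is assumed to be the 8th column of `hdfs dfs -ls -R` output, as in the command from the question (so paths containing spaces would not work).

```shell
#!/usr/bin/env bash
# Sketch: print "<path> <record_count>" for every file under a root path.
# Assumption: one record per newline-terminated line.

# list_files ROOT: print the path of every regular file under ROOT.
# Directory lines in `hdfs dfs -ls -R` output start with "d"; the path is
# taken from the 8th field, as in the original command.
list_files() {
  hdfs dfs -ls -R "$1" | awk '$1 !~ /^d/ && NF >= 8 {print $8}'
}

# count_records ROOT: stream each file through wc -l and print path and count.
count_records() {
  list_files "$1" | while IFS= read -r path; do
    # </dev/null keeps the inner command from consuming the list of paths.
    n=$(hdfs dfs -cat "$path" </dev/null | wc -l)
    # $((n)) strips any padding some wc implementations add.
    echo "$path $((n))"
  done
}

# Usage (path taken from the question):
# count_records /bucket_name
```

Note that this reads every file in full, so it can be slow on large buckets, and compressed files would need to be decompressed in the pipeline before counting.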