Kudu Backup Space Requirement

Hi,

I want to find out how much space is required to store a Kudu table backup created with the spark-submit command.

How is it related to the Kudu table's on-disk size, and how can I calculate the space required to store the backup of a Kudu table?

I am using the following command as a reference to run a Kudu backup for a table:

spark-submit --class org.apache.kudu.backup.KuduBackup [***FULL PATH TO kudu-backup2_2.11-1.12.0.jar***] --kuduMasterAddresses [***KUDU MASTER HOSTNAME 1***]:7051,[***KUDU MASTER HOSTNAME 2***]:7051 --rootPath file:/// [***DIRECTORY TO USE FOR BACKUP***] impala::[***DATABASE NAME***].foo


https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/kudu-recovery/topics/kudu-backing-up-tables.h...


To check the Kudu table's on-disk size, I used the kudu table statistics command:

kudu table statistics <master_addresses> <table_name>


https://kudu.apache.org/docs/command_line_tools_reference.html#table-statistics


I have a table with the following statistics:


TABLE sampleTable1
on disk size: 31667556474
live row count: 2000000000
on disk size limit: N/A
live row count limit: N/A


How much space will be required in rootPath to store the backup of this table?
Or how can we calculate the required space for storing backup?


Thank you

1 REPLY

Hi Team, if you are storing the backup in HDFS or at a file:/// path like the one above (--rootPath file:/// [***DIRECTORY TO USE FOR BACKUP***]), use the command below to check how much space it occupies:

hdfs dfs -du -s -h file:/// [***DIRECTORY TO USE FOR BACKUP***] 

If it is a different storage system such as Ozone (ofs/o3), the hdfs command will also work; if it is S3, use the aws command instead.
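
To turn that measurement into an estimate for a larger table, one possible approach is to back up a small, representative table first, measure the backup, and compare it with the "on disk size" reported by kudu table statistics. The sketch below assumes that approach. backup_ratio is a hypothetical helper (not part of Kudu or Hadoop), BACKUP_DIR is a placeholder for your backup directory, and the resulting ratio is only an empirical figure for your own data, since it can vary with schema, encoding, and compression.

```shell
# Hypothetical helper: ratio of backup size to Kudu on-disk size.
# $1 = backup size in bytes, $2 = Kudu "on disk size" in bytes.
backup_ratio() {
  awk -v b="$1" -v k="$2" 'BEGIN { printf "%.2f\n", b / k }'
}

# "on disk size" taken from the table statistics in the question.
kudu_bytes=31667556474

# Example usage (commented out because it needs a real backup directory):
# without -h, the first field of `hdfs dfs -du -s` is the size in bytes.
# backup_bytes=$(hdfs dfs -du -s "$BACKUP_DIR" | awk '{print $1}')
# backup_ratio "$backup_bytes" "$kudu_bytes"
```

Multiplying the on-disk size of another table by the measured ratio gives a rough figure for the space its backup may need in rootPath; leaving some headroom on top of that is prudent.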