Hi,
I want to work out how much space is needed to store a Kudu table backup taken with the spark-submit command.
How is the backup size related to the Kudu table's on-disk size, and how can I calculate the space required to store the backup of a Kudu table?
I am using the following command as a reference to run a Kudu backup for a table -
spark-submit --class org.apache.kudu.backup.KuduBackup [***FULL PATH TO kudu-backup2_2.11-1.12.0.jar***] --kuduMasterAddresses [***KUDU MASTER HOSTNAME 1***]:7051,[***KUDU MASTER HOSTNAME 2***]:7051 --rootPath file:///[***DIRECTORY TO USE FOR BACKUP***] impala::[***DATABASE NAME***].foo
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/kudu-recovery/topics/kudu-backing-up-tables.h...
To check the Kudu table's on-disk size, I used the kudu table statistics command -
kudu table statistics <master_addresses> <table_name>
https://kudu.apache.org/docs/command_line_tools_reference.html#table-statistics
I have a table with the following statistics -
TABLE sampleTable1
on disk size: 31667556474
live row count: 2000000000
on disk size limit: N/A
live row count limit: N/A
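To put those numbers in more readable units, here is a small Python sketch of the arithmetic. The variable names are my own, and it only restates the statistics above (size in GiB and average on-disk bytes per row); it does not estimate the backup size, which is what I am asking about.

```python
# Values copied from the `kudu table statistics` output above.
on_disk_size_bytes = 31667556474
live_row_count = 2000000000

# Convert bytes to GiB (1 GiB = 2**30 bytes).
on_disk_size_gib = on_disk_size_bytes / 2**30

# Average on-disk footprint per live row inside Kudu.
bytes_per_row = on_disk_size_bytes / live_row_count

print(f"on disk size: {on_disk_size_gib:.2f} GiB")
print(f"average on-disk bytes per row: {bytes_per_row:.2f}")
```

So the table occupies roughly 29.5 GiB in Kudu, or about 15.8 bytes per row on average.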
How much space will be required under rootPath to store the backup of this table?
Or, more generally, how can we calculate the space required to store a backup?
Thank you