01-11-2019 03:59 AM
Where does Kudu store its tables?
I created a test Kudu table:
CREATE TABLE impala_kudu.krg_test_table (
  id INT NOT NULL,
  col1 INT NULL,
  col2 STRING NULL,
  col3 FLOAT COMMENT 'only for test',
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 10
COMMENT 'new testing kudu table'
STORED AS KUDU
TBLPROPERTIES (
  'test_arg'='test_value1',
  'test_arg2'='test_value',
  'comment'='arg comment',
  'comment1'='arg comment',
  'kudu.master_addresses'='hozd8:7051',
  'kudu.num_tablet_replicas'='1'
);
Then I inserted some data into it:
insert into impala_kudu.krg_test_table values(1,14,"qql",3.7);
insert into impala_kudu.krg_test_table values(2,59,"adasdasd",6);
When I look at the table in the Hive metastore, it shows a table location.
But that file path is empty!
Where is the data? Where does Kudu store it?
PS: I need to administer partitions (add them, drop them, and so on), but I don't know where they are stored, so I can't do anything.
01-15-2019 10:52 PM
01-16-2019 08:46 AM
EricL is correct: you don't need to worry about files with Kudu in the same way that you do with typical Hive tables. Kudu stores its data directly on the local file system (ext4, for example) of each Tablet Server, in a distributed way, and does not use HDFS.
You can see where Kudu is storing its data on the local file system by going into Cloudera Manager and checking how the --fs_data_dirs and --fs_wal_dir configuration options are set on the various Tablet Server nodes.
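If you have the Tablet Server flags as text (for example copied out of a flag file), a small script can pull out the two directory settings mentioned above. This is only a rough sketch: the sample paths and the function name parse_kudu_storage_flags are made up for illustration, and it assumes the usual one "--flag=value" per line layout with the data directories given as a comma-separated list. As a convenience of this sketch only, dashes in flag names are treated like underscores.

```python
# Sketch: extract Kudu storage directories from flag-file style text.
# All paths in SAMPLE_FLAGFILE are hypothetical, not taken from this thread.

def parse_kudu_storage_flags(flagfile_text):
    """Return (wal_dir, data_dirs) parsed from '--flag=value' lines."""
    wal_dir = None
    data_dirs = []
    for raw_line in flagfile_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.lstrip("-").partition("=")
        # Skip flags that carry no '=value' part.
        if not sep:
            continue
        # Sketch-only convenience: accept dashes or underscores in names.
        name = name.replace("-", "_")
        if name == "fs_wal_dir":
            wal_dir = value
        elif name == "fs_data_dirs":
            data_dirs = [d.strip() for d in value.split(",") if d.strip()]
    return wal_dir, data_dirs


SAMPLE_FLAGFILE = """\
# hypothetical tablet server flags
--fs_wal_dir=/data/1/kudu/wal
--fs_data_dirs=/data/1/kudu/data,/data/2/kudu/data
--some_boolean_flag
"""

if __name__ == "__main__":
    wal, data = parse_kudu_storage_flags(SAMPLE_FLAGFILE)
    print("WAL directory:", wal)
    for d in data:
        print("Data directory:", d)
```

The directories it reports are where to look on each Tablet Server host. The files there are Kudu's internal storage, not something to edit by hand.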
Hope that helps,
01-16-2019 08:55 AM
Kudu runs as a separate service that Impala talks to (just as HDFS runs as a separate service from Impala), so you have to have Kudu running somewhere for it to work. However, you don't have to run Kudu on the same servers that you run Impala on; remote reads are supported over the network.