Where does Kudu store its tables?


Hi.

Where does Kudu store its tables?

I created a test Kudu table:

CREATE TABLE impala_kudu.krg_test_table (
    id int NOT NULL,
    col1 int NULL,
    col2 STRING NULL,
    col3 float COMMENT 'only for test',
    PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 10
COMMENT 'new testing kudu table'
STORED AS KUDU
TBLPROPERTIES ('test_arg'='test_value1','test_arg2'='test_value','comment'='arg comment','comment1'='arg comment','kudu.master_addresses'='hozd8:7051', 'kudu.num_tablet_replicas' = '1');

 

Then I inserted some data into it:

insert into impala_kudu.krg_test_table
    values(1,14,"qql",3.7);
insert into impala_kudu.krg_test_table
    values(2,59,"adasdasd",6);

 

Now, when I look at the table location in the Hive metastore, it shows the following:

hdfs://hozd8:8020/user/hive/warehouse/impala_kudu.db/krg_test_table

But this path is empty!

Where is the data? Where does Kudu store it?

PS: I need to administer partitions (add them, drop them, and so on), but I don't know where they are stored, so I can't do anything.

4 REPLIES

Super Guru
Kudu is different from HDFS. The data you inserted into the Impala Kudu table is stored on the Kudu side, not in HDFS. You can inspect Kudu tables through the Kudu web UI or with the Kudu command-line tools:
https://kudu.apache.org/docs/command_line_tools_reference.html

The Kudu user guide can be found here:
https://www.cloudera.com/documentation/enterprise/latest/PDF/cloudera-kudu.pdf

I am not a Kudu guy, so I can't share more info, but hopefully you can get what you are after from the links.

As for the operations you mentioned, such as adding or dropping partitions, you should be able to do them via Impala; that's the purpose of the Impala and Kudu integration.
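
As a rough sketch of what partition management through Impala can look like: hash partitions, like the ones in the table from the question, are fixed when the table is created, so adding and dropping applies to range partitions only. The table name krg_range_test and the range bounds below are hypothetical and chosen only for illustration; the master address is the one from the question.

```sql
-- Hypothetical range-partitioned Kudu table (name and bounds are assumptions).
CREATE TABLE impala_kudu.krg_range_test (
    id INT NOT NULL,
    col1 INT NULL,
    PRIMARY KEY (id)
)
PARTITION BY RANGE (id) (
    PARTITION 0 <= VALUES < 100,
    PARTITION 100 <= VALUES < 200
)
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses'='hozd8:7051', 'kudu.num_tablet_replicas'='1');

-- List the current range partitions.
SHOW RANGE PARTITIONS impala_kudu.krg_range_test;

-- Add a new range, then drop an existing one.
-- Dropping a range partition also deletes the rows stored in it.
ALTER TABLE impala_kudu.krg_range_test ADD RANGE PARTITION 200 <= VALUES < 300;
ALTER TABLE impala_kudu.krg_range_test DROP RANGE PARTITION 0 <= VALUES < 100;
```

None of these statements touch files directly; Impala forwards the change to Kudu, which creates or removes the corresponding tablets.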

Cheers

Super Collaborator

EricL is correct: you don't need to worry about files with Kudu in the same way that you do with typical Hive tables. Kudu stores its data directly on the local file system of each node (for example ext4 or XFS) in a distributed way and does not use HDFS.

You can see where Kudu is storing its data on the local file system if you go into Cloudera Manager and look at how the --fs_data_dirs and --fs_wal_dir configuration options are set on the various Tablet Server nodes.
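
From the Impala side, one way to confirm that the table is backed by Kudu rather than by files under the HDFS location is to look at its metadata. This is a minimal sketch using the table from the question; the exact fields in the output vary by Impala version, so no output is shown.

```sql
-- The table parameters should include the Kudu master address(es).
-- The HDFS "Location" is only metastore metadata; no data files are written there.
DESCRIBE FORMATTED impala_kudu.krg_test_table;

-- For a Kudu table this lists its tablets (one per hash partition here)
-- rather than HDFS partition directories.
SHOW PARTITIONS impala_kudu.krg_test_table;
```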

 

Hope that helps,

Mike

OK, thanks.
But we just killed our Kudu server, so we can use Kudu only from Impala.

Super Collaborator

Kudu runs as a separate service that Impala talks to (just as HDFS runs as a separate service from Impala), so you have to have Kudu running somewhere for it to work. However, you don't have to run Kudu on the same servers that you run Impala on; remote reads are supported over the network.
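
Impala finds the Kudu service through the table's kudu.master_addresses property, which was set in the CREATE TABLE statement in the question. If the Kudu masters end up running on a different host, that property can be pointed at the new address. This is only a sketch: kudu-master.example.com is a hypothetical host, and it assumes the table's data actually lives in the Kudu cluster behind that address.

```sql
-- kudu-master.example.com is a placeholder host, not one from this thread.
ALTER TABLE impala_kudu.krg_test_table
SET TBLPROPERTIES ('kudu.master_addresses' = 'kudu-master.example.com:7051');
```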