How to load delta data on a daily basis into a Hive table that uses the Druid storage handler

Explorer

Hi,

I am creating a table as follows (statement below). The data up to, say, 10-Apr-2018 is loaded. How do I load data from 11-Apr to the latest day?

If I run INSERT INTO TABLE test_druid, it fails. Do I need to drop the month segment (Apr-18) and reload the data for the entire month of Apr-18? If so, can you please give the steps to do this from Hive?

I am using beeline to do all my operations.

CREATE TABLE test_druid
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "test_druid", "druid.segment.granularity" = "MONTH", "druid.query.granularity" = "DAY")
AS
SELECT cast(trans_date as timestamp) as `__time`, col1, col2, col3
FROM testdb.test_hive_Table
WHERE to_date(trans_Date) >= '2018-01-01';


Expert Contributor

Use an INSERT INTO statement:

create table test_table(`timecolumn` timestamp, `userid` string, `num_l` float);
insert into test_table values ('2015-01-08 00:00:00', 'i1-start', 4);
CREATE TABLE druid_table (`__time` timestamp, `userid` string, `num_l` float)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "DAY");
INSERT INTO TABLE druid_table
select cast(`timecolumn` as timestamp) as `__time`, `userid`, `num_l` FROM test_table;
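
Applied to the table from the question, a daily delta load might look like the sketch below. It assumes that the Hive version in use supports INSERT INTO for Druid-backed tables (the original poster reports that it fails in their environment) and that 2018-04-10 is the last day already loaded. The table and column names are taken from the question; the date literal is only an illustration.

```sql
-- Sketch only: append rows newer than the last loaded day.
-- Assumes INSERT INTO is supported for Druid-backed tables in this Hive version.
-- '2018-04-10' is the last loaded day mentioned in the question; adjust per run.
INSERT INTO TABLE test_druid
SELECT cast(trans_date AS timestamp) AS `__time`, col1, col2, col3
FROM testdb.test_hive_Table
WHERE to_date(trans_date) > '2018-04-10';
```

For a scheduled daily job, the date literal would typically be passed in as a parameter rather than hard-coded.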

Explorer

When we create the table as follows:

CREATE TABLE druid_table (`__time` timestamp, `userid` string, `num_l` float)
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "DAY");

we get the following error:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: java.io.FileNotFoundException: File /tmp/workingDirectory/.staging-hive_20180507130925_227f2e48-d049-464e-b2cd-43009b3398b3/segmentsDescriptorDir does not exist. (state=08S01,code=1)

Can you please help? 

avatar

Hello,

When I try to create a table from beeline using the Druid storage handler, I get the error below. Could you please advise on how to proceed?

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: java.sql.SQLException: Cannot create PoolableConnectionFactory (Access denied for user)