Reply
AKB
Explorer
Posts: 27
Registered: ‎04-11-2018
Accepted Solution

Impala SQL for KUDU does not work

CDH 5.14

 

CREATE TABLE kudu_db.fact_patientencounter_kudu
PRIMARY KEY (pk_PatientEncounterID int, fk_AgencyID int)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 4
STORED AS KUDU
AS SELECT pk_PatientEncounterID, fk_AgencyID FROM default.fact_patientencounter;

 

Why does this CREATE TABLE statement not work? This documentation says it should work - https://kudu.apache.org/docs/kudu_impala_integration.html

 

CREATE TABLE AS SELECT

You can create a table by querying any other table or tables in Impala, using a CREATE TABLE …​ AS SELECT statement. The following example imports all rows from an existing table old_table into a Kudu table new_table. The names and types of columns in new_table will determined from the columns in the result set of the SELECT statement. Note that you must additionally specify the primary key and partitioning.

CREATE TABLE new_table
PRIMARY KEY (ts, name)
PARTITION BY HASH(name) PARTITIONS 8
STORED AS KUDU
AS SELECT ts, name, value FROM old_table;

 

Cloudera Employee
Posts: 19
Registered: ‎09-28-2015

Re: Impala SQL for KUDU does not work

Hard to say for certain without the error message, but it's likely that the parser doesn't like the column types being supplied in the PRIMARY KEY clause.

AKB
Explorer
Posts: 27
Registered: ‎04-11-2018

Re: Impala SQL for KUDU does not work

This is the actual query

CREATE TABLE kudu_db.fact_patientencounter_kudu
PRIMARY KEY (pk_PatientEncounterID, fk_AgencyID)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 3
STORED AS KUDU
AS SELECT pk_PatientEncounterID, fk_AgencyID FROM default.fact_patientencounter

 

I dont see any errors in the log.

 

This is displayed


ERROR: AnalysisException: null
CAUSED BY: IllegalStateException: null

AKB
Explorer
Posts: 27
Registered: ‎04-11-2018

Re: Impala SQL for KUDU does not work

Let me ask the question better, where should I be looking for logs?

 

cdh1 and cdh2 are masters

cdh3-6 are tablet servers

 

Running from impalad on cdh4.

 

The master logs dont show any errors for this transaction.

AKB
Explorer
Posts: 27
Registered: ‎04-11-2018

Re: Impala SQL for KUDU does not work

I had to do this in many steps, why does CTAS option not work?

 

DROP TABLE IF EXISTS kudu_db.fact_patientencounter_kudu;

CREATE TABLE kudu_db.fact_patientencounter_kudu
(
pk_PatientEncounterID int,
fk_AgencyID int,
measure_ChuteTime_TimeInSeconds bigint,
measure_Response_TimeInSeconds bigint,
PRIMARY KEY(pk_PatientEncounterID, fk_AgencyID)
)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 3
STORED AS KUDU;

 

INSERT INTO kudu_db.fact_patientencounter_kudu
SELECT pk_PatientEncounterID, fk_AgencyID, measure_ChuteTime_TimeInSeconds, measure_Response_TimeInSeconds FROM default.fact_patientencounter;

Cloudera Employee
Posts: 19
Registered: ‎09-28-2015

Re: Impala SQL for KUDU does not work

Hi AKB,

 

Nothing is expected to show up in Kudu logs, because the failure is occurring in the Impala query analysis stage.  I haven't been able to reproduce something similar using CDH 5.14, for instance the following queries complete successfully:

 

CREATE TABLE fact_patientencounter
(
pk_PatientEncounterID int,
fk_AgencyID int,
measure_ChuteTime_TimeInSeconds bigint,
measure_Response_TimeInSeconds bigint
)
STORED AS PARQUET;

CREATE TABLE fact_patientencounter_kudu
PRIMARY KEY (pk_PatientEncounterID, fk_AgencyID)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 3
STORED AS KUDU
AS SELECT pk_PatientEncounterID, fk_AgencyID FROM fact_patientencounter;

@tmarshalldo you have any ideas about where that null analysis exception might be coming from?

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: Impala SQL for KUDU does not work

Looks like https://issues.apache.org/jira/browse/IMPALA-6454 which was recently fixed.

 

The problem is with the casing of primary key or partition column names. Should work if you convert all column names to lower case in your CTAS.

AKB
Explorer
Posts: 27
Registered: ‎04-11-2018

Re: Impala SQL for KUDU does not work

DROP TABLE IF EXISTS kudu_db.fact_patientencounter_kudu;

CREATE TABLE kudu_db.fact_patientencounter_kudu
PRIMARY KEY (pk_patientencounterid, fk_agencyid)
PARTITION BY HASH(fk_agencyid) PARTITIONS 3
STORED AS KUDU
AS SELECT pk_patientencounterid, fk_agencyid FROM default.fact_patientencounter

 

The above does work. Thanks for the quick replies on this.

Announcements