Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Impala SQL for KUDU does not work

SOLVED Go to solution
Highlighted

Impala SQL for KUDU does not work

Explorer

CDH 5.14

 

CREATE TABLE kudu_db.fact_patientencounter_kudu
PRIMARY KEY (pk_PatientEncounterID int, fk_AgencyID int)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 4
STORED AS KUDU
AS SELECT pk_PatientEncounterID, fk_AgencyID FROM default.fact_patientencounter;

 

Why does this CREATE TABLE statement not work? This documentation says it should work - https://kudu.apache.org/docs/kudu_impala_integration.html

 

CREATE TABLE AS SELECT

You can create a table by querying any other table or tables in Impala, using a CREATE TABLE …​ AS SELECT statement. The following example imports all rows from an existing table old_table into a Kudu table new_table. The names and types of columns in new_table will determined from the columns in the result set of the SELECT statement. Note that you must additionally specify the primary key and partitioning.

CREATE TABLE new_table
PRIMARY KEY (ts, name)
PARTITION BY HASH(name) PARTITIONS 8
STORED AS KUDU
AS SELECT ts, name, value FROM old_table;

 

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Impala SQL for KUDU does not work

Master Collaborator

Looks like https://issues.apache.org/jira/browse/IMPALA-6454 which was recently fixed.

 

The problem is with the casing of primary key or partition column names. Should work if you convert all column names to lower case in your CTAS.

7 REPLIES 7

Re: Impala SQL for KUDU does not work

Cloudera Employee

Hard to say for certain without the error message, but it's likely that the parser doesn't like the column types being supplied in the PRIMARY KEY clause.

Re: Impala SQL for KUDU does not work

Explorer

This is the actual query

CREATE TABLE kudu_db.fact_patientencounter_kudu
PRIMARY KEY (pk_PatientEncounterID, fk_AgencyID)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 3
STORED AS KUDU
AS SELECT pk_PatientEncounterID, fk_AgencyID FROM default.fact_patientencounter

 

I dont see any errors in the log.

 

This is displayed


ERROR: AnalysisException: null
CAUSED BY: IllegalStateException: null

Re: Impala SQL for KUDU does not work

Explorer

Let me ask the question better, where should I be looking for logs?

 

cdh1 and cdh2 are masters

cdh3-6 are tablet servers

 

Running from impalad on cdh4.

 

The master logs dont show any errors for this transaction.

Re: Impala SQL for KUDU does not work

Explorer

I had to do this in many steps, why does CTAS option not work?

 

DROP TABLE IF EXISTS kudu_db.fact_patientencounter_kudu;

CREATE TABLE kudu_db.fact_patientencounter_kudu
(
pk_PatientEncounterID int,
fk_AgencyID int,
measure_ChuteTime_TimeInSeconds bigint,
measure_Response_TimeInSeconds bigint,
PRIMARY KEY(pk_PatientEncounterID, fk_AgencyID)
)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 3
STORED AS KUDU;

 

INSERT INTO kudu_db.fact_patientencounter_kudu
SELECT pk_PatientEncounterID, fk_AgencyID, measure_ChuteTime_TimeInSeconds, measure_Response_TimeInSeconds FROM default.fact_patientencounter;

Re: Impala SQL for KUDU does not work

Cloudera Employee

Hi AKB,

 

Nothing is expected to show up in Kudu logs, because the failure is occurring in the Impala query analysis stage.  I haven't been able to reproduce something similar using CDH 5.14, for instance the following queries complete successfully:

 

CREATE TABLE fact_patientencounter
(
pk_PatientEncounterID int,
fk_AgencyID int,
measure_ChuteTime_TimeInSeconds bigint,
measure_Response_TimeInSeconds bigint
)
STORED AS PARQUET;

CREATE TABLE fact_patientencounter_kudu
PRIMARY KEY (pk_PatientEncounterID, fk_AgencyID)
PARTITION BY HASH(fk_AgencyID) PARTITIONS 3
STORED AS KUDU
AS SELECT pk_PatientEncounterID, fk_AgencyID FROM fact_patientencounter;

@tmarshalldo you have any ideas about where that null analysis exception might be coming from?

Re: Impala SQL for KUDU does not work

Master Collaborator

Looks like https://issues.apache.org/jira/browse/IMPALA-6454 which was recently fixed.

 

The problem is with the casing of primary key or partition column names. Should work if you convert all column names to lower case in your CTAS.

Re: Impala SQL for KUDU does not work

Explorer

DROP TABLE IF EXISTS kudu_db.fact_patientencounter_kudu;

CREATE TABLE kudu_db.fact_patientencounter_kudu
PRIMARY KEY (pk_patientencounterid, fk_agencyid)
PARTITION BY HASH(fk_agencyid) PARTITIONS 3
STORED AS KUDU
AS SELECT pk_patientencounterid, fk_agencyid FROM default.fact_patientencounter

 

The above does work. Thanks for the quick replies on this.