Impala Kudu distribute by hash() not working

Frequent Visitor

I have Impala (version 2.7) running on a Linux box, with Kudu as the storage manager. I am attempting to create a table as described in the docs, like this:

 

CREATE TABLE my_first_table (
  id BIGINT,
  name STRING
)
DISTRIBUTE BY HASH (id) INTO 16 BUCKETS
TBLPROPERTIES(
  'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
  'kudu.table_name' = 'my_first_table',
  'kudu.master_addresses' = 'kudu-master.example.com:7051',
  'kudu.key_columns' = 'id'
);

 

However, the "DISTRIBUTE BY HASH (id)" clause causes a syntax error near "DISTRIBUTE BY". This CREATE TABLE statement comes directly from the docs (http://kudu.apache.org/docs/kudu_impala_integration.html#kudu_impala_insert_bulk). Any idea what I am doing wrong? I did notice that "partition by" works, but not the distribute clause.


Contributor

Hi, can you share the exact error you're receiving?

 

I assume your Kudu master address is not actually kudu-master.example.com. Have you tried setting it to the address of your own server?

 

Thanks,

Matt

Frequent Visitor

Yes, I did that and got the same error message.

Frequent Visitor

Sorry, I didn't answer all your questions. The complete error is this:

 

AnalysisException: Syntax error in line 1:
... id BIGINT, name STRING ) DISTRIBUTE BY HASH (id) INTO...
                             ^
Encountered: IDENTIFIER
Expected: CACHED, COMMENT, LOCATION, PARTITIONED, PRODUCED, ROW, STORED, TBLPROPERTIES, UNCACHED, WITH

CAUSED BY: Exception: Syntax error

Contributor

While Kudu is still in beta, we have a special version of Impala that we're calling 'Kudu_Impala'. If you're using the regular Impala 2.7 release, you won't have the DISTRIBUTE BY syntax. Can you try using the Impala version from these instructions: http://kudu.apache.org/docs/kudu_impala_integration.html

 

Thanks
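
One way to confirm which Impala build a session is actually talking to is Impala's built-in version() function. This is only a minimal sketch: the thread does not show what version string the Kudu-enabled build reports, so how to read the output is an assumption to check against your own install.

```sql
-- Run from impala-shell or any other Impala client.
-- Returns the version string of the impalad this session is connected to.
-- Assumption: the Kudu-enabled build identifies itself differently from a
-- stock 2.7 release; the exact string is not given in this thread.
SELECT version();
```

If this reports a stock Impala 2.7 build, the parser has no DISTRIBUTE BY clause and will reject the statement with the syntax error quoted earlier in the thread.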

Frequent Visitor

Thanks MJ, will definitely try that and let you know.