I have Impala (version 2.7) running on a Linux box, with Kudu as the storage manager. I am trying to create a table the way the docs show:
CREATE TABLE my_first_table (
  id BIGINT,
  name STRING
)
DISTRIBUTE BY HASH (id) INTO 16 BUCKETS
TBLPROPERTIES(
  'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
  'kudu.table_name' = 'my_first_table',
  'kudu.master_addresses' = 'kudu-master.example.com:7051',
  'kudu.key_columns' = 'id'
);
However, the "DISTRIBUTE BY HASH (id)" clause causes a syntax error near "DISTRIBUTE BY". This CREATE TABLE statement comes directly from the docs (http://kudu.apache.org/docs/kudu_impala_integration.html#kudu_impala_insert_bulk). Any idea what I am doing wrong? I did notice that "partition by" works, but not the distribute clause.
Hi, can you share the exact error you're receiving?
I assume your Kudu master address is not kudu-master.example.com. Have you tried setting that to the address of your own server?
Sorry, I didn't answer all your questions. The complete error is this:
AnalysisException: Syntax error in line 1:
... id BIGINT, name STRING ) DISTRIBUTE BY HASH (id) INTO...
                             ^
Encountered: IDENTIFIER
Expected: CACHED, COMMENT, LOCATION, PARTITIONED, PRODUCED, ROW, STORED, TBLPROPERTIES, UNCACHED, WITH
CAUSED BY: Exception: Syntax error
While Kudu is still in beta, we have a special version of Impala that we're calling 'Kudu_Impala'. If you're using the regular Impala 2.7 release, you won't have the DISTRIBUTE BY syntax at all, which is why the parser rejects it. Can you try the Impala version from these instructions? http://kudu.apache.org/docs/kudu_impala_integration.html
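As a rough sketch, one quick way to confirm which build a session is connected to is Impala's built-in version() function, which returns the daemon's build string. What a Kudu-enabled build reports is not shown here, since that depends on the package that was installed.

```sql
-- Run from impala-shell; returns the version/build string of the connected impalad.
SELECT version();
```

If the string reports a stock 2.7 release, the DISTRIBUTE BY clause will not parse there.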