Impala Kudu distribute by hash() not working
- Labels: Apache Hive, Apache Impala, Apache Kudu
Created on ‎07-21-2016 02:29 PM - edited ‎09-16-2022 03:30 AM
I have Impala (version 2.7) running on a Linux box, with Kudu as the storage manager. I am attempting to create a table as described in the docs:
CREATE TABLE my_first_table (
  id BIGINT,
  name STRING
)
DISTRIBUTE BY HASH (id) INTO 16 BUCKETS
TBLPROPERTIES(
  'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
  'kudu.table_name' = 'my_first_table',
  'kudu.master_addresses' = 'kudu-master.example.com:7051',
  'kudu.key_columns' = 'id'
);
However, the "DISTRIBUTE BY HASH (id)" clause fails with a syntax error near "DISTRIBUTE BY". This create table statement comes directly from the docs (http://kudu.apache.org/docs/kudu_impala_integration.html#kudu_impala_insert_bulk). Any idea what I am doing wrong? I did notice that "partition by" works, but not the distribute clause.
Created ‎07-21-2016 03:02 PM
Hi, can you share the exact error you're receiving?
I assume your Kudu master address is not kudu-master.example.com. Have you tried setting that to the location of your server?
Thanks,
Matt
Created ‎07-22-2016 07:05 AM
Yes, I did that and got the same error message.
Created ‎07-22-2016 01:30 PM
Sorry, I didn't answer all your questions. The complete error is this:
AnalysisException: Syntax error in line 1:
... id BIGINT, name STRING ) DISTRIBUTE BY HASH (id) INTO...
                             ^
Encountered: IDENTIFIER
Expected: CACHED, COMMENT, LOCATION, PARTITIONED, PRODUCED, ROW, STORED, TBLPROPERTIES, UNCACHED, WITH
CAUSED BY: Exception: Syntax error
Created ‎07-22-2016 03:13 PM
While Kudu is still in beta, we have a special version of Impala that we're calling 'Kudu_Impala'. If you're using the regular Impala 2.7 release, you won't have the DISTRIBUTE BY syntax. Can you try using the Impala version from these instructions: http://kudu.apache.org/docs/kudu_impala_integration.html
Thanks
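To check which build a session is actually connected to, Impala's built-in version() function can be run from impala-shell. For comparison, once Kudu support was merged into mainline Impala (2.8 and later), the clause was renamed, so the second statement is a sketch of the same table in that later syntax. It will not parse on the Impala 2.7 builds discussed in this thread, and it assumes the Impala daemons are already configured with the Kudu master address (otherwise a 'kudu.master_addresses' table property would still be needed). The table and column names are the ones from the question.

```sql
-- Show the version string of the impalad this session is connected to.
-- The exact string varies by build, so no expected output is shown here.
SELECT version();

-- Sketch for Impala 2.8 and later only (not the beta Kudu_Impala build):
-- the key is declared inline with PRIMARY KEY, and hash partitioning uses
-- PARTITION BY ... PARTITIONS instead of DISTRIBUTE BY ... INTO ... BUCKETS.
CREATE TABLE my_first_table (
  id BIGINT,
  name STRING,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 16
STORED AS KUDU;
```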
Created ‎07-25-2016 07:18 AM
Thanks MJ, will definitely try that and let you know.
