parquet snappy file loading into hive
Labels: Apache Hive
Created ‎08-08-2017 12:58 PM
I transferred a Parquet file with Snappy compression from a Cloudera system to a Hortonworks system.
I want to load this file into Hive.
Path: /test/kpi
Command used, from Hive 2.0:
CREATE EXTERNAL TABLE tbl_test like PARQUET '/test/kpi/part-r-00000-0c9d846a-c636-435d-990f-96f06af19cee.snappy.parquet' STORED AS PARQUET LOCATION '/test/kpi';
ERROR: org.apache.hive.service.cli.HiveSQLException:
Error while compiling statement: FAILED: ParseException line 1:44 cannot recognize input near 'PARQUET' ''/test/kpi/part-r-00000-0c9d846a-c636-435d-990f-96f06af19cee.snappy.parquet'' 'STORED' in table name
Created ‎08-08-2017 06:45 PM
Hi @avinash midatani. I suspect the "LIKE PARQUET..." syntax is only valid in Impala.
Your CREATE TABLE statement might need to look more like this (with explicit column definitions and without the "LIKE PARQUET" block):
CREATE EXTERNAL TABLE tbl_test (col1 datatype1, col2 datatype2, ..., coln datatypen) STORED AS PARQUET LOCATION '/test/kpi';
I hope this helps.
Created ‎08-08-2017 06:48 PM
Also read this HCC post for more information: https://community.hortonworks.com/questions/5833/create-hive-table-to-read-parquet-files-from-parqu....
Created ‎08-09-2017 07:29 AM
Thanks @bpreachuk for the update.
I am looking for a solution that automatically creates the table structure in Hive based on the Parquet files from Cloudera.
That is, I want a solution on Hortonworks that behaves like Impala's "LIKE PARQUET".
Created ‎08-09-2017 09:56 AM
Hi @avinash midatani. As mentioned in that other HCC post, this capability is not in Hive yet. The JIRA tracking the request is here: https://issues.apache.org/jira/browse/HIVE-10593
The Spark code from @Alexander Bij in the HCC post provides that functionality: it creates the Hive table structure automatically based on the Parquet file metadata. https://community.hortonworks.com/questions/5833/create-hive-table-to-read-parquet-files-from-parqu....
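For anyone who wants the general shape of that workaround without opening the link, here is a rough Python sketch of the idea: let Spark read the schema from the Parquet files, then generate the CREATE EXTERNAL TABLE statement from it. This is not the code from the linked post. The function names `build_create_table` and `ddl_from_parquet` are made up for illustration, and the sketch assumes you pass in a SparkSession with Hive support enabled and that the type strings Spark reports in `DataFrame.dtypes` are acceptable to your Hive version.

```python
def build_create_table(table, location, columns):
    """Build a Hive DDL string from (column name, type string) pairs.

    Hypothetical helper: column names are wrapped in backticks, and the
    type strings are passed through unchanged.
    """
    cols = ", ".join("`{}` {}".format(name, dtype) for name, dtype in columns)
    return (
        "CREATE EXTERNAL TABLE {} ({}) "
        "STORED AS PARQUET LOCATION '{}'".format(table, cols, location)
    )


def ddl_from_parquet(spark, table, location):
    """Read the Parquet schema with Spark and turn it into Hive DDL.

    Assumes `spark` is an existing SparkSession; `df.dtypes` is a list of
    (column name, type string) tuples taken from the file metadata.
    """
    df = spark.read.parquet(location)
    return build_create_table(table, location, df.dtypes)


# Example with a hand-written column list (no Spark needed for this part):
print(build_create_table("tbl_test", "/test/kpi",
                         [("id", "bigint"), ("name", "string")]))
```

With a real session you would then run something like `spark.sql(ddl_from_parquet(spark, "tbl_test", "/test/kpi"))`. It is worth checking the generated statement by eye first, since column names containing backticks or unusual types are not handled here.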
Created ‎08-09-2017 02:00 PM
Glad to hear this is still a useful workaround 🙂
