05-01-2019 04:02 PM
I can't find a definitive answer (yet) to this question:
Is "PARTITIONED BY" supported in a CTAS (CREATE TABLE AS SELECT) query?
I want to create a partitioned table with a CTAS query using the MapReduce engine. I could use Spark, but I typically use MapReduce because of the volume of data. I do not have Impala available to me.
This documentation appears to suggest that I can (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_create_table.html), but I can't get it to work. The documentation may pertain only to Impala, which is not installed in my shop. See the specific documentation snippet below.
I'm running CDH 5.10. See additional details (far) below.
CREATE TABLE AS SELECT:
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
  [PARTITIONED BY (col_name[, ...])]
  [COMMENT 'table_comment']
  [WITH SERDEPROPERTIES ('key1'='value1', 'key2'='value2', ...)]
  [
   [ROW FORMAT row_format] [STORED AS ctas_file_format]
  ]
  [LOCATION 'hdfs_path']
  [TBLPROPERTIES ('key1'='value1', 'key2'='value2', ...)]
  [CACHED IN 'pool_name' [WITH REPLICATION = integer] | UNCACHED]
AS
  select_statement
Subversion http://github.com/cloudera/hadoop -r d11d609073f120d283c34b9e95725c83c7468000
Compiled by jenkins on 2017-06-27T04:03Z
Compiled with protoc 2.5.0
From source with checksum e1845786b58ee858e84010f49db44e
This command was run using /opt/cloudera/parcels/CDH-5.10.2-1.cdh5.10.2.p3257.3508/jars/hadoop-common-2.6.0-cdh5.10.2.jar
05-02-2019 11:27 AM - edited 05-02-2019 11:28 AM
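I understand that you are using CDH 5.10, which ships with Hive 1.1. Hive 1.1 does not support PARTITIONED BY in a CTAS statement; the syntax you quoted is from the Impala documentation.
Here is the JIRA that tracks adding this support to Hive: https://issues.apache.org/jira/browse/HIVE-20241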
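Until that support is available, a common workaround in Hive is to split the CTAS into two steps: create the empty partitioned table first, then fill it with a dynamic-partition INSERT. The following is only a minimal sketch. The database, table, and column names (source_db.events, target_db.events_part, event_id, payload, event_date) and the Parquet file format are hypothetical placeholders, not taken from your environment.

```sql
-- Hypothetical names: source_db.events is an existing table,
-- target_db.events_part is the new partitioned table,
-- event_date is the partition column. Adjust columns and file format to your data.

-- 1. Create the empty partitioned table. The partition column is declared
--    only in PARTITIONED BY, not in the regular column list.
CREATE TABLE IF NOT EXISTS target_db.events_part (
  event_id BIGINT,
  payload  STRING
)
PARTITIONED BY (event_date STRING)
STORED AS PARQUET;

-- 2. Let Hive derive the partition values from the query result.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- 3. Populate the table. The partition column must be the last column
--    in the SELECT list.
INSERT OVERWRITE TABLE target_db.events_part PARTITION (event_date)
SELECT event_id, payload, event_date
FROM source_db.events;
```

If the query produces a large number of partitions, you may also need to raise hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode.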