Using Spark SQL with Spark 2.2.0, the following query fails with an error.
Query (as printed in the Spark exception on the console):
CREATE EXTERNAL TABLE IF NOT EXISTS `databaseName`.`tableName` (some field names . . .) PARTITIONED BY (`tenant` STRING, `year` STRING, `month` STRING, `day` STRING, `hour` STRING, `minute` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '~' LINES TERMINATED BY '
' STORED AS ORC LOCATION 'hdfs://clusterName:8020/StorageLocation/'
Error: org.apache.spark.sql.catalyst.parser.ParseException: Operation not allowed: ROW FORMAT DELIMITED is only compatible with 'textfile', not 'orc'(line 1, pos 0)
This error does not occur when running the same query as HiveQL from the Hive CLI, in Hive View via Ambari, or even over Hive JDBC. Why does it cause an error in Spark SQL?
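I believe the answer is that the clause
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '~' LINES TERMINATED BY '^^^'
is simply HiveQL that Spark SQL does not support together with STORED AS ORC. Rejecting it is reasonable, because the ORC format does not use field, escape, or line delimiters, so the clause has no effect on an ORC table anyway.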
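If the DDL is generated by code, one possible workaround is to drop the ROW FORMAT DELIMITED clause before handing the statement to Spark. The following is only a rough sketch under some assumptions: the helper name strip_row_format_delimited and the column `someField` are made up for illustration (the real column list is elided in the question), and the regular expression naively assumes the clause is immediately followed by STORED AS, as it is in the query above.

```python
import re

# Naive pattern: everything from ROW FORMAT DELIMITED up to (but not including)
# the STORED AS keyword. Assumes no delimiter literal contains "STORED AS".
_ROW_FORMAT_DELIMITED = re.compile(
    r"ROW\s+FORMAT\s+DELIMITED\b.*?(?=STORED\s+AS\b)",
    re.IGNORECASE | re.DOTALL,
)


def strip_row_format_delimited(ddl: str) -> str:
    """Return the DDL with any ROW FORMAT DELIMITED clause removed (hypothetical helper)."""
    return _ROW_FORMAT_DELIMITED.sub("", ddl)


# `someField` is a placeholder column; the partition list is shortened for brevity.
ddl = (
    "CREATE EXTERNAL TABLE IF NOT EXISTS `databaseName`.`tableName` (`someField` STRING) "
    "PARTITIONED BY (`tenant` STRING) "
    "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '~' LINES TERMINATED BY '\n' "
    "STORED AS ORC LOCATION 'hdfs://clusterName:8020/StorageLocation/'"
)

cleaned = strip_row_format_delimited(ddl)
print(cleaned)
# With a SparkSession named `spark`, the cleaned statement would then be run
# via spark.sql(cleaned).
```

Since ORC ignores the delimiters, the table created from the cleaned statement should behave the same as the one Hive creates from the original query.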