Support Questions

Alexander_Rumak · 07-31-2018

Using Spark SQL with Spark 2.2.0, the following query results in an error.

Query (as printed by the Spark exception in the console):

CREATE EXTERNAL TABLE IF NOT EXISTS `databaseName`.`tableName` (some field names . . .) PARTITIONED BY (`tenant` STRING, `year` STRING, `month` STRING, `day` STRING, `hour` STRING, `minute` STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '~' LINES TERMINATED BY '

^^^

' STORED AS ORC LOCATION 'hdfs://clusterName:8020/StorageLocation/'

Error: org.apache.spark.sql.catalyst.parser.ParseException: Operation not allowed: ROW FORMAT DELIMITED is only compatible with 'textfile', not 'orc'(line 1, pos 0)

This error does not occur when running the same HiveQL through the Hive CLI, in Hive View via Ambari, or even through Hive JDBC. Why does it cause an error in Spark SQL?

Alexander_Rumak · 07-31-2018

I believe the answer is that the clause

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ESCAPED BY '~' LINES TERMINATED BY '^^^'

is simply not supported by Spark SQL for an ORC table, and it makes sense that it is rejected, since the ORC format does not use field or line delimiters.

nramanaiah · 08-01-2018

This validation was added to Spark intentionally with SPARK-15279, because it doesn't make sense to specify delimiters for ORC or Parquet files.
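For illustration, here is a minimal sketch of how the statement from the question might be rewritten so that Spark SQL accepts it. The column names `id` and `payload` and the table name `tableName_text` are hypothetical placeholders, since the original field list was elided. The second statement assumes the intended line terminator was a newline; the `^^^` in the console output looks like the parser's error-position marker rather than part of the query.

```sql
-- ORC table: drop the ROW FORMAT DELIMITED clause entirely.
-- ORC is a self-describing binary format, so delimiters do not apply.
CREATE EXTERNAL TABLE IF NOT EXISTS `databaseName`.`tableName` (
  `id` BIGINT,       -- hypothetical column
  `payload` STRING   -- hypothetical column
)
PARTITIONED BY (`tenant` STRING, `year` STRING, `month` STRING,
                `day` STRING, `hour` STRING, `minute` STRING)
STORED AS ORC
LOCATION 'hdfs://clusterName:8020/StorageLocation/';

-- Delimited text table: ROW FORMAT DELIMITED is accepted here,
-- because the storage format is TEXTFILE.
CREATE EXTERNAL TABLE IF NOT EXISTS `databaseName`.`tableName_text` (
  `id` BIGINT,       -- hypothetical column
  `payload` STRING   -- hypothetical column
)
PARTITIONED BY (`tenant` STRING, `year` STRING, `month` STRING,
                `day` STRING, `hour` STRING, `minute` STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ',' ESCAPED BY '~'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'hdfs://clusterName:8020/StorageLocation/';
```

Both statements use Hive-style DDL, so they assume a Spark session with Hive support enabled. In practice the two tables would point at different locations; the same `LOCATION` is reused here only to mirror the original query.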

Cloudera Community

Why Row Format Delimited does not work with Spark SQL ORC Format?