Member since 12-08-2015
24 Posts
22 Kudos Received
6 Solutions
02-15-2016 07:19 AM
I think my DDL is right, but Spark (via Zeppelin) isn't seeing the CSVSerDe. Any thoughts? I've tried various driver-class-path and library-class-path settings, both in the Zeppelin interpreter settings and in the Spark configuration via Ambari, but haven't figured this one out yet.
Specifically, this is via Zeppelin on the HDP Sandbox 2.3.2.0-2950:
%sql
CREATE EXTERNAL TABLE demo_table (
  rid STRING,
  hospital_name STRING,
  hospital_type STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/user/zeppelin/directory'
TBLPROPERTIES ('skip.header.line.count'='1')
org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hadoop.hive.serde2.OpenCSVSerde
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:350)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:327)
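
"Cannot validate serde" usually suggests that the class could not be loaded, so one way to narrow this down is to check which jars (if any) actually contain `OpenCSVSerde.class`. The following is only a minimal sketch: the helper names `class_entry` and `jars_containing` are made up for illustration, and the directories to scan are whatever you pass on the command line, since the jar locations on the Sandbox are not confirmed here.

```python
import sys
import zipfile
from pathlib import Path

# Fully qualified class name taken from the DDL above.
SERDE_CLASS = "org.apache.hadoop.hive.serde2.OpenCSVSerde"


def class_entry(class_name):
    """Convert a dotted Java class name to its path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(class_name, search_dirs):
    """Return paths of jars under search_dirs that contain class_name.

    Hypothetical helper: search_dirs is supplied by the caller; no
    Sandbox-specific locations are assumed.
    """
    entry = class_entry(class_name)
    hits = []
    for directory in search_dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # skip directories that do not exist
        for jar in sorted(root.rglob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as zf:
                    if entry in zf.namelist():
                        hits.append(str(jar))
            except (zipfile.BadZipFile, OSError):
                continue  # unreadable or corrupt jar; ignore it
    return hits


if __name__ == "__main__":
    # Usage: python find_serde.py <dir> [<dir> ...]
    for hit in jars_containing(SERDE_CLASS, sys.argv[1:] or ["."]):
        print(hit)
```

If no jar on the directories Spark loads from contains the class, then classpath settings pointing at those directories would not be expected to help; if a jar elsewhere does contain it, that jar is the candidate to add to the driver class path.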
02-14-2016 09:24 PM
I'd like to be able to use the CSVSerDe from within Spark SQL. Do you know what configuration changes need to be made (on the Sandbox or otherwise) for Spark to have the CSVSerDe on its classpath?