
Spark SQL insert overwrite with avro format?

I'm having issues with INSERT OVERWRITE into Avro-format tables in Spark. No matter what I do, I get the same error. It occurs on queries as simple as "INSERT OVERWRITE TABLE avroTable SELECT * FROM avroTable", and on anything more complex as long as the target table is Avro. Does anyone know of a solution to this?

Here is the error in question:
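For reference, a minimal setup that hits this for me looks roughly like the following (the table name and columns here are illustrative; the real tables are wider, but any Avro-backed Hive table seems to trigger it):

```sql
-- Illustrative Avro-backed Hive table; actual column list doesn't
-- seem to matter, only that the table is STORED AS AVRO.
CREATE TABLE avroTable (id INT, name STRING)
STORED AS AVRO;

-- The simplest statement that fails, run via sqlContext.sql(...):
INSERT OVERWRITE TABLE avroTable SELECT * FROM avroTable;
```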
An error occurred while calling o43.sql.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1020.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1020.0 (TID 4790): org.apache.hadoop.hive.serde2.SerDeException: Encountered exception determining schema. Returning signal schema to indicate problem: null
	at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:523)
	at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:97)
	at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:88)
	at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:81)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:92)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:84)
	at org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$3.apply(InsertIntoHiveTable.scala:84)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:89)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
