Created 11-11-2021 10:34 PM
Hello All,
I have a job running using Hive on Spark (Spark 1.6), and several times I have gotten the intermittent error below, but if I rerun the job it succeeds. Is there any solution for this case?
Job aborted due to stage failure: Task 2 in stage 99.0 failed 4 times, most recent failure: Lost task 2.3 in stage 99.0 (TID 610, somenode.company.com, executor 34): java.lang.RuntimeException: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"column1":null,"column2":"ABCDEFGH","column3":null,"column4":null,"column5":null,"column6":null,"column7":0,"column8":null,"column9":"2021-11-07 17:43:59","column10":null,"column11":null,"column12":null)
at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:154)
at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:48)
at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:95)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2022)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2022)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
Created 11-12-2021 02:12 AM
Please provide the Spark application log and the HMS log from that time.
Created 11-12-2021 04:00 AM
Try setting the parameters below and then execute the query again:
set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
If this doesn't resolve the issue, then try:
set hive.auto.convert.join=false;
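For context, here is a minimal sketch of how these properties could be applied at the session level right before re-running the job; the table and query names are hypothetical placeholders, not taken from the original workload:
-- Session-level settings; they only affect the current Hive session.
-- Disable vectorized execution for map and reduce stages:
set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
-- Only if the failure persists, also disable automatic map-join conversion:
set hive.auto.convert.join=false;
-- Then re-run the original query in the same session, e.g.:
-- INSERT OVERWRITE TABLE target_table SELECT ... FROM source_table;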
Created 11-17-2021 09:18 PM
@sokrates, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
Regards,
Vidya Sargur,