Support Questions

HanzalaShaikh · ‎07-12-2021

Hello Experts,

We have identified that 2 records have been duplicated in our hive tables. We have taken the backup of tables in case if we need to rollback. But now when we run insert overwrite command (e.g. insert overwrite table demo select distinct * from demo;)on smallest table with raw volume of "570 GB" but we are getting the following error.

INFO : 2021-07-11 15:33:47,756 Stage-0_0: 122/122 Finished Stage-1_0: 70(+380,-64)/978
INFO : state = STARTED
INFO : state = FAILED
ERROR : Status: Failed
ERROR : FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
DEBUG : Shutting down query insert overwrite table raw_switch.partitiontable select distinct * from raw_switch.partitiontable
INFO : Completed executing command(queryId=hive_19660743242525_d9c3a756-452f-472c-a92e-2b966c37d0ce); Time taken: 4078.407 seconds
DEBUG : Shutting down query insert overwrite table raw_switch.partitiontable select distinct * from raw_switch.partitiontable
Error: Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask (state=08S01,code=3)

Please find below hive server2 logs:-

2021-07-11 15:33:49,834 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: [HiveServer2-Background-Pool: Thread-29919]: Call: delete took 30ms
2021-07-11 15:33:49,834 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-29919]: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

The default parameters of hive are as follows:-

hive.execution.engine=spark;
spark.executor.memory=12g;
spark.executor.cores=4;
hive.optimize.sort.dynamic.partition=true;
hive.exec.dynamic.partition.mode=strict;

Kindly suggest how to resolve this issue. Do we need to change any of the above default parameters or some other parameters which we have missed.
Hope, we are running the correct query of insert overwrite to remove duplicate records.

flowerbirds · ‎07-12-2021

I think you should check yarn logs for more details info.

Cloudera Community

Support Questions

Error while running Insert overwrite query on hive table