I have been copying data from a non-partitioned Hive table to a partitioned Hive table, but it fails with the following error and log message.
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:2
INFO : Submitting tokens for job: job_1455271075351_0034
INFO : The url to track the job: http://quickstart.cloudera:8088/proxy/application_1455271075351_0034/
INFO : Starting Job = job_1455271075351_0034, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1455271075351_0034/
INFO : Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1455271075351_0034
INFO : Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
INFO : 2016-02-18 03:56:36,684 Stage-1 map = 0%, reduce = 0%
INFO : 2016-02-18 03:56:45,448 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 2.35 sec
INFO : 2016-02-18 03:57:10,663 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.35 sec
INFO : MapReduce Total cumulative CPU time: 2 seconds 350 msec
ERROR : Ended Job = job_1455271075351_0034 with errors
The task logs show a fatal Java heap space error.
Excerpt from the log:
2016-02-18 03:57:10,320 FATAL [IPC Server handler 0 on 44265] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1455271075351_0034_m_000000_3 - exited : Java heap space
I am trying to copy just 16629 rows from the source table to the destination partitioned table. I think the partitioning is causing the issue: we have created partitions on Year, Month, Day, and Account (the columns used in most of the filters), and the insert is trying to create a total of 2578 partitions for those 16629 rows. I believe this is the cause of the Java heap space error.
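For context, a dynamic-partition insert of this shape typically looks like the sketch below (table and column names are placeholders, not the actual schema). Two details are worth noting: the default caps of 1000 total / 100 per-node dynamic partitions would have to be raised to create ~2578 partitions, and hive.optimize.sort.dynamic.partition clusters rows by partition key so each task keeps only one partition writer open at a time, which is a common remedy for heap errors during many-partition inserts:

```sql
-- Placeholder table/column names; adjust to the real schema.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
-- Defaults are 1000 total / 100 per node; ~2578 partitions needs more:
SET hive.exec.max.dynamic.partitions = 5000;
SET hive.exec.max.dynamic.partitions.pernode = 5000;
-- Sort rows by partition key so only one writer is open per task:
SET hive.optimize.sort.dynamic.partition = true;

INSERT INTO TABLE t_partitioned PARTITION (year, month, day, account)
SELECT col1, col2, year, month, day, account
FROM t_table;
```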
I have tried setting mapreduce.map.memory.mb=8192 and mapreduce.reduce.memory.mb=8192 in the Hive CLI.
In another attempt I also tried export HADOOP_CLIENT_OPTS='-Xmx8g'.
Still, inserting data from one table to another throws the same error.
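For reference, the session-level form of the settings tried above would look like the sketch below (same values, not a recommendation). Two caveats that may explain why the attempts had no effect: mapreduce.*.memory.mb only enlarges the YARN container, so the task JVM heap (mapreduce.*.java.opts) usually needs to be raised alongside it, and HADOOP_CLIENT_OPTS only affects the local client JVM, not the map task where this OOM occurs:

```sql
-- Container sizes for map/reduce tasks in MB, as tried above:
SET mapreduce.map.memory.mb = 8192;
SET mapreduce.reduce.memory.mb = 8192;
-- The task JVM heap must also fit inside the container (~80% is a common rule of thumb):
SET mapreduce.map.java.opts = -Xmx6553m;
SET mapreduce.reduce.java.opts = -Xmx6553m;
```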
Any other pointers?
@Nirvana India Those numbers are based on different clusters. You have to find the right numbers for your own cluster.
If you are a supported customer, use SmartSense.
I am still facing some weird issues. Copying data from an existing table to a new table with CREATE TABLE clone AS SELECT * FROM t_table works just fine. On the other hand, copying data from an existing table to another existing table with INSERT INTO table_clone SELECT column1, col2, .... FROM t_table throws the heap space error. The source table is the same in both cases.
I have tried different sizes for the container, mapper, and reducer (mapreduce.map.java.opts -Xmx5124m, and so on), but it throws the same error every time.
A few of the settings are:
yarn.scheduler.minimum-allocation-mb : 4GB
yarn.scheduler.maximum-allocation-mb : 6GB
Container memory : 18 GB
mapreduce.map.memory.mb : 6 GB
mapreduce.reduce.memory.mb : 8 GB
mapreduce.map.java.opts : -Xmx5124m
mapreduce.reduce.java.opts : -Xmx6144m
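One sanity check worth applying to the values above: each container size must fall between yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb, and each -Xmx heap must fit inside its container. With the numbers listed, mapreduce.reduce.memory.mb (8 GB) exceeds the 6 GB maximum allocation, so YARN will cap or reject that request. A self-consistent sketch (illustrative values only, not tuned for this cluster):

```sql
-- Illustrative values, assuming yarn.scheduler.maximum-allocation-mb = 6144:
SET mapreduce.map.memory.mb = 6144;        -- must be <= YARN maximum allocation
SET mapreduce.map.java.opts = -Xmx4915m;   -- ~80% of the container
SET mapreduce.reduce.memory.mb = 6144;
SET mapreduce.reduce.java.opts = -Xmx4915m;
```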
I am not able to copy data from a non-partitioned table to another non-partitioned table either, though the main requirement is to copy from a non-partitioned table to a partitioned table.
@Nirvana India I noticed you're using the Cloudera VM. Cloudera has a different approach to running Hive than Hortonworks, so unfortunately we cannot give you a definitive answer: the underlying architecture is different. Hive runs on Tez in the Hortonworks distribution of Hadoop; on Cloudera it does not. Unless you run your workload on the Hortonworks Sandbox VM, you will most likely not solve your problem here. I suggest you try your queries on the Sandbox, and then we can definitely help you if you run into problems. Cheers.