Created on 08-29-2018 12:36 AM - edited 09-16-2022 06:38 AM
Hi,
some of the Hive queries started to fail with this error message, so I checked the web and community, and usually the response is the setting the autoconvert.join to false can help.
The error appears very quickly, almost immediately after the query is submitted. I know that Hive is trying to create a map-side join but I dont understand why it fails, when X months it was running ok. The actual tables are very small (<1000 rows) so I dont think the hash-table can cause the out-of-memory.
2018-08-29 04:23:46,468 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,468 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,468 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=acquireReadWriteLocks start=1535509426468 end=1535509426480 duration=12 from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Executing command(queryId=hive_20180829042323_49ab08f3-b5e3-45db-9ca5-94ce91d63b11): INSERT OVERWRITE TABLE `prod_work`.`table` SELECT `esub`.`yesterday` AS `yesterday`, `esub`.`product_group` AS `product_group`, `esub`.`priceplan` AS `priceplan`, `esub`.`brand` AS `brand`, `esub`.`brand_desc` AS `brand_desc`, `esub`.`segment` AS `segment`, `esub`.`product_payment_type` AS `product_payment_type`, `esub`.`esub` AS `esub`, `rnch`.`rnch` AS `rnch`, `gradd`.`gradd` AS `gradd` FROM `prod_work`.`tablea` `esub` LEFT JOIN `prod_work`.`tableb` `rnch` ON `esub`.`priceplan` = `rnch`.`priceplan` LEFT JOIN `prod_work`.`tablec` `gradd` ON `esub`.`priceplan` = `gradd`.`priceplan` 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Query ID = hive_20180829042323_49ab08f3-b5e3-45db-9ca5-94ce91d63b11 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Total jobs = 1 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=TimeToSubmit start=1535509426468 end=1535509426480 duration=12 from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,480 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,481 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Starting task [Stage-6:MAPREDLOCAL] in serial mode 2018-08-29 04:23:46,481 INFO org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask: [HiveServer2-Background-Pool: Thread-138349]: Generating plan file file:/tmp/hive/dbb7f6a5-a988-466b-8170-58bbcc924fb9/hive_2018-08-29_04-23-46_256_3553242332203703514-2517/-local-10004/plan.xml 2018-08-29 04:23:46,481 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities> 2018-08-29 04:23:46,481 INFO org.apache.hadoop.hive.ql.exec.Utilities: [HiveServer2-Background-Pool: Thread-138349]: Serializing MapredLocalWork via kryo 2018-08-29 04:23:46,483 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=serializePlan start=1535509426481 end=1535509426483 duration=2 from=org.apache.hadoop.hive.ql.exec.Utilities> 2018-08-29 04:23:46,532 WARN org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Background-Pool: Thread-138349]: HiveConf of name hive.server2.idle.session.timeout_check_operation does not exist 2018-08-29 04:23:46,532 WARN org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Background-Pool: Thread-138349]: HiveConf of name hive.sentry.conf.url does not exist 2018-08-29 04:23:46,532 WARN org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Background-Pool: Thread-138349]: HiveConf of name hive.entity.capture.input.URI does not exist 2018-08-29 04:23:46,543 INFO org.apache.hadoop.hdfs.DFSClient: [HiveServer2-Background-Pool: Thread-138349]: Created token for hive: HDFS_DELEGATION_TOKEN owner=hive/ip-10-197-23-43.eu-west-1.compute.internal@DOMAIN.LOCAL, renewer=hive, realUser=, issueDate=1535509426541, maxDate=1536114226541, sequenceNumber=1381844, masterKeyId=457 on ha-hdfs:hanameservice 2018-08-29 04:23:46,544 INFO org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask: [HiveServer2-Background-Pool: Thread-138349]: Executing: /opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hadoop/bin/hadoop jar /opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/jars/hive-common-1.1.0-cdh5.13.3.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -libjars file:///opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hive/auxlib/hive-exec-1.1.0-cdh5.13.3-core.jar,file:///opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hive/auxlib/hive-exec-core.jar -localtask -plan file:/tmp/hive/dbb7f6a5-a988-466b-8170-58bbcc924fb9/hive_2018-08-29_04-23-46_256_3553242332203703514-2517/-local-10004/plan.xml -jobconffile file:/tmp/hive/dbb7f6a5-a988-466b-8170-58bbcc924fb9/hive_2018-08-29_04-23-46_256_3553242332203703514-2517/-local-10005/jobconf.xml 2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: Execution failed with exit status: 1 2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: Obtaining error information 2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: Task failed! Task ID: Stage-6 Logs: 2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: /var/log/hive/hadoop-cmf-CD-HIVE-nRaFPvFN-HIVESERVER2-ip-10-197-23-43.eu-west-1.compute.internal.log.out 2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask: [HiveServer2-Background-Pool: Thread-138349]: Execution failed with exit status: 1 2018-08-29 04:23:46,584 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask 2018-08-29 04:23:46,584 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=Driver.execute start=1535509426480 end=1535509426584 duration=104 from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,584 INFO org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Completed executing command(queryId=hive_20180829042323_49ab08f3-b5e3-45db-9ca5-94ce91d63b11); Time taken: 0.104 seconds 2018-08-29 04:23:46,584 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,594 INFO org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=releaseLocks start=1535509426584 end=1535509426594 duration=10 from=org.apache.hadoop.hive.ql.Driver> 2018-08-29 04:23:46,595 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-138349]: Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238) at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89) at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
Is it probably related to the HiveServer memory configuration? Do I understand it correctly that this fail happens inside the HiveServer2 JVM, when it tries to build a hash-table for a map-side join?
Thanks
Created 08-30-2018 01:46 AM
Problem resolved: the issue was with the overall memory consumption on the HiveServer2.
The hive job submitted do not run in the HS2 JVM, but as an extra java process and the OS was killing it because it could not allocate such amount of memory.
Created 08-30-2018 01:46 AM
Problem resolved: the issue was with the overall memory consumption on the HiveServer2.
The hive job submitted do not run in the HS2 JVM, but as an extra java process and the OS was killing it because it could not allocate such amount of memory.