Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Hive error: "return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask"

avatar

Hi, 

 some of the Hive queries started to fail with this error message, so I checked the web and community, and usually the response is the setting the autoconvert.join to false can help. 

The error appears very quickly, almost immediately after the query is submitted. I know that Hive is trying to create a map-side join but I dont understand why it fails, when X months it was running ok. The actual tables are very small (<1000 rows) so I dont think the hash-table can cause the out-of-memory.

 

2018-08-29 04:23:46,468 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,468 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,468 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=acquireReadWriteLocks start=1535509426468 end=1535509426480 duration=12 from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Executing command(queryId=hive_20180829042323_49ab08f3-b5e3-45db-9ca5-94ce91d63b11): INSERT OVERWRITE TABLE `prod_work`.`table`  SELECT
    `esub`.`yesterday` AS `yesterday`,
    `esub`.`product_group` AS `product_group`,
    `esub`.`priceplan` AS `priceplan`,
    `esub`.`brand` AS `brand`,
    `esub`.`brand_desc` AS `brand_desc`,
    `esub`.`segment` AS `segment`,
    `esub`.`product_payment_type` AS `product_payment_type`,
    `esub`.`esub` AS `esub`,
    `rnch`.`rnch` AS `rnch`,
    `gradd`.`gradd` AS `gradd`
  FROM `prod_work`.`tablea` `esub`
  LEFT JOIN `prod_work`.`tableb` `rnch`
    ON `esub`.`priceplan` = `rnch`.`priceplan`
  LEFT JOIN `prod_work`.`tablec` `gradd`
    ON `esub`.`priceplan` = `gradd`.`priceplan`
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Query ID = hive_20180829042323_49ab08f3-b5e3-45db-9ca5-94ce91d63b11
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Total jobs = 1
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=TimeToSubmit start=1535509426468 end=1535509426480 duration=12 from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,480 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,481 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Starting task [Stage-6:MAPREDLOCAL] in serial mode
2018-08-29 04:23:46,481 INFO  org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask: [HiveServer2-Background-Pool: Thread-138349]: Generating plan file file:/tmp/hive/dbb7f6a5-a988-466b-8170-58bbcc924fb9/hive_2018-08-29_04-23-46_256_3553242332203703514-2517/-local-10004/plan.xml
2018-08-29 04:23:46,481 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
2018-08-29 04:23:46,481 INFO  org.apache.hadoop.hive.ql.exec.Utilities: [HiveServer2-Background-Pool: Thread-138349]: Serializing MapredLocalWork via kryo
2018-08-29 04:23:46,483 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=serializePlan start=1535509426481 end=1535509426483 duration=2 from=org.apache.hadoop.hive.ql.exec.Utilities>
2018-08-29 04:23:46,532 WARN  org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Background-Pool: Thread-138349]: HiveConf of name hive.server2.idle.session.timeout_check_operation does not exist
2018-08-29 04:23:46,532 WARN  org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Background-Pool: Thread-138349]: HiveConf of name hive.sentry.conf.url does not exist
2018-08-29 04:23:46,532 WARN  org.apache.hadoop.hive.conf.HiveConf: [HiveServer2-Background-Pool: Thread-138349]: HiveConf of name hive.entity.capture.input.URI does not exist
2018-08-29 04:23:46,543 INFO  org.apache.hadoop.hdfs.DFSClient: [HiveServer2-Background-Pool: Thread-138349]: Created token for hive: HDFS_DELEGATION_TOKEN owner=hive/ip-10-197-23-43.eu-west-1.compute.internal@DOMAIN.LOCAL, renewer=hive, realUser=, issueDate=1535509426541, maxDate=1536114226541, sequenceNumber=1381844, masterKeyId=457 on ha-hdfs:hanameservice
2018-08-29 04:23:46,544 INFO  org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask: [HiveServer2-Background-Pool: Thread-138349]: Executing: /opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hadoop/bin/hadoop jar /opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/jars/hive-common-1.1.0-cdh5.13.3.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -libjars file:///opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hive/auxlib/hive-exec-1.1.0-cdh5.13.3-core.jar,file:///opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/hive/auxlib/hive-exec-core.jar  -localtask -plan file:/tmp/hive/dbb7f6a5-a988-466b-8170-58bbcc924fb9/hive_2018-08-29_04-23-46_256_3553242332203703514-2517/-local-10004/plan.xml   -jobconffile file:/tmp/hive/dbb7f6a5-a988-466b-8170-58bbcc924fb9/hive_2018-08-29_04-23-46_256_3553242332203703514-2517/-local-10005/jobconf.xml
2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: Execution failed with exit status: 1
2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: Obtaining error information
2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]:
Task failed!
Task ID:
  Stage-6

Logs:

2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.Task: [HiveServer2-Background-Pool: Thread-138349]: /var/log/hive/hadoop-cmf-CD-HIVE-nRaFPvFN-HIVESERVER2-ip-10-197-23-43.eu-west-1.compute.internal.log.out
2018-08-29 04:23:46,583 ERROR org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask: [HiveServer2-Background-Pool: Thread-138349]: Execution failed with exit status: 1
2018-08-29 04:23:46,584 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
2018-08-29 04:23:46,584 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=Driver.execute start=1535509426480 end=1535509426584 duration=104 from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,584 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-138349]: Completed executing command(queryId=hive_20180829042323_49ab08f3-b5e3-45db-9ca5-94ce91d63b11); Time taken: 0.104 seconds
2018-08-29 04:23:46,584 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,594 INFO  org.apache.hadoop.hive.ql.log.PerfLogger: [HiveServer2-Background-Pool: Thread-138349]: </PERFLOG method=releaseLocks start=1535509426584 end=1535509426594 duration=10 from=org.apache.hadoop.hive.ql.Driver>
2018-08-29 04:23:46,595 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-138349]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
        at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
        at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
        at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
        at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Is it probably related to the HiveServer memory configuration? Do I understand it correctly that this fail happens inside the HiveServer2 JVM, when it tries to build a hash-table for a map-side join?

Thanks

1 ACCEPTED SOLUTION

avatar

Problem resolved: the issue was with the overall memory consumption on the HiveServer2. 

The hive job submitted do not run in the HS2 JVM, but as an extra java process and the OS was killing it because it could not allocate such amount of memory. 

View solution in original post

1 REPLY 1

avatar

Problem resolved: the issue was with the overall memory consumption on the HiveServer2. 

The hive job submitted do not run in the HS2 JVM, but as an extra java process and the OS was killing it because it could not allocate such amount of memory.