
Hive query stops with Error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.exe"


New Contributor

I'm using Hive (on YARN) installed from CDH-5.14.2-1, and I created a database that stores purchase history. One table in it, which holds the purchase history, has 1,000,000,000 rows.

I tried the following query to measure Hive's performance.

 

SELECT c.gender, 
       g.NAME, 
       i.NAME, 
       Sum(b.num) 
FROM   customers c 
       JOIN boughts_bil b 
         ON ( c.id = b.cus_id 
              AND b.id < $var ) 
       JOIN items i 
         ON ( i.id = b.item_id ) 
       JOIN genres g 
         ON ( g.id = i.gen_id ) 
GROUP  BY c.gender, 
          g.NAME, 
          i.NAME; 

Incidentally, since I want to test without any optimization, I did not create any partitions.

 

When I set $var to 30,000,000, the query fails with the error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.exe". I ran the same query before, and at that time it worked fine.

 

The Cloudera plan was Express when the query worked, but the plan has since become Enterprise-only. Could that be the cause?

Or could there be a different reason, for example an out-of-memory error?

 

Any advice would be appreciated.

 

Thanks.

 

Addition:

I checked the HistoryServer, and it shows the following:

 

Diagnostics: 
Application failed due to failed ApplicationMaster.
Only partial information is available; some values may be inaccurate.

I'll check the table value.


Re: Hive query stops with Error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.

Guru
Did it fail on the MapReduce side? We need to collect the YARN application logs to find the exact message. Have you tried running:

yarn logs -applicationId {application_id} -appOwner {username}

to collect the log and examine the output?
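For a large job the collected log can be very long, so it may help to filter it. Here is a minimal Python sketch, assuming the `yarn` CLI is on the PATH. The helper names `collect_yarn_logs` and `filter_diagnostics` and the marker list are illustrative assumptions, not part of any Hadoop tooling.

```python
import subprocess

# Substrings that often accompany failures in container logs.
# This list is an assumption; extend it for your cluster.
MARKERS = ("ERROR", "FATAL", "Exception", "exit code")


def collect_yarn_logs(application_id, owner):
    """Run `yarn logs` for one application and return the log text."""
    cmd = ["yarn", "logs", "-applicationId", application_id, "-appOwner", owner]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def filter_diagnostics(log_text, markers=MARKERS):
    """Keep only the lines that contain one of the failure markers."""
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line for marker in markers)
    ]
```

Calling `filter_diagnostics(collect_yarn_logs(app_id, user))` would then list only the suspicious lines, which is usually a quicker starting point than reading the whole log.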

Re: Hive query stops with Error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.

New Contributor

Thanks for replying, and sorry that I accidentally clicked "accept as solution".

 

Here are the output of a run and the YARN log.

The run output is below:

Query ID = ..._20180813111111_92d8a1f2-4614-49c6-8833-d7b2e709c79c
Total jobs = 2
Stage-1 is selected by condition resolver.
Launching Job 1 out of 2
Starting Job = job_1534123434864_0480, Tracking URL = http://...:8088/proxy/application_1534123434864_0480/
Kill Command = /.../hadoop job  -kill job_1534123434864_0480
Hadoop job information for Stage-1: number of mappers: 140; number of reducers: 557
2018-08-13 11:11:49,795 Stage-1 map = 0%,  reduce = 0%
...
2018-08-13 11:15:45,128 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 3475.74 sec
MapReduce Total cumulative CPU time: 57 minutes 55 seconds 740 msec
Ended Job = job_1534123434864_0480
Execution log at: /.../..._20180813111111_92d8a1f2-4614-49c6-8833-d7b2e709c79c.log
2018-08-13 11:15:51 Starting to launch local task to process map join; maximum memory = 1908932608
2018-08-13 11:15:52 Dump the side-table for tag: 1 with group count: 24 into file: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile01--.hashtable
2018-08-13 11:15:52 Uploaded 1 File to: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile01--.hashtable (902 bytes)
2018-08-13 11:15:52 Dump the side-table for tag: 1 with group count: 3500 into file: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile11--.hashtable
2018-08-13 11:15:52 Uploaded 1 File to: file:/.../c33533aa-7637-4034-a3d1-2e8b857c2820/hive_2018-08-13_11-11-38_070_2752807246292956243-1/-local-10006/HashTable-Stage-4/MapJoin-mapfile11--.hashtable (107794 bytes)
2018-08-13 11:15:52 End of local task; Time Taken: 1.54 sec.
Execution completed successfully
MapredLocal task succeeded
Launching Job 2 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
Starting Job = job_1534123434864_0536, Tracking URL = http://...:8088/proxy/application_1534123434864_0536/
Kill Command = /.../hadoop job -kill job_1534123434864_0536
Hadoop job information for Stage-4: number of mappers: 4; number of reducers: 1
2018-08-13 11:16:23,048 Stage-4 map = 0%, reduce = 0%
2018-08-13 11:16:44,240 Stage-4 map = 25%, reduce = 0%, Cumulative CPU 2.28 sec
2018-08-13 11:16:46,330 Stage-4 map = 50%, reduce = 0%, Cumulative CPU 5.06 sec
2018-08-13 11:16:49,473 Stage-4 map = 75%, reduce = 0%, Cumulative CPU 9.58 sec
2018-08-13 11:16:50,520 Stage-4 map = 100%, reduce = 0%, Cumulative CPU 15.14 sec
2018-08-13 11:17:12,471 Stage-4 map = 0%, reduce = 0%
2018-08-13 11:17:42,680 Stage-4 map = 25%, reduce = 0%, Cumulative CPU 2.2 sec
2018-08-13 11:17:44,779 Stage-4 map = 50%, reduce = 0%, Cumulative CPU 5.25 sec
2018-08-13 11:17:46,873 Stage-4 map = 100%, reduce = 0%, Cumulative CPU 15.0 sec
2018-08-13 11:18:12,006 Stage-4 map = 0%, reduce = 0%
MapReduce Total cumulative CPU time: 15 seconds 0 msec
Ended Job = job_1534123434864_0536 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 140 Reduce: 557 Cumulative CPU: 3475.74 sec HDFS Read: 37355213704 HDFS Write: 56143 SUCCESS
Stage-Stage-4: Map: 4 Reduce: 1 Cumulative CPU: 15.0 sec HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 58 minutes 10 seconds 740 msec
WARN: The method class org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html#release for an explanation.

I checked 

yarn logs -applicationId application_1534123434864_0480

There are several kinds of errors in container_1534123434864_0480_02_000001:

(1)ERROR [RMCommunicator Allocator] 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: 
Container complete event for unknown container container_1534123434864_0480_02_000143


(2)INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1534123434864_0480_r_000014_1000: 
Container killed on request. Exit code is 137
Container exited with a non-zero exit code 137
Killed by external signal

(3)INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
Diagnostics report from attempt_1534123434864_0480_r_000041_1000:
Container exited with a non-zero exit code 154

(4)ERROR [ContainerLauncher #1] 
org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: 
Container launch failed for container_1534123434864_0480_02_000241 : 
java.io.IOException: Failed on local exception: java.io.IOException: java.io.IOException: 
Connection reset from partner; Host Details : local host is: "node3"; destination host is: "node2":8041; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1508)
    at org.apache.hadoop.ipc.Client.call(Client.java:1441)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy40.startContainers(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy41.startContainers(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:155)
    at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:379)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.io.IOException: Connection reset from partner
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:718)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:681)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:769)
    at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
    at org.apache.hadoop.ipc.Client.call(Client.java:1480)
    ... 15 more
Caused by: java.io.IOException: Connection reset from partner
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:197)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:370)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:594)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:396)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:761)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:757)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
    ... 18 more
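As a side note on the exit codes in (2) and (3): on Linux, an exit code above 128 conventionally means the process was terminated by signal number (code - 128), so 137 corresponds to signal 9 (SIGKILL). The Python sketch below only applies that convention. `describe_exit_code` is a hypothetical helper, and YARN and its container executor may also use codes of their own (154 may be one of them), so treat the result as a hint, not a diagnosis.

```python
import signal


def describe_exit_code(code):
    """Describe an exit code using the shell convention 128 + signal number."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The number does not map to a signal known on this platform.
            name = "unknown signal"
        return f"exit code {code}: terminated by signal {signum} ({name})"
    return f"exit code {code}: process exited on its own"


for code in (137, 154):
    print(describe_exit_code(code))
```

A SIGKILL on a reducer does not by itself prove an out-of-memory problem. It can also come from an ordinary kill request, for example when a duplicate task attempt is no longer needed.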

 

Re: Hive query stops with Error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.

Guru
Hmm, job job_1534123434864_0480 finished successfully. I think you should check the log for job job_1534123434864_0536 instead.
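To avoid picking the wrong job by hand, the failed one can be read out of the Hive console output. Here is a small Python sketch, assuming the output contains lines of the form "Ended Job = job_... with errors" as in the run posted above. `failed_application_ids` is a hypothetical helper name. It relies on the fact that a MapReduce job ID and its YARN application ID share the same numeric suffix.

```python
import re


def failed_application_ids(hive_output):
    """Return YARN application IDs for jobs Hive reported as ended with errors."""
    job_ids = re.findall(r"Ended Job = (job_\d+_\d+) with errors", hive_output)
    # job_<cluster timestamp>_<sequence> maps to application_<cluster timestamp>_<sequence>.
    return [job_id.replace("job_", "application_", 1) for job_id in job_ids]


sample = (
    "Ended Job = job_1534123434864_0480\n"
    "Ended Job = job_1534123434864_0536 with errors\n"
)
for app_id in failed_application_ids(sample):
    print("yarn logs -applicationId", app_id)
```

The printed command can then be run with the appropriate `-appOwner` value to fetch the logs of the job that actually failed.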

Re: Hive query stops with Error "Execution Error, return code 2 from org.apache.hadoop.hive.ql.

New Contributor

I couldn't find any errors in job job_1534123434864_0536, so I uninstalled and reinstalled Cloudera Manager, and now it works well.

Thanks for helping.