Created on 05-18-2015 11:05 AM - edited 09-16-2022 02:29 AM
When Oozie runs a Hive action, it creates a launcher (a map-only task) that runs on some worker node in the cluster. My understanding is that the map attempt will launch Hive on that same machine, but I never see an instance of Hive starting in the list of processes running on that machine. When a Hive job is launched through Templeton, Templeton launches Hive on the same machine where its mapper was running, and I can see Hive running in the process list. Hive then becomes a client and starts submitting whatever m/r jobs it needs. With Oozie, I never see an instance of Hive spin up.
Are the Hive components loaded in some other process (perhaps inproc with the JVM running the map instance)?
I was not sure, so I am hoping someone can clarify.
Thank you
Created 05-18-2015 02:04 PM
Created 05-18-2015 02:13 PM
I was mostly just curious so thank you for the reply.
I always thought Hue submitted Oozie jobs to Oozie's REST endpoint...
I was just curious to know where the Hive client runs in the context of an Oozie Hive action, since I never saw Hive start on the worker node where Oozie's launcher (map attempt) was executing.
No biggie. Thanks
Created 05-18-2015 02:14 PM
Created 05-18-2015 02:24 PM
I decided to do a quick check on the map attempt processing one of Oozie's Hive actions:
org.apache.hadoop.mapred.YarnChild 10.0.0.9 33024 attempt_1431965085423_0015_m_000000_0 2
...and collected a quick stack trace of its "main" thread, which shows the Hive client does indeed run inproc with the JVM running the attempt:
"main" prio=10 tid=0x00007f4dbc027800 nid=0xac2c in Object.wait() [0x00007f4dc5c2b000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000fe43fa48> (a org.apache.hadoop.ipc.Client$Call)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.ipc.Client.call(Client.java:1454)
- locked <0x00000000fe43fa48> (a org.apache.hadoop.ipc.Client$Call)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy32.getJobReport(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:323)
- locked <0x00000000ff3bafd0> (a org.apache.hadoop.mapred.ClientServiceDelegate)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:422)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:575)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:603)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:601)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:601)
at org.apache.hadoop.mapred.JobClient.getJobInner(JobClient.java:611)
at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:636)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:435)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1604)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1364)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1177)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:443)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:459)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
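For anyone reading the trace above: the frames between LauncherMapper.map and HiveMain.main are java.lang.reflect.Method.invoke, so the launcher calls a main method reflectively on its own thread instead of forking a child process. That is why no separate hive process shows up in the process list. Below is a minimal sketch of that pattern, not Oozie's actual code. The names InProcessLauncherSketch, FakeCliMain and invokeMain are hypothetical stand-ins made up for illustration.

```java
import java.lang.reflect.Method;

public class InProcessLauncherSketch {

    /** Hypothetical stand-in for a tool entry point (the role CliDriver.main plays in the trace). */
    public static class FakeCliMain {
        static volatile boolean calledInProcess;
        static volatile String[] receivedArgs;

        public static void main(String[] args) {
            receivedArgs = args;
            // Walk our own stack: if the launcher's frame is below us,
            // we are running on the launcher's thread in the launcher's JVM.
            for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
                if (frame.getMethodName().equals("invokeMain")) {
                    calledInProcess = true;
                }
            }
        }
    }

    /** Looks up className.main(String[]) and invokes it reflectively; no new process is started. */
    static void invokeMain(String className, String[] args) throws Exception {
        Class<?> mainClass = Class.forName(className);
        Method main = mainClass.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not expanded as varargs.
        main.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        // The arguments here are placeholders, not real Hive options.
        invokeMain(FakeCliMain.class.getName(), new String[] {"-f", "script.q"});
        System.out.println("ran inside launcher JVM: " + FakeCliMain.calledInProcess);
    }
}
```

A thread dump of this program taken while FakeCliMain.main is running would show the same shape as the one above: the tool's frames stacked on top of Method.invoke, on top of the launcher's frames, all in one "main" thread.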
Thanks