- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Oozie (latest build) - Hive action - Looking for details on how oozie map task actually starts hive
- Labels:
-
Apache Oozie
Created on ‎05-18-2015 11:05 AM - edited ‎09-16-2022 02:29 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
When Oozie runs a hive action it creates a launcher (map only task ) that runs on some worker node in the cluster. My understanding, is that map attempt will launch hive on that same machine, but I never see an instance of hive starting in the list of processes running on that machine. When a hive job is launched through templeteon, templeton will launch hive on the same machine its mapper was running and I can see see hive running in the process list. Hive will then become a client and start submitting whatever m/r jobs it needs. With Oozie, I never see an instance of hive spin up.
Are the hive components loaded in some other process ? (perhaps inproc with the jvm running the map instance ?)
I was not sure so hoping someone could clarify.
Thank you
Created ‎05-18-2015 02:04 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The way it works is the Hue launcher job acts like the Hive CLI and then spawns the Hive MR job. So you will never see an actual instance of hive spin up, you'll just see a launcher MR job and then a hive MR job.
Are you having an issue or just curious?
Created ‎05-18-2015 02:04 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The way it works is the Hue launcher job acts like the Hive CLI and then spawns the Hive MR job. So you will never see an actual instance of hive spin up, you'll just see a launcher MR job and then a hive MR job.
Are you having an issue or just curious?
Created ‎05-18-2015 02:13 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I was mostly just curious so thank you for the reply.
I always thought Hue submitted Oozie jobs to Oozie's REST endpoint...
I was just curious to know where the hive client runs in the context of an oozie hive action since I never saw hive start on the workernode where oozie's launcher (map attempt) was executing.
No biggie. Thanks
Created ‎05-18-2015 02:14 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created ‎05-18-2015 02:24 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I decided to do a quick check on the attempt processing one of oozie's hive actions:
org.apache.hadoop.mapred.YarnChild 10.0.0.9 33024 attempt_1431965085423_0015_m_000000_0 2
...and collected a quick stack trace which shows the hive client does indeed run inproc with the JVM running the attempt:
"main" prio=10 tid=0x00007f4dbc027800 nid=0xac2c in Object.wait() [0x00007f4dc5c2b000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000fe43fa48> (a org.apache.hadoop.ipc.Client$Call)
at java.lang.Object.wait(Object.java:503)
at org.apache.hadoop.ipc.Client.call(Client.java:1454)
- locked <0x00000000fe43fa48> (a org.apache.hadoop.ipc.Client$Call)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy32.getJobReport(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:323)
- locked <0x00000000ff3bafd0> (a org.apache.hadoop.mapred.ClientServiceDelegate)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:422)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:575)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:603)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:601)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:601)
at org.apache.hadoop.mapred.JobClient.getJobInner(JobClient.java:611)
at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:636)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:435)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1604)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1364)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1177)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:443)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:459)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Thanks
