Oozie (latest build) - Hive action - Looking for details on how the Oozie map task actually starts Hive

New Contributor

When Oozie runs a Hive action, it creates a launcher (a map-only task) that runs on some worker node in the cluster. My understanding is that the map attempt launches Hive on that same machine, but I never see an instance of Hive starting in the list of processes running on that machine. When a Hive job is launched through Templeton, Templeton launches Hive on the same machine where its mapper is running, and I can see Hive in the process list. Hive then acts as a client and submits whatever M/R jobs it needs. With Oozie, I never see an instance of Hive spin up.

 

Are the Hive components loaded in some other process (perhaps in-process with the JVM running the map instance)?

I was not sure, so I am hoping someone can clarify.

 

Thank you

 

1 ACCEPTED SOLUTION

Super Collaborator
Hey,

The way it works is the Hue launcher job acts like the Hive CLI and then spawns the Hive MR job. So you will never see an actual instance of Hive spin up; you'll just see a launcher MR job and then a Hive MR job.

Are you having an issue or just curious?

New Contributor

I was mostly just curious so thank you for the reply.

I always thought Hue submitted Oozie jobs to Oozie's REST endpoint...

 

I was just curious to know where the Hive client runs in the context of an Oozie Hive action, since I never saw Hive start on the worker node where Oozie's launcher (map attempt) was executing.

 

No biggie.   Thanks

 

 

Super Collaborator
Great, just wanted to make sure we took care of any problems you might have been running into :-). Did that answer your question?

New Contributor

I decided to do a quick check on the map attempt processing one of Oozie's Hive actions:

 

org.apache.hadoop.mapred.YarnChild 10.0.0.9 33024 attempt_1431965085423_0015_m_000000_0 2

 

...and collected a quick stack trace, which shows the Hive client does indeed run in-process, inside the JVM running the attempt:

 

"main" prio=10 tid=0x00007f4dbc027800 nid=0xac2c in Object.wait() [0x00007f4dc5c2b000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x00000000fe43fa48> (a org.apache.hadoop.ipc.Client$Call)
    at java.lang.Object.wait(Object.java:503)
    at org.apache.hadoop.ipc.Client.call(Client.java:1454)
    - locked <0x00000000fe43fa48> (a org.apache.hadoop.ipc.Client$Call)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy32.getJobReport(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:323)
    - locked <0x00000000ff3bafd0> (a org.apache.hadoop.mapred.ClientServiceDelegate)
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:422)
    at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:575)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:603)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:601)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:601)
    at org.apache.hadoop.mapred.JobClient.getJobInner(JobClient.java:611)
    at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:636)
    at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
    at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:435)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1604)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1364)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1177)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:345)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:443)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:459)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
    at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:323)
    at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:284)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
    at org.apache.oozie.action.hadoop.HiveMain.main(HiveMain.java:66)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
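
For anyone reading this later: the frames above go from LauncherMapper.map through Method.invoke into HiveMain.main and then CliDriver.main, all on the "main" thread of the YarnChild JVM. That looks like the usual pattern of calling a main class reflectively inside the current JVM instead of forking a new process. Below is a minimal sketch of that pattern only, not Oozie's actual code. The names InProcessLauncherSketch, ToolMain and invokeMain are made up for illustration, and ToolMain merely records where it ran rather than doing any Hive work.

```java
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;

public class InProcessLauncherSketch {

    /** Hypothetical stand-in for a tool's entry point (plays the role of a CLI main class). */
    public static class ToolMain {
        public static volatile String observedJvm;
        public static volatile String observedThread;

        public static void main(String[] args) {
            // Record which JVM and thread this main() is running in.
            // The runtime MXBean name is typically "pid@hostname".
            observedJvm = ManagementFactory.getRuntimeMXBean().getName();
            observedThread = Thread.currentThread().getName();
        }
    }

    /** Loads the named class and calls its static main(String[]) in the current JVM. */
    public static void invokeMain(String mainClassName, String[] args) throws Exception {
        Class<?> mainClass = Class.forName(mainClassName);
        Method mainMethod = mainClass.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not expanded as varargs.
        mainMethod.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        // The arguments are placeholders; ToolMain ignores them.
        invokeMain(ToolMain.class.getName(), new String[] {"-f", "script.q"});

        System.out.println("launcher JVM:    " + ManagementFactory.getRuntimeMXBean().getName());
        System.out.println("tool JVM:        " + ToolMain.observedJvm);
        System.out.println("launcher thread: " + Thread.currentThread().getName());
        System.out.println("tool thread:     " + ToolMain.observedThread);
    }
}
```

Both the JVM name and the thread name come out identical for the launcher and the tool, which is consistent with the process list showing only the YarnChild process and no separate Hive process.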

 

Thanks