Capture output from Hive action and use that as input for shell action?

Rising Star

Hi all,

I was hoping someone could explain whether what I am attempting is currently possible in Oozie and, if so, how it could be done. I have seen many sources about taking the output of a shell action and feeding it into a Hive action. However, I have not seen much on whether this can be done the other way around.

My issue is that I would like to run a Hive action that captures the most recent value in a table based on the max timestamp. I would then like to pass this timestamp to a shell action, which will put it in the WHERE clause of a Sqoop extract. How would I go about passing this value from the Hive action to the shell action?

Is this possible?

Please let me know if you need any additional information, thanks in advance.

1 ACCEPTED SOLUTION

Master Guru

You can also run the Hive script in a shell/ssh action, parse the output in your shell script, and output some parameters that you then use in your Oozie flow (see my answer to this question):

https://community.hortonworks.com/questions/24182/where-is-the-output-of-an-oozie-workflow-is-stored...
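
As a rough sketch of this approach, the script below runs a query command and prints the result as a key=value line, which is the form Oozie reads from a shell action that has <capture-output/> set. The function name emit_property, the script name, and the table and column in the comment are hypothetical; the query command is passed in as arguments so the script makes no assumption about how Hive is invoked.

```shell
#!/bin/bash
# Sketch of a script for an Oozie shell action with <capture-output/>.
# Assumption: the query command is passed as arguments, for example
#   ./get_last_modified.sh hive -S -e "SELECT max(ts) FROM my_table"
# (script name, table, and column are hypothetical).

emit_property() {
  local key="$1"; shift
  local value
  # Run the remaining arguments as the query command and capture stdout.
  value="$("$@")" || return 1
  # Collapse the output onto one line: captured output is read
  # as key=value pairs, one per line.
  value="$(printf '%s' "$value" | tr '\n' ' ' | sed -E 's/[[:space:]]+$//')"
  echo "${key}=${value}"
}

# Only run when a query command was supplied on the command line.
if [ "$#" -ge 1 ]; then
  emit_property lastModified "$@"
fi
```

A later action in the workflow could then read the value with an expression such as ${wf:actionData('shell-node')['lastModified']}, where shell-node stands for whatever the shell action is actually named.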


4 REPLIES 4

Master Guru

@Daniel Perry - The easiest solution I can think of: if you can save the output of the Hive action to a file, then you can pass that file to the shell script as an argument. Write logic in your shell script to take $1 as the input file and do whatever you want with it.

Does this make sense?
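
A minimal sketch of that idea, assuming the Hive result has already been copied to a local file that holds the single timestamp value. The function name read_last_modified is made up for illustration, and the final echo stands in for whatever the script would really do with the value.

```shell
#!/bin/bash
# Sketch: read a timestamp from the file passed as the first argument ($1).
# Assumption: the file is local and holds one value on its first line.

read_last_modified() {
  local input_file="$1"
  # Fail early if the file is missing or empty.
  [ -s "$input_file" ] || { echo "no data in $input_file" >&2; return 1; }
  # Take the first line and trim surrounding whitespace.
  head -n 1 "$input_file" | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//'
}

# Run only when a file argument is supplied.
if [ "$#" -ge 1 ]; then
  last_modified="$(read_last_modified "$1")" || exit 1
  echo "Value for the Sqoop WHERE clause: $last_modified"
fi
```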

Rising Star

I ended up going with your approach, Ben, as it suited what I was trying to do a bit better, and after much fiddling around I managed to get it working. However, I am getting the value from my query back like this:

lastModified=+------------------------+--+ | 2016-03-31 21:59:57.0 | +------------------------+--+

Whereas all I really want is the date value, not the extra jargon. Is this something I can use a regex to get rid of?

Thanks

Master Guru

@Daniel Perry

The regex below should get the date out of there:

[0-9]+(.*)[0-9]+

Now, how to do that? Below is some pseudocode based on something similar I once did. It takes a lot of trickery with the different quotation marks (notice how `` is used to execute a command and "" to denote a string). You also have to escape characters a lot, so it's not finished and you may have to play around with it a bit, but it should work.

myvar=+----..........2016 .... ;

datevar="`echo "$myvar" | sed 'myregex'`"
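
For reference, here is one way the pseudocode above might be filled in, using the sample value from the earlier post and the same regex. It swaps sed for grep -o, which prints only the part of the line that matches; the function name extract_timestamp is made up for illustration.

```shell
#!/bin/bash
# Sample value copied from the post above.
myvar='+------------------------+--+ | 2016-03-31 21:59:57.0 | +------------------------+--+'

extract_timestamp() {
  # -o prints only the matched text, -E enables extended regex.
  # The match runs from the first digit to the last digit on the line.
  printf '%s\n' "$1" | grep -oE '[0-9]+(.*)[0-9]+' | head -n 1
}

datevar="$(extract_timestamp "$myvar")"
echo "lastModified=${datevar}"
```

If the query is run through Beeline, options such as --showHeader=false and --outputformat=tsv2 may avoid the table borders in the first place, so less cleanup would be needed.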

The second possibility is to do that in Oozie. Oozie supports the JSTL expression language functions, so you could just use substring, since the length of the text before and after the value always seems to be the same.

http://beginnersbook.com/2013/12/jstl-substring-substringafter-substringbefore-functions/