06-06-2017
05:43 AM
Duplicate of https://stackoverflow.com/questions/44154474/how-to-catch-oozie-spark-output/ and also of https://stackoverflow.com/questions/44171000/capture-console-output-of-spark-action-node-in-oozie-as-variable-across-the-oozi
06-06-2017
05:38 AM
The logs of the YARN services (RM, NM) are irrelevant. You must inspect the logs of the YARN job in the HistoryServer. Note that job_1496499143480_0003 uses the legacy naming convention (pre-YARN); the actual YARN job ID is application_1496499143480_0003.
Option 1: open the YARN UI and look for that ID in the dashboard of recent jobs.
Option 2: use the CLI, e.g.
yarn application -status application_1496499143480_0003
yarn logs -applicationId application_1496499143480_0003 | more
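If you do this often, the job-to-application ID renaming described above can be scripted. The following is only a sketch: the helper name job_to_app_id is made up for illustration, and it assumes the ID differs only in its prefix, as in the example above.

```shell
# Hypothetical helper (not part of any Hadoop tool): map a legacy
# job_<timestamp>_<sequence> ID to application_<timestamp>_<sequence>.
job_to_app_id() {
  local job_id="$1"
  case "$job_id" in
    job_*)         printf 'application_%s\n' "${job_id#job_}" ;;  # swap the prefix
    application_*) printf '%s\n' "$job_id" ;;                     # already a YARN ID
    *)             echo "unrecognized ID: $job_id" >&2; return 1 ;;
  esac
}

# Usage (needs the yarn client on the PATH):
#   app_id="$(job_to_app_id job_1496499143480_0003)"
#   yarn application -status "$app_id"
#   yarn logs -applicationId "$app_id" | more
```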
04-18-2017
08:10 AM
Oozie has a control structure, named "Fork Join", to run multiple Actions in parallel. It looks like exactly what you need (provided the number of Actions is fixed and the arguments are hard-coded in the Workflow). See, for example, section 5.0 "Fork-Join controls" of the "Hooked for Hadoop" tutorial: http://hadooped.blogspot.com/2013/07/apache-oozie-part-9a-coordinator-jobs.html
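To give an idea of the shape, here is a rough sketch that writes a minimal Workflow with a Fork and a Join. It is not taken from the tutorial: the node names (fork-node, action-a, action-b, join-node), the ${nameNode} property and the placeholder fs actions are all assumptions, to be replaced with your real Actions and hard-coded arguments.

```shell
# Hypothetical helper: write a skeleton Oozie workflow with two parallel branches.
write_fork_join_workflow() {
  # The quoted 'EOF' keeps ${...} literal so Oozie, not the shell, resolves it.
  cat > "$1" <<'EOF'
<workflow-app name="fork-join-sketch" xmlns="uri:oozie:workflow:0.5">
    <start to="fork-node"/>
    <!-- Both paths are started in parallel -->
    <fork name="fork-node">
        <path start="action-a"/>
        <path start="action-b"/>
    </fork>
    <!-- Placeholder actions; replace with your real ones -->
    <action name="action-a">
        <fs><mkdir path="${nameNode}/tmp/fork-join-sketch/a"/></fs>
        <ok to="join-node"/>
        <error to="fail"/>
    </action>
    <action name="action-b">
        <fs><mkdir path="${nameNode}/tmp/fork-join-sketch/b"/></fs>
        <ok to="join-node"/>
        <error to="fail"/>
    </action>
    <!-- The join waits for every forked path before moving on -->
    <join name="join-node" to="end"/>
    <kill name="fail">
        <message>Failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF
}

write_fork_join_workflow "workflow.xml"
```

Each forked path must end up at the same Join node, which is why both placeholder actions transition to join-node on success.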