Member since 09-24-2015
178 Posts
113 Kudos Received
28 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3366 | 05-25-2016 02:39 AM |
 | 3577 | 05-03-2016 01:27 PM |
 | 837 | 04-26-2016 07:59 PM |
 | 14347 | 03-24-2016 04:10 PM |
 | 2012 | 02-02-2016 11:50 PM |
04-26-2016
07:59 PM
@Sunile Manjee The dependencies are derived from the entity descriptions once you create those entities using Falcon (UI or CLI). For example, when you define your cluster in the cluster entity XML, you specify the name:
<cluster colo="location1" description="primaryDemoCluster" name="primaryCluster" xmlns="uri:falcon:cluster:0.1">
When you reference this cluster in a feed entity, the dependency gets created when you create the feed entity:
<feed description="Demo Input Data" name="demoEventData" xmlns="uri:falcon:feed:0.1">
<tags>externalSystem=eventData,classification=clinicalResearch</tags>
<groups>events</groups>
<frequency>minutes(3)</frequency>
<timezone>GMT+00:00</timezone>
<late-arrival cut-off="hours(4)"/>
<clusters>
<cluster name="primaryCluster" type="source">
<validity start="2015-08-10T08:00Z" end="2016-02-08T22:00Z"/>
<retention limit="days(5)" action="delete"/>
</cluster>
</clusters>
The same concept applies to process-to-feed dependencies. Take a look at this example for a working set of Falcon entities: https://github.com/sainib/hadoop-data-pipeline/tree/master/falcon
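To make the name-based link concrete, here is a minimal Python sketch that reads a feed definition and lists the cluster names it references. It only illustrates that the feed points at the cluster by its name attribute; it is not how Falcon resolves dependencies internally. The feed XML is a trimmed copy of the one above, closed so that it parses, and the function name feed_cluster_dependencies is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the feed entity shown in the post.
FEED_NS = {"f": "uri:falcon:feed:0.1"}

# Trimmed version of the feed above, closed so that it parses (assumption).
FEED_XML = """<feed description="Demo Input Data" name="demoEventData" xmlns="uri:falcon:feed:0.1">
  <clusters>
    <cluster name="primaryCluster" type="source"/>
  </clusters>
</feed>"""


def feed_cluster_dependencies(feed_xml):
    """Return (feed name, list of cluster names referenced by the feed)."""
    root = ET.fromstring(feed_xml)
    clusters = [c.get("name") for c in root.findall("f:clusters/f:cluster", FEED_NS)]
    return root.get("name"), clusters


feed_name, clusters = feed_cluster_dependencies(FEED_XML)
print(feed_name, "depends on clusters:", clusters)
```

The cluster name returned here has to match the name attribute in the cluster entity, which is why the cluster entity needs to exist before the feed is created.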
04-13-2016
05:30 PM
Try the same command after running kinit:
kinit -kt <PATH_TO_KEYTAB> <YOUR-PRINCIPAL-ID>
03-24-2016
04:10 PM
1 Kudo
@Alex Raj
So it appears you're calling a Shell action that is expected to produce some output (on the local file system or in HDFS) and you want to see that output, is that correct? Or do you actually want to capture the output (echo statements) of the script so that you can reference those values in subsequent steps of the Oozie workflow?
If it's the latter, see the response from @Benjamin Leonhardi.
If it's the former, which I believe is what you are asking, then the answer is (you won't be thrilled): it depends.
It depends on what the script is doing. I can imagine a few scenarios and will talk through them, but let us know if you are doing something different, in which case we can talk specifically about that. So here is what you MAY be doing in the script:
- writing to a local file with an absolute path
- writing to a local file with a relative path
- writing to an HDFS file with an absolute path

Writing to a local file with an absolute path -
Let's say the script does this -
touch /tmp/a.txt
In this case, the output gets created on the local filesystem of the NodeManager where the task was executed. There is really no way to tell which node that is, so you would have to check all nodes. The good thing is that you know the absolute path.

Writing to a local file with a relative path -
Let's say the script does this -
touch ./a.txt
In this case, the output gets created on the local filesystem of the NodeManager where the task was executed, but relative to the temporary working directory where the workflow's temporary files are created. There is really no way to tell which node, and we may never even see the actual file, because the temporary files are usually cleaned up after the workflow is executed. So if the file is within that subdirectory, it will most likely be deleted.

Writing to an HDFS file with an absolute path <- This is the best way to set up the program because you know where to look for the output.
Let's say the script does this -
echo "my content" >> /tmp/a.txt
hdfs dfs -put /tmp/a.txt /tmp/a.txt
In this case, the output gets created on HDFS and you know the path, so it's easy to find.
If you are not following the last approach, I would recommend it. Hope this helps.
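If the script run by the Shell action happens to be written in Python rather than bash, the third approach could look roughly like the sketch below. This is an assumption-laden illustration: the function names are hypothetical, the paths are the ones from the example above, and the upload step is skipped when the hdfs command is not on the PATH.

```python
import shutil
import subprocess


def build_put_command(local_path, hdfs_path):
    """Build the same 'hdfs dfs -put' command used in the shell example."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_path]


def write_and_publish(content, local_path, hdfs_path):
    """Append content to a local file, then copy it to a known HDFS path.

    Returns (command, return code); the return code is None when the
    hdfs client is not installed and the upload was skipped.
    """
    with open(local_path, "a") as fh:  # append, like '>>' in the shell version
        fh.write(content + "\n")
    cmd = build_put_command(local_path, hdfs_path)
    if shutil.which("hdfs") is None:
        return cmd, None
    result = subprocess.run(cmd, check=False)
    return cmd, result.returncode
```

Calling write_and_publish("my content", "/tmp/a.txt", "/tmp/a.txt") mirrors the two shell lines above. Because the HDFS destination is an absolute path chosen up front, it does not matter which node ran the task.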
02-12-2016
12:10 AM
1 Kudo
Is it possible to have multiple NFS Gateways on different nodes on a single cluster?
Labels: Apache Hadoop
02-02-2016
11:50 PM
1 Kudo
@khushi kalra As Neeraj and Artem pointed out, Apache Atlas is the right tool for managing metadata for Hadoop. Falcon is more for managing data pipelines and data workflows, which is a big part of overall data governance but is not metadata management. In addition to the links and resources provided, here is an Apache Atlas presentation video by the Governance product manager, Andrew Ahn: https://www.youtube.com/watch?v=LZR4qhKJeSI
01-22-2016
08:36 PM
@Balu Back to the original error.
01-20-2016
04:23 AM
@Balu
Some more updates. I tried a couple of things but am still having the issue.
A) Tried changing
dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn, =org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,
to
dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn, nominalTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,
B) Also added dataIn to "oozie.service.ELService.ext.functions.coord-job-submit-instances".
Still getting the error:
"Caused by: E1004 : E1004: Expression language evaluation error, Unable to evaluate :${dataIn('eventData', 'null')}:"
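For anyone checking their own property values, the difference in (A) is an entry with nothing before the equals sign. A small Python sketch like the one below can flag such entries. It assumes each entry should look like name=Class#method, as in the strings above; the function name is made up, and passing this check does not by itself resolve the E1004 error.

```python
def find_malformed_el_functions(value):
    """Return entries that are not of the form name=fully.qualified.Class#method."""
    bad = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate the trailing comma seen in the property value
        name, sep, target = entry.partition("=")
        if not sep or not name.strip() or "#" not in target:
            bad.append(entry)
    return bad


before = (
    "dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn, "
    "=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,"
)
after = (
    "dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn, "
    "nominalTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,"
)

print(find_malformed_el_functions(before))
print(find_malformed_el_functions(after))
```

The first call reports the entry with the missing function name; the second returns an empty list.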
01-20-2016
04:13 AM
@Balu Does this look okay to you? This is from the documentation - http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/configuring_oozie_for_falcon.html
01-20-2016
04:10 AM
@Balu The following 3 properties were missing in the sandbox. I think they should be there, because we don't want folks using the Sandbox to get stuck on this issue. However, the issue is not yet resolved. Previously I was missing the function "now", which got added with the properties detailed at that link, but now I am missing another function: dataIn.
Missing Properties
New Exception -
Caused by: E1004 : E1004: Expression language evaluation error, Unable to evaluate :${dataIn('eventData', 'null')}:
at org.apache.oozie.client.OozieClient.handleError(OozieClient.java:612)
01-20-2016
03:48 AM
I agree that the Coord EL function props are missing, but I wasn't sure about the steps to add them. Let me try http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_installing_manually_book/content/configuring_oozie_for_falcon.html and I will update this thread.