Member since: 10-06-2015
Posts: 273
Kudos Received: 202
Solutions: 81
My Accepted Solutions
Title | Views | Posted
---|---|---
| 4044 | 10-11-2017 09:33 PM
| 3566 | 10-11-2017 07:46 PM
| 2571 | 08-04-2017 01:37 PM
| 2215 | 08-03-2017 03:36 PM
| 2242 | 08-03-2017 12:52 PM
05-23-2017
05:20 PM
@regie canada As Matt suggested below, use the SplitContent processor to split the file into multiple, smaller flow files. The "Byte Sequence" entry for splitting would be:

####################################################################### START of Request #######################################################################

After that, use the ExtractText processor, as described in my response above, to get the first 4 lines of each flow file generated by SplitContent.
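If you want to prototype the logic outside NiFi first, here is a minimal Python sketch of what the SplitContent + ExtractText combination does: split on the request delimiter, then take the first 4 lines of each chunk. The delimiter length here is an assumption for illustration; use the exact byte sequence you configure in SplitContent.

```python
# Assumed delimiter; match it to the exact "Byte Sequence" configured in SplitContent.
DELIMITER = "#" * 71 + " START of Request " + "#" * 71

def split_and_extract(content):
    """Mimic SplitContent + ExtractText: split on the delimiter,
    then return the first 4 lines of each non-empty chunk."""
    for chunk in content.split(DELIMITER):
        chunk = chunk.strip()
        if chunk:
            yield chunk.splitlines()[:4]
```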
05-23-2017
02:03 PM
Take a look at the "Create Lineage amongst data sets" section (p. 46) in the document I linked above. It includes a detailed example.
05-22-2017
01:44 PM
2 Kudos
@Dinesh Das Atlas does not currently provide out-of-the-box integration with components outside the HDP stack. You can, however, create your own entities and use the REST API (or Kafka messages) to populate them. This will, of course, require some scripting/coding on your part to push the data via REST. Here is some documentation with examples: http://atlas.apache.org/0.7.0-incubating/AtlasTechnicalUserGuide.pdf Please note that while this documentation also applies to Atlas 0.7-0.8 (in HDP 2.5-2.6), it uses APIs that have been deprecated in the latest version (HDP 2.6) and will be removed in future ones. Still, it's enough to get you started with your implementation.
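As a rough illustration only (not the exact payload; pull that from the guide above), pushing a custom entity over REST could look something like this sketch. The host, credentials, type name, and attribute names are all placeholders, and the custom type must be registered with Atlas before you can create instances of it:

```python
import json
import requests

# Placeholder endpoint; 21000 is the default Atlas port.
ATLAS_URL = "http://atlas-host:21000/api/atlas/entities"

# Hypothetical custom entity; the type "my_custom_dataset" must be
# registered with Atlas first (see the type-creation section of the guide).
entity = {
    "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
    "typeName": "my_custom_dataset",
    "values": {
        "name": "orders_feed",
        "qualifiedName": "orders_feed@cluster1",
        "description": "Example entity pushed via REST",
    },
}

resp = requests.post(
    ATLAS_URL,
    auth=("admin", "admin"),  # assumed credentials
    headers={"Content-Type": "application/json"},
    data=json.dumps(entity),
)
resp.raise_for_status()
print(resp.json())
```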
05-22-2017
01:38 PM
1 Kudo
@vnandigam You are correct, Atlas does not currently provide lineage for Spark; this is something engineering and the community are working on. You can, however, create your own entities and use the REST API to populate them. Here is some documentation with examples: http://atlas.apache.org/0.7.0-incubating/AtlasTechnicalUserGuide.pdf Please note that while this documentation also applies to Atlas 0.7-0.8 (in HDP 2.5-2.6), it uses APIs that have been deprecated in that version and will be removed in future ones. Still, it's enough to get you started with your implementation. As always, if you find any responses here useful, don't forget to "accept" an answer.
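For context: lineage in Atlas comes from Process-type entities whose inputs/outputs reference existing dataset entities, so for Spark you would register your job as a custom Process subtype. Here is a hedged sketch of the payload shape; the type names, GUIDs, and reference format are placeholders that you should verify against the guide above for your Atlas version.

```python
# Hypothetical Process entity linking an input dataset to an output dataset.
# Replace the GUIDs with those returned when the dataset entities were created.
process = {
    "jsonClass": "org.apache.atlas.typesystem.json.InstanceSerialization$_Reference",
    "typeName": "my_spark_job",  # custom subtype of Process
    "values": {
        "name": "daily_aggregation",
        "qualifiedName": "daily_aggregation@cluster1",
        "inputs": [{"guid": "<input-dataset-guid>", "typeName": "my_dataset"}],
        "outputs": [{"guid": "<output-dataset-guid>", "typeName": "my_dataset"}],
    },
}
# POST this to the same /api/atlas/entities endpoint as any other entity.
```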
05-19-2017
07:02 PM
4 Kudos
@regie canada You can use the ExtractText processor with a regex to pull the first 4 lines. The regex would be: (.*)\n(.*)\n(.*)\n(.*) After that, you can use the SplitText processor if you want each line to be an individual flow file, or the UpdateAttribute processor to make any kind of transformation on the 4 lines.
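If you want to sanity-check that regex before wiring it into ExtractText, here is a quick Python demonstration of what it captures (the sample content is made up):

```python
import re

sample = "line1\nline2\nline3\nline4\nline5\n"

# "." does not match newlines by default, so each group captures one line.
match = re.match(r"(.*)\n(.*)\n(.*)\n(.*)", sample)
if match:
    print(match.groups())  # ('line1', 'line2', 'line3', 'line4')
```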
05-19-2017
06:34 PM
@Robin Dong Hi Robin, can you please close/delete this question? It's a duplicate of https://community.hortonworks.com/questions/103733/anyone-have-info-on-how-mongodb-do-the-sharding-wi.html Thanks!
05-19-2017
04:11 PM
1 Kudo
@Robin Dong What repo did you use to install MongoDB? Have you seen this one: https://github.com/geniuszhe/ambari-mongodb-cluster
05-18-2017
06:08 PM
@Farzaneh Poorjabar You need to enable the "Recursive" toggle for the policy to apply to child folders.
05-18-2017
03:37 PM
1 Kudo
I have a NiFi cluster consuming messages from Kafka and sending the output to a PostgreSQL database. How do I guarantee that each message will be consumed/processed exactly once?
Labels:
- Apache NiFi
05-17-2017
08:46 PM
Thanks @mqureshi. I didn't realize ExecuteSQL used a connection pool.