Member since
06-20-2016
488
Posts
433
Kudos Received
118
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 3120 | 08-25-2017 03:09 PM |
 | 1974 | 08-22-2017 06:52 PM |
 | 3425 | 08-09-2017 01:10 PM |
 | 8093 | 08-04-2017 02:34 PM |
 | 8129 | 08-01-2017 11:35 AM |
12-05-2016
09:05 PM
This has a bit more of what you are looking for: http://www.slideshare.net/hortonworks/data-governance-atlas-7122015 But to get down to the technical details you seek ... you will have to go to the source code: https://github.com/apache/incubator-atlas
12-05-2016
08:24 PM
Is there ever a possibility that data in HDFS gets written to a master node log (e.g. via YARN, Oozie, ZooKeeper) or to another area of disk? I am asking because of strict security requirements: we need to know everywhere that sensitive HDFS data may end up.
Labels: Apache Hadoop
12-05-2016
01:21 PM
1 Kudo
This resource gives you an overview of the Atlas architecture (section 1.2) and the RESTful APIs (section 6): https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_data-governance/content/ch_hdp_data_governance_overview.html
These are tutorials: http://hortonworks.com/apache/atlas/#tutorials
[UPDATE DEC 9] Here is an Atlas user's guide that provides good details: http://atlas.incubator.apache.org/AtlasTechnicalUserGuide.pdf (Not sure if this is what you are referring to in your question.)
12-05-2016
12:41 PM
1 Kudo
This will guide you through installing NiFi via Ambari (installs HDF including NiFi). https://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.0.1/bk_ambari-installation/content/ch_hdf-ambari-deployment.html
12-02-2016
03:06 PM
2 Kudos
There are a few ways to go about this.

1. Native monitoring
The Bulletin Board shows all processors with WARNING and ERROR alerts; double-click an entry to go to the actual processor. The status bar gives the overall number of running and alerted components, and each processor gives metrics and history via Status History.
https://docs.hortonworks.com/HDPDocuments/HDF1/HDF-1.2/bk_HDF_GettingStarted/content/monitoring-nifi.html

2. Reporting Tasks (Push)
Reporting Tasks run in the background to provide statistical reports about what is happening in the NiFi instance. They are configured in the UI (see link below) by opening the upper-right dropdown and clicking Controller Settings, then Reporting Tasks.
https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Reporting_Tasks
There are currently 7 types of reporting tasks:
AmbariReportingTask: Publishes metrics from NiFi to Ambari Metrics Service (AMS).
ControllerStatusReportingTask: Logs the 5-minute stats that are shown in the NiFi Summary Page for Processors and Connections, and optionally logs the deltas between the previous iteration and the current iteration.
DataDogReportingTask: Publishes metrics from NiFi to Datadog.
MonitorDiskUsage: Checks the amount of storage space available for the specified directory and warns (via a log message and a System-Level Bulletin) if the partition on which it lives exceeds some configurable threshold of storage space.
MonitorMemory: Checks the amount of Java Heap available in the JVM for a particular JVM Memory Pool. If the amount of space used exceeds some configurable threshold, it warns (via a log message and a System-Level Bulletin) that the memory pool is exceeding this threshold.
SiteToSiteProvenanceReportingTask: Publishes Provenance events using the Site To Site protocol.
StandardGangliaReporter: Reports metrics to Ganglia so that Ganglia can be used for external monitoring of the application.

3. REST API (Pull)
Build your own monitoring via the REST API.
https://nifi.apache.org/docs/nifi-docs/rest-api/

4. Misc
MonitorActivity processor: Indicates when the flow has not received data after a specified period of time, and again when activity is restored.
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MonitorActivity/index.html
Dedicated process groups to send alerts: http://ec2-52-89-10-2.us-west-2.compute.amazonaws.com/questions/54135/apache-nifi-how-to-productionize-dataflows-in-nifi.html
Custom logging: https://community.hortonworks.com/articles/65027/nifi-easy-custom-logging-of-diverse-sources-in-mer.html
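As a rough illustration of the pull approach in option 3, the sketch below polls the controller status endpoint and reduces the response to a few numbers. It assumes an unsecured NiFi 1.x instance; the host and port, the `max_queued` threshold, and the function names are hypothetical, and the JSON field names should be checked against the REST API docs for your version.

```python
import json
from urllib.request import urlopen

# Assumption: an unsecured NiFi instance at this hypothetical address.
NIFI_API = "http://localhost:8080/nifi-api"


def fetch_status(base_url=NIFI_API, timeout=10):
    """GET <base_url>/flow/status and return the parsed JSON document."""
    with urlopen(f"{base_url}/flow/status", timeout=timeout) as resp:
        return json.load(resp)


def summarize_status(payload, max_queued=10000):
    """Reduce a /flow/status response to a few numbers plus an alert flag.

    max_queued is an illustrative threshold, not a NiFi setting.
    """
    status = payload.get("controllerStatus", {})
    queued = status.get("flowFilesQueued", 0)
    return {
        "active_threads": status.get("activeThreadCount", 0),
        "flowfiles_queued": queued,
        "bytes_queued": status.get("bytesQueued", 0),
        "alert": queued > max_queued,
    }


# Offline example with a hand-written payload shaped like the response.
sample = {
    "controllerStatus": {
        "activeThreadCount": 3,
        "flowFilesQueued": 12500,
        "bytesQueued": 4096,
    }
}
summary = summarize_status(sample)
```

Against a live instance, `summarize_status(fetch_status())` could be called from a scheduler and the `alert` flag forwarded to whatever alerting system you already use.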
12-02-2016
01:05 PM
These explain flowfiles:
[basic] http://docs.hortonworks.com/HDPDocuments/HDF1/HDF-1.1.1/bk_UserGuide/content/terminology.html
[details/flowfile lifecycle] https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#DeeperView
This is a good overview of NiFi: https://nifi.apache.org/docs/nifi-docs/html/getting-started.html
Each processor follows the same idea of flowfile = content + attributes. Following the links to each processor shows the more specialized behavior of that processor in operating on flowfiles (which are passed from one processor to the next via connections).
12-02-2016
12:04 PM
Please note that EvaluateJsonPath extracts from the flowfile content and writes the result to either content or attributes, depending on whether you configure the Destination property as flowfile-content or flowfile-attribute. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/index.html If you want to extract the headers from the response of InvokeHTTP, you will use EvaluateJsonPath as described above, since the HTTP response header will be part of the JSON returned and thus part of the flowfile content. The following post may be helpful: https://community.hortonworks.com/questions/21011/how-i-extract-attribute-from-json-file-using-nifi.html If this answers your question, let me know by accepting the answer; otherwise, let me know of any gaps or follow-up questions.
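To make the content-to-attribute idea concrete, here is a small Python model of what EvaluateJsonPath does when Destination is flowfile-attribute. It is only an analogy, not NiFi code: the flowfile is modeled as a dict, the helper name, attribute names, and sample JSON are hypothetical, and only simple `$.a.b` paths are handled.

```python
import json


def evaluate_json_path(flowfile, attribute_paths):
    """Copy values found in the JSON content into new attributes.

    flowfile: {"content": <JSON string>, "attributes": {...}}
    attribute_paths: {attribute name: simple JSONPath such as "$.a.b"}
    The content is left untouched, as with Destination = flowfile-attribute.
    """
    document = json.loads(flowfile["content"])
    attributes = dict(flowfile["attributes"])
    for name, path in attribute_paths.items():
        trimmed = path[2:] if path.startswith("$.") else path
        value = document
        for key in trimmed.split("."):
            value = value[key]  # walk one level down per path segment
        # Attributes are strings, so non-string values are serialized.
        attributes[name] = value if isinstance(value, str) else json.dumps(value)
    return {"content": flowfile["content"], "attributes": attributes}


# Hypothetical flowfile whose content is a JSON response body.
incoming = {
    "content": '{"status": {"code": 200}, "user": "alice"}',
    "attributes": {"filename": "response.json"},
}
result = evaluate_json_path(
    incoming, {"http.code": "$.status.code", "user.name": "$.user"}
)
```

After the call, `result["attributes"]` holds the original `filename` plus the two extracted values, while `result["content"]` is unchanged.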
12-02-2016
11:33 AM
Not sure specifically what you are looking for, but this shows the capabilities added in each version of Atlas: http://hortonworks.com/apache/atlas/#section_4
12-01-2016
04:00 PM
Same answer: since z2 is a bag, you need to flatten it to a tuple before you can do a DISTINCT on it. For the data you are showing: z3 = FOREACH z2 GENERATE FLATTEN(BagToTuple($0)); z4 = DISTINCT z3; The link gives the detailed explanation of why this is required.
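The flatten-then-distinct step can be pictured with a rough Python analogy. This is not Pig: the sample rows are made up, and `bag_to_tuple` is a hypothetical stand-in for Pig's BagToTuple followed by FLATTEN.

```python
def bag_to_tuple(bag):
    """Un-nest a bag (list of tuples) into one flat tuple of fields."""
    return tuple(field for tup in bag for field in tup)


# Hypothetical z2: each row holds a single field, and that field is a bag.
z2 = [
    ([("a",), ("b",)],),
    ([("a",), ("b",)],),
    ([("c",)],),
]

# z3: one flat row per input row, with the bag's elements as plain fields.
z3 = [bag_to_tuple(row[0]) for row in z2]

# z4: duplicate rows removed, which only works once the rows are flat.
z4 = sorted(set(z3))
```

In this model the two identical bags collapse into one row once they have been flattened, which is the effect the DISTINCT is meant to have.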
12-01-2016
02:44 PM
That is a good question. I would post it as a separate question to get the full attention of the NiFi experts. I think you could simplify the question and the requirements by stating that you need to use NiFi to call a Java program that calls a stored procedure and puts the results in HDFS. I believe this is possible, but there may be some points to consider.