Member since
07-30-2019
333
Posts
356
Kudos Received
76
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 6439 | 02-17-2017 10:58 PM |
 | 1255 | 02-16-2017 07:55 PM |
 | 5326 | 12-21-2016 06:24 PM |
 | 980 | 12-20-2016 01:29 PM |
 | 651 | 12-16-2016 01:21 PM |
02-24-2017
04:28 PM
Hi Akash, MiNiFi runs as a small process on that server and sends data, e.g. to a larger NiFi cluster, over the site-to-site protocol. The toolkit takes care of converting a template you create visually in NiFi into something that MiNiFi understands. How do you pull the data, and what kind of data is it? That determines whether you need to run the MiNiFi process on that server or can run it elsewhere and have multiple servers send data to it.
02-17-2017
10:58 PM
1 Kudo
GetFile will delete the processed file by default. Check the Keep Source File setting, for example. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.GetFile/index.html A better alternative for processing files without touching them is a combination of ListFile/FetchFile processors. These will maintain the state internally in NiFi to track the processed files. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ListFile/index.html
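To illustrate the idea of listing with remembered state, here is a minimal Python sketch. It is not how ListFile is implemented; the function name `list_new_files` and the `state` dict are hypothetical stand-ins, and the real processor's bookkeeping is more involved. It only shows the principle: remember the newest modification time seen so far and report files newer than that, without deleting or moving anything.

```python
import os

def list_new_files(directory, state):
    """Return paths of files modified after the last recorded timestamp.

    `state` is a plain dict standing in for the processor's stored state
    (an assumption for this sketch). Source files are never modified.
    """
    last_seen = state.get("last_mtime", 0.0)
    newest = last_seen
    new_files = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        mtime = os.path.getmtime(path)
        if mtime > last_seen:
            new_files.append(path)
            newest = max(newest, mtime)
    # Remember how far we got so the next listing skips these files.
    state["last_mtime"] = newest
    return new_files
```

Calling it twice in a row on an unchanged directory returns the files once and then an empty list, while the files themselves stay in place.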
02-16-2017
07:55 PM
3 Kudos
Hi Jobin, Backpressure should be treated as a threshold, not an absolute barrier. Due to scheduling semantics and internal micro-batching in some processors (e.g. UpdateAttribute), the number of objects (or their total size) can go over the set limit. Backpressure will still engage and will not turn off until the level falls below the threshold, but in general you shouldn't worry too much about the number going slightly above. This is merely due to internal implementation details.
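A toy simulation may make the threshold behaviour easier to picture. This is only a sketch under simplifying assumptions (the function `run_upstream` and its parameters are made up for illustration, and NiFi's real scheduler is far more complex): the limit is checked once before each run, and a run that is allowed to start emits its whole batch.

```python
from collections import deque

def run_upstream(queue, threshold, batch_size, rounds):
    """Simulate an upstream processor feeding a queue with a backpressure threshold.

    The threshold is only consulted before a run starts; once a run starts,
    the whole batch is enqueued, so the queue can overshoot the limit.
    """
    for _ in range(rounds):
        if len(queue) >= threshold:
            # Backpressure engaged: the upstream processor is not scheduled.
            continue
        queue.extend(range(batch_size))
    return len(queue)

queue = deque()
final_size = run_upstream(queue, threshold=10, batch_size=4, rounds=10)
print(final_size)  # 12: the queue went 4, 8, 12 and then backpressure held it there
```

The queue ends slightly above the limit of 10 and stays there until something downstream drains it below the threshold.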
01-30-2017
05:12 PM
1 Kudo
Tim, It looks like the regex you're using might need more work to account for optional matches. The ReplaceText processor is complaining about missing match groups and has no context to do the replacement.
01-03-2017
04:06 PM
Make sure you split the data using the SplitJson processor in NiFi before sending it to Splunk. The reason is that the syslog receiver may bundle incoming messages based on the network setup, but it knows nothing about the actual data format, such as JSON.
12-21-2016
06:49 PM
1 Kudo
You removed all details from the configuration. Do you have the 'temp' user? Which user do you use for SFTP login? Normally you wouldn't have access to other users' home directories. Try using a directory outside of /home and set correct permissions on it.
12-21-2016
06:24 PM
2 Kudos
Setting the remote file to /home/temp/ is wrong. It expects a filename, not a directory. A combination of ListSFTP and FetchSFTP will ensure the filenames are passed to FetchSFTP via a known attribute name. Here's an example template from the gallery: https://github.com/hortonworks-gallery/nifi-templates/blob/master/templates/List_and_Fetch_SFTP_template.xml
12-20-2016
01:29 PM
2 Kudos
Yes, use the MonitorActivity processor: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MonitorActivity/index.html
12-16-2016
01:21 PM
1 Kudo
You need to explicitly allow the reflect function; it's blacklisted by default. See e.g. this discussion: https://community.hortonworks.com/questions/25828/udf-reflect-is-not-allowed-beeline.html
12-15-2016
06:52 PM
1 Kudo
ExecuteProcess doesn't take input. It is a source of data (e.g. from that process). What you're looking for is ExecuteStreamCommand, which allows passing in a flowfile and evaluating attributes dynamically: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
12-13-2016
08:51 PM
3 Kudos
David, it's here: http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.1.0/bk_dataflow-command-line-installation/content/hdf_isg_hardware.html Although I don't think guidelines have changed.
12-10-2016
03:06 PM
1 Kudo
A new FlowFile will be created, BUT both will point to the same immutable piece of data in the Content Repository. The HashContent step in your example will have replaced the content, but the result will be a new FlowFile pointing to a new piece of data in the Content Repository. The other branch of the flow is not affected in any way by this content change. Read more here, for example: https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#pass-by-reference
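Here is a small Python model of the pass-by-reference idea, in case it helps. It is a sketch only: the classes `ContentRepository` and `FlowFile` and the helper `replace_content` are hypothetical names for illustration, not NiFi's actual API. Content is written once under a claim id and never changed; a clone shares the claim, and a step that changes content writes a new claim instead of touching the old one.

```python
import itertools

class ContentRepository:
    """Append-only store: each write gets a new claim id and is never modified."""
    def __init__(self):
        self._claims = {}
        self._ids = itertools.count(1)

    def write(self, data: bytes) -> int:
        claim_id = next(self._ids)
        self._claims[claim_id] = bytes(data)
        return claim_id

    def read(self, claim_id: int) -> bytes:
        return self._claims[claim_id]

class FlowFile:
    """Attributes plus a pointer (claim id) to the content, not the content itself."""
    def __init__(self, claim_id, attributes=None):
        self.claim_id = claim_id
        self.attributes = dict(attributes or {})

    def clone(self):
        # The clone points at the same content claim; no bytes are copied.
        return FlowFile(self.claim_id, self.attributes)

def replace_content(repo, flowfile, new_data: bytes) -> FlowFile:
    """A content-modifying step: returns a new FlowFile pointing at a new claim."""
    return FlowFile(repo.write(new_data), flowfile.attributes)

repo = ContentRepository()
original = FlowFile(repo.write(b"payload"))
branch_a, branch_b = original.clone(), original.clone()
modified = replace_content(repo, branch_a, b"changed")
print(repo.read(branch_b.claim_id))  # b'payload': the other branch is unaffected
```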
12-09-2016
01:53 PM
1 Kudo
Have you tried adding this 'set' statement as the first line of the query, terminated by a semicolon and a newline? Next, try quoting the column name, too.
12-08-2016
01:13 PM
3 Kudos
It doesn't affect scheduling; it's a normal (failure) path for your data. When you auto-terminate a relationship, it means that a FlowFile routed there is dropped and finishes its life in the flow (but it still remains for some time in the provenance/content repositories for history).
12-08-2016
01:07 PM
4 Kudos
Site-to-site is much more versatile than just multi-DC communication (though that is a great use case for this NiFi feature). S2S can link multiple clusters (or standalone instances), can even connect a cluster to itself (for data re-distribution), and is also used for MiNiFi-to-NiFi communication. It's also bi-directional, meaning it can push or pull in either direction. At the end of the day, this means you will be able to communicate e.g. over a corporate HTTP proxy regardless of the inbound/outbound firewall rules; there's enough flexibility in S2S to accommodate these scenarios. Read more here: https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
12-08-2016
01:02 PM
1 Kudo
The recommended setup for production is to use Kafka. NiFi publishes to Kafka, Spark Streaming consumes from the topic (or the reverse). Spark Receiver in NiFi works, but wasn't tested at production scale.
12-08-2016
01:00 PM
Avijeet, take a look at NiFi's deeper architectural documents. I recommend https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html
12-08-2016
12:58 PM
This was a community project of mine. It's not part of NiFi. However, it did validate a few very important use cases and allowed me to collect real-world usage patterns. There's a longer-term effort in flight around SDLC which involves some big pieces. E.g. take a look at https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows and https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#custom_properties
12-05-2016
12:42 PM
1 Kudo
Ali, take a look at HDF: http://hortonworks.com/products/data-center/hdf/ Now that NiFi 1.1.0 is out, the updated HDF version which includes it is around the corner. By the way, NiFi 1.x changed the clustering model; there's no more NCM. More on the architecture here: https://nifi.apache.org/docs/nifi-docs/html/overview.html#nifi-architecture
12-01-2016
10:01 PM
1 Kudo
OK, it looks like an environmental issue with random entropy collection. There are several ways to solve it; pick one based on your prod/non-prod requirements. There was a previous discussion with some suggestions here: https://community.hortonworks.com/questions/58436/hdf-20-handing-on-restart.html
12-01-2016
01:29 PM
1 Kudo
Hi Avijeet. This is currently not possible, but do take a look at the following proposal to understand where things are going in the future: https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
11-29-2016
09:29 PM
1 Kudo
Please check the nifi-app.log file, not the bootstrap log. Most probably there is a service on your host already using port 8080. You can edit conf/nifi.properties and change the port to a non-conflicting one.
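If you want a quick way to confirm the port conflict before editing the config, a few lines of Python will do. This is just a sketch: the helper name `port_in_use` is made up here, and it only tells you that something on the local host accepts TCP connections on that port, not which service it is.

```python
import socket

def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0

if __name__ == "__main__":
    print("8080 in use:", port_in_use(8080))
```

If it reports the port as taken while NiFi is stopped, pick a free port for the HTTP port setting in conf/nifi.properties (nifi.web.http.port in the versions I'm aware of) and restart.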
11-22-2016
05:11 PM
1 Kudo
Depending on the format of your data, you might be better off using ScanAttribute: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ScanAttribute/index.html It will reference an external file for a match and route the flowfile accordingly. You can then, for example, follow up with an UpdateAttribute to update or add to the metadata.
11-22-2016
01:13 PM
Let's simplify the config to eliminate misconfiguration in other areas. If you want A to initiate the connection, you can remove those nifi.remote.* properties on the A side. Next, port 8443 is not the best choice, as it is often used for HTTPS on a non-privileged port. It's not even required now, as NiFi's S2S can tunnel over HTTP. Remove input.socket.port on the B side. Finally, please share which URL you used in the RPG setup (the UI on the A side). The GoTo action simply opens a new browser window. What we are looking for is being able to Refresh the RPG and see ports 20-30 seconds after the initial setup. A few more things to validate. I understand this is not a secured instance, so port access permissions aren't in play. Make sure, though, that instance B has input/output ports added to the root (top-level) process group; this is a requirement for S2S.
11-15-2016
09:21 PM
I would be very cautious about that statement. In a highly available environment Kafka brokers don't just come and go. They are part of a replication group and persisted topic data must follow or be replicated to another host if a node is gone.
11-14-2016
01:05 PM
2 Kudos
You can specify a list of brokers. The main difference in Kafka from before was client-managed offsets, i.e. ZK was no longer on the read path for a consumer. Also, why would you expect brokers to keep changing?
11-09-2016
04:36 PM
1 Kudo
Hi Obaid, currently HDF installs its own ZK quorum. In the future colocating HDF & HDP in the same cluster will be supported through a single Ambari instance to better reuse infrastructure components. http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.0.1/bk_ambari-installation/content/ch_installing-ambari.html
11-09-2016
12:21 AM
1 Kudo
Hey Andrew, If your message is valid JSON (it seems to be), then a parser is able to read it and decode the literal string. The next trick is to put this result into the FlowFile content/body and run another EvaluateJsonPath/JoltTransformer chain. I did a quick experiment here, and it seems to work fine.
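Outside of NiFi, the same two-pass idea looks roughly like this in Python. It is only an illustration: the function name `decode_embedded_json` and the field name `payload` are assumptions, since I don't know your actual message layout. The first parse decodes the outer document and yields the embedded JSON as a plain string; the second parse turns that string into structured data.

```python
import json

def decode_embedded_json(message: str, field: str):
    """Parse a JSON message whose `field` holds another JSON document as a string."""
    outer = json.loads(message)        # first pass: the string literal is unescaped
    return json.loads(outer[field])    # second pass: parse the embedded document

# Build a sample message where "payload" (a made-up field name) is JSON-in-a-string.
message = json.dumps({"payload": json.dumps({"id": 7, "tags": ["a"]})})
print(decode_embedded_json(message, "payload"))
```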
11-08-2016
11:19 PM
1 Kudo
Scott, Use the ExecuteScript processor, as mentioned in the article. It looks like you are putting an embedded script into ExecuteProcess, which is meant to invoke OS shell commands, hence the error.