Member since 07-30-2019
333 Posts
355 Kudos Received
76 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3047 | 02-17-2017 10:58 PM |
| | 540 | 02-16-2017 07:55 PM |
| | 2777 | 12-21-2016 06:24 PM |
| | 379 | 12-20-2016 01:29 PM |
| | 293 | 12-16-2016 01:21 PM |
02-24-2017
04:28 PM
Hi Akash, MiNiFi runs as a small process on that server and sends data, e.g. to a larger NiFi cluster, over the site-to-site protocol. The toolkit takes care of converting a template you create visually in NiFi into something that MiNiFi understands. How do you pull the data, and what kind of data is it? That determines whether you need to run the MiNiFi process on that server or can run it elsewhere and have multiple servers send data to it.
02-17-2017
10:58 PM
1 Kudo
GetFile will delete the processed file by default. Check the Keep Source File setting, for example. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.GetFile/index.html A better alternative for processing files without touching them is a combination of ListFile/FetchFile processors. These will maintain the state internally in NiFi to track the processed files. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ListFile/index.html
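The list-then-fetch idea can be sketched in a few lines of Python. This is only an illustration of tracking processed files by state instead of deleting them, not how ListFile is implemented internally; the function name `list_new_files` and the use of the modification time as the only state are assumptions made for the example.

```python
import os


def list_new_files(directory, last_seen_mtime):
    """Return (new_paths, updated_state) without touching the source files.

    Hypothetical helper: a file counts as new when its modification time
    is later than the timestamp remembered from the previous listing.
    """
    new_files = []
    newest = last_seen_mtime
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if not entry.is_file():
            continue
        mtime = entry.stat().st_mtime
        if mtime > last_seen_mtime:
            new_files.append(entry.path)
            newest = max(newest, mtime)
    return new_files, newest
```

Calling it repeatedly with the returned state only reports files that appeared since the previous call, and the originals stay in place.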
02-16-2017
07:55 PM
3 Kudos
Hi Jobin, Backpressure should be treated as a threshold, not an absolute barrier. Due to scheduling semantics and some processors, e.g. UpdateAttribute, doing internal micro-batching, the number of objects (or their size) can go over the set limit. Backpressure will still engage and will not turn off until the level falls below the threshold, but in general you shouldn't worry too much about the number going slightly above it. This is merely due to internal implementation details.
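To make the threshold behaviour concrete, here is a toy simulation in Python. It is a sketch under simplifying assumptions, not NiFi's scheduler: the `Connection` class, `run_upstream` and the batch size are made up for the example, and the upstream side is only checked before it emits a whole batch.

```python
from collections import deque


class Connection:
    """Toy queue with an object-count backpressure threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.queue = deque()

    def backpressure_engaged(self):
        # Engaged once the queue reaches the threshold, released below it.
        return len(self.queue) >= self.threshold


def run_upstream(connection, pending, batch_size):
    """Emit whole batches while backpressure is not engaged.

    The check happens once per batch, so the queue can end up
    above the threshold by up to batch_size - 1 objects.
    """
    while pending and not connection.backpressure_engaged():
        batch = pending[:batch_size]
        del pending[:batch_size]
        connection.queue.extend(batch)
```

With a threshold of 10 and batches of 4, the queue stops at 12 objects: over the limit, but with backpressure engaged until the downstream side drains it below 10.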
01-30-2017
05:12 PM
1 Kudo
Tim, It looks like the regex you're using might need more work to account for optional matches. The ReplaceText processor is complaining about missing match groups and has no context to do the replacement.
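As a rough illustration of the optional-match problem, the Python sketch below compares a strict pattern with one that wraps the optional part in an optional group. The patterns and sample lines are hypothetical (the regex from the question is not reproduced here), and Python's `re` module stands in for the Java regex engine NiFi uses, so treat it as an analogy only.

```python
import re

# Hypothetical patterns: the strict one requires the unit part,
# the lenient one makes it optional with a non-capturing wrapper.
strict = re.compile(r"(\w+)=(\d+) unit=(\w+)")
lenient = re.compile(r"(\w+)=(\d+)(?: unit=(\w+))?")


def describe(pattern, line):
    """Return 'name:value:unit', or None when the pattern does not match."""
    match = pattern.search(line)
    if match is None:
        return None  # no match groups at all, nothing to replace with
    name, value, unit = match.groups()
    # An optional group that did not participate is None, so default it.
    return f"{name}:{value}:{unit or 'n/a'}"
```

For a line without the unit part, the strict pattern gives no match, while the lenient one still matches and leaves the third group empty.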
01-03-2017
04:06 PM
Make sure you split the data using the SplitJson processor in NiFi before putting it into Splunk. The reason is that the syslog receiver may bundle incoming messages based on the network setup, but it knows nothing about the actual data format, such as JSON.
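What the split step achieves can be sketched outside NiFi in plain Python: one JSON array goes in, one self-contained JSON document per record comes out, so each event is valid on its own no matter how the receiver groups messages. The function name `split_json_array` is made up for this example and is not a NiFi API.

```python
import json


def split_json_array(payload):
    """Split a JSON array into a list of compact single-record JSON strings.

    Hypothetical helper: a non-array payload is passed through
    as a single record.
    """
    records = json.loads(payload)
    if not isinstance(records, list):
        records = [records]
    return [json.dumps(record, separators=(",", ":")) for record in records]
```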
01-03-2017
03:55 PM
1 Kudo
Take a look at hortonworks.com/products/data-center/hdf/
12-30-2016
02:53 PM
Did you configure your client browser to use Kerberos? Have you kinit'ed on the client side successfully?
12-29-2016
02:26 PM
Is the original problem with the 'facility' string solved? If so, can you accept an answer? From your logs, you are now running out of JVM memory. Increase the heap size as per https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#bootstrap_properties
12-28-2016
04:44 PM
Your facility string is incorrect. A list of permissible values is available here: http://logback.qos.ch/manual/appenders.html Additionally, check the syslog receiver port on the other end (the default is 514).
12-28-2016
04:26 PM
Please check the output of the nifi-bootstrap and nifi-app log files. They will show whether there were any errors in the logging configuration. In addition, it's not required to restart NiFi after changing logback.xml: updates are picked up and reloaded automatically every 30 seconds.
12-21-2016
06:49 PM
1 Kudo
You removed all details from the configuration. Do you have the 'temp' user? Which user do you use for SFTP login? Normally you wouldn't have access to other users' home directories. Try using a directory outside of /home and set correct permissions on it.
12-21-2016
06:24 PM
2 Kudos
Setting the remote file to /home/temp/ is wrong. It expects a filename, not a directory. A combination of ListSFTP and FetchSFTP will ensure the filenames are passed to FetchSFTP via a known attribute name. Here's an example template from the gallery: https://github.com/hortonworks-gallery/nifi-templates/blob/master/templates/List_and_Fetch_SFTP_template.xml
12-20-2016
01:29 PM
2 Kudos
Yes, use the MonitorActivity processor: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MonitorActivity/index.html
12-19-2016
01:26 PM
1 Kudo
{} is not an array, but just an empty JSON object. The array would be encoded as []. Is your incoming object always an array?
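The difference is easy to check with any JSON parser. A small Python sketch, where the helper name `describe_json` is just for illustration:

```python
import json


def describe_json(text):
    """Report whether the top-level JSON value is an array or an object."""
    value = json.loads(text)
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__
```

`describe_json("{}")` reports an object and `describe_json("[]")` reports an array, so a flow that expects an array has to handle the empty-object case separately.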
12-16-2016
01:21 PM
1 Kudo
The reflect function is blacklisted by default, so it has to be allowed explicitly. See e.g. this discussion: https://community.hortonworks.com/questions/25828/udf-reflect-is-not-allowed-beeline.html
12-16-2016
01:18 PM
Is backpressure configured on your connection? It could shut off the flow temporarily until the backlog drops below the set threshold.
12-15-2016
06:52 PM
1 Kudo
ExecuteProcess doesn't take input. It is a source of data (e.g. from that process). What you're looking for is ExecuteStreamCommand, which allows passing in a flowfile and evaluating attributes dynamically: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
12-13-2016
08:51 PM
3 Kudos
David, it's here: http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.1.0/bk_dataflow-command-line-installation/content/hdf_isg_hardware.html I don't think the guidelines have changed, though.
12-10-2016
03:06 PM
1 Kudo
A new flow file will be created, BUT they both will point to an immutable piece of data in the Content Repository. The HashContent step in your example will have replaced the content, but it will be a new FF pointing to a new piece of data in the content repository. The other branch of the flow is not affected in any way by this content change. Read more here, for example: https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#pass-by-reference
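A toy model of this pass-by-reference behaviour, written in Python. It is a sketch of the idea only: the `ContentRepository` and `FlowFile` classes and the `clone` and `replace_content` helpers are invented for the example and do not mirror NiFi's actual classes.

```python
from dataclasses import dataclass


class ContentRepository:
    """Append-only store: content is written once and never modified."""

    def __init__(self):
        self._claims = {}
        self._next_id = 0

    def write(self, data):
        claim_id = self._next_id
        self._next_id += 1
        self._claims[claim_id] = bytes(data)
        return claim_id

    def read(self, claim_id):
        return self._claims[claim_id]


@dataclass(frozen=True)
class FlowFile:
    """A flow file is a pointer to content, not the content itself."""
    claim_id: int


def clone(flow_file):
    # A new flow file object that points at the same immutable content.
    return FlowFile(flow_file.claim_id)


def replace_content(repo, flow_file, new_data):
    # Changing content writes a new claim; the old one is left untouched.
    return FlowFile(repo.write(new_data))
```

Cloning a flow file for two branches shares one claim, and replacing the content on one branch leaves the other branch reading the original bytes.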
12-10-2016
02:42 PM
1 Kudo
Kumar, the LogAttribute processor is just that, a simple log. If you need any customized handling of attributes, formats, data redaction, etc., tap into the stream and process it as a regular dataflow.
12-09-2016
01:53 PM
1 Kudo
Have you tried adding this 'set' statement as the first line of the query, terminated by a semicolon with newline? Next, try quoting the column name, too.
12-08-2016
01:13 PM
3 Kudos
It doesn't affect scheduling, it's a normal (failure) path for your data. When one auto-terminates a relationship it means that a FlowFile which was routed there is dropped and finishes its life in the flow (but still remains for some time in provenance/content repositories for history).
12-08-2016
01:07 PM
4 Kudos
Site-to-site is much more versatile than multi-DC communications (though this is a great use case for this NiFi feature). S2S can link multiple clusters (or standalone instances), can even connect a cluster to itself (for data re-distribution), and is also used for MiNiFi to NiFi communication. It's also bi-directional, meaning it can be push/pull in either direction. At the end of the day this means you will be able to communicate, e.g. over a corporate HTTP proxy, regardless of the inbound/outbound firewall rules. There's enough flexibility in S2S to accommodate these scenarios. Read up more here https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
12-08-2016
01:02 PM
1 Kudo
The recommended setup for production is to use Kafka. NiFi publishes to Kafka, Spark Streaming consumes from the topic (or the reverse). Spark Receiver in NiFi works, but wasn't tested at production scale.
12-08-2016
01:00 PM
Avijeet, take a look at NiFi's deeper architectural documents. I recommend https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html
12-08-2016
12:58 PM
This was a community project of mine. It's not part of NiFi. However, it did validate a few very important use cases and allowed me to collect real-world usage patterns. There's a longer-term effort in flight around SDLC which involves big pieces. E.g. take a look at https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows and https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#custom_properties
12-08-2016
12:49 PM
3 Kudos
Sure. Make sure the Spark CLI dependencies are available on every node, i.e. you are able to submit your Spark job from any node in the NiFi cluster. Next, assuming you'd like to submit the job only once within a cluster, configure ExecuteStreamCommand by going to its Scheduling tab and selecting On Primary Node in the strategy dropdown. This will ensure it is a cluster-wide singleton. Note that you can't pin the primary node, for failover reasons: it is a role elected automatically by the cluster and may change through its lifecycle if there's a recovery event, etc.
12-05-2016
12:42 PM
1 Kudo
Ali, take a look at HDF http://hortonworks.com/products/data-center/hdf/ Now that NiFi 1.1.0 is out, the updated HDF version which includes it is around the corner. By the way, NiFi 1.x changed the clustering model, and there's no more NCM. More on the architecture here: https://nifi.apache.org/docs/nifi-docs/html/overview.html#nifi-architecture
12-01-2016
10:01 PM
1 Kudo
Ok, it looks like an environmental issue with random entropy collection. There are several ways to solve it; pick one based on your prod/non-prod requirements. There was a previous discussion with some suggestions here: https://community.hortonworks.com/questions/58436/hdf-20-handing-on-restart.html