Member since: 07-30-2019
Posts: 333
Kudos Received: 355
Solutions: 76
My Accepted Solutions
Views | Posted
---|---
3355 | 02-17-2017 10:58 PM
572 | 02-16-2017 07:55 PM
3069 | 12-21-2016 06:24 PM
413 | 12-20-2016 01:29 PM
326 | 12-16-2016 01:21 PM
10-12-2016
02:11 PM
Hi Ankit, there is now official support available for NiFi as part of the HDF stack. Take a look here: http://hortonworks.com/products/data-center/hdf/ The previous demo service was an unsupported community effort which may not get further updates. Though, from what I know, the initial Ambari service and HDF support were driven by the same people. E.g. HDF will greatly simplify NiFi cluster security setup among other things.
10-06-2016
06:49 PM
1 Kudo
Short answer - kinda, it depends on your expectations of a scheduler. NiFi is perfectly capable of kicking off jobs once it prepares and lands the data. The nature of a scheduler, though, is to often wait for a job to finish, retry, act on it, etc. Depending on the actual infrastructure, you may find NiFi less convenient to handle such hierarchical dependencies than a scheduler that was designed for this purpose.
09-29-2016
03:59 PM
The number in the top-right corner of the processor shows how many threads are actively executing for this component.
09-29-2016
03:58 PM
3 Kudos
Which Kafka version are you using? The timeout handling varied greatly between 0.8, 0.9 and 0.10. Please match the processors to the broker version (0.8 -> PutKafka/GetKafka, 0.9 -> PublishKafka/ConsumeKafka, 0.10 -> PublishKafka_0_10/ConsumeKafka_0_10). You might be running into this issue: https://issues.apache.org/jira/browse/NIFI-2739 If you can't wait until NiFi 1.1 is released, this patch is already included in HDF 2.0.
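If it helps to keep the mapping handy in a deployment script, here is a minimal Python sketch of it. The processor names are the ones mentioned in this post; the helper name and the version parsing are just my illustration, not anything NiFi ships.

```python
# Processor families per broker line, as listed in the post above.
KAFKA_PROCESSORS = {
    (0, 8): ("PutKafka", "GetKafka"),
    (0, 9): ("PublishKafka", "ConsumeKafka"),
    (0, 10): ("PublishKafka_0_10", "ConsumeKafka_0_10"),
}


def processors_for_broker(version: str):
    """Return (publisher, consumer) processor names for a broker version string.

    Only the first two components of the version are used, so "0.9.0.1"
    is treated as the 0.9 line. Hypothetical helper for illustration.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"Expected a version like '0.9.0.1', got {version!r}")
    key = (int(parts[0]), int(parts[1]))
    if key not in KAFKA_PROCESSORS:
        raise ValueError(f"No processor family listed for Kafka {version}")
    return KAFKA_PROCESSORS[key]


if __name__ == "__main__":
    publisher, consumer = processors_for_broker("0.9.0.1")
    print(publisher, consumer)
```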
09-27-2016
08:38 PM
Please don't post to old threads which are done; create a new question instead. I will lock this one now.
09-27-2016
08:34 PM
Did you maybe start NiFi as root before? I.e. this could be a permissions issue. Delete the $NIFI_HOME/work directory, clear the logs, and start as the nifi user. Tail the output of nifi-app.log.
09-27-2016
08:31 PM
3 Kudos
Hi Obaid, please take a look at https://community.hortonworks.com/articles/49467/integrating-apache-nifi-with-aws-s3-and-sqs.html The missing piece in your description is SQS.
09-26-2016
06:24 PM
Thanks Andy. I clearly understand the concern around security confidence levels, and I'm not putting this out as a solution, rather as a workaround to let the devs move forward. This isn't an official solution by any means, and everyone reading this thread should understand that.
09-26-2016
06:07 PM
1 Kudo
I have traced the root cause to low entropy on new VM instances, especially headless ones (a typical cloud server today). To test whether a server is affected by the problem:
head -1 /dev/random
If the above command doesn't return immediately with some garbage output, but rather hangs, your server is affected by the problem. Java's SecureRandom seeds itself by reading from /dev/random, which can block when entropy is low. Some solutions online suggest modifying JCE settings to use /dev/urandom instead, but this is less desirable:
- It's not guaranteed to always work.
- There can be multiple JVMs, and the admin may not always know which install is used by a specific process; JAVA_HOME might not even be set, leaving him/her guessing.
- It would require manual intervention, which hinders e.g. blueprints functionality (fully automated Ambari install).
One solution which worked great for me and didn't require any JDK or code changes was to install the Haveged entropy daemon, which was designed for this problem specifically: http://www.issihosts.com/haveged/
Here's a process for CentOS 7. Similar steps are available for Ubuntu, etc. Haveged packages are in the EPEL repo, so the version-specific EPEL release package for CentOS needs to be installed first:
rpm -Uvh http://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
yum install -y haveged
chkconfig haveged on
service haveged start
After this my secure NiFi cluster reliably restarts within expected time windows.
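As a complement to the blocking-read test, one can also look at the kernel's own entropy estimate. Below is a minimal Python sketch, assuming a Linux host that exposes /proc/sys/kernel/random/entropy_avail. The 1000-bit threshold is a made-up number for illustration, and on recent kernels the estimate is reported differently, so treat the result as a hint rather than a verdict.

```python
from pathlib import Path

# Linux exposes the kernel's entropy estimate (in bits) in this file.
ENTROPY_FILE = Path("/proc/sys/kernel/random/entropy_avail")

# Hypothetical threshold, chosen only for illustration.
LOW_ENTROPY_BITS = 1000


def parse_entropy(text: str) -> int:
    """Parse the contents of entropy_avail into an integer bit count."""
    return int(text.strip())


def is_entropy_low(bits: int, threshold: int = LOW_ENTROPY_BITS) -> bool:
    """Return True when the estimate is below the chosen threshold."""
    return bits < threshold


def read_entropy(path: Path = ENTROPY_FILE):
    """Return the available entropy in bits, or None if it cannot be read."""
    try:
        return parse_entropy(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    bits = read_entropy()
    if bits is None:
        print("entropy estimate not available on this system")
    elif is_entropy_low(bits):
        print(f"low entropy estimate: {bits} bits")
    else:
        print(f"entropy estimate looks fine: {bits} bits")
```

Running it before and after starting the entropy daemon should show whether the estimate moved.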
09-26-2016
12:27 PM
Did you modify the Java heap size from the defaults? Depending on your flow, you could simply run out of working memory. Bump up the -Xmx argument in bootstrap.conf and restart.
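If you'd rather script the change than edit the file by hand, here is a rough Python sketch. It assumes the heap setting sits on a line shaped like java.arg.N=-Xmx512m, which is how I remember the stock bootstrap.conf; check your own file first. The function name and the sample text are hypothetical.

```python
import re

# Matches a line such as "java.arg.3=-Xmx512m" (assumed bootstrap.conf layout).
XMX_LINE = re.compile(r"^(java\.arg\.\d+=-Xmx)\S+$", re.MULTILINE)


def set_max_heap(conf_text: str, size: str) -> str:
    """Return conf_text with the -Xmx value replaced by size, e.g. "2g"."""
    # A function replacement avoids any backreference ambiguity with sizes like "2g".
    new_text, count = XMX_LINE.subn(lambda m: m.group(1) + size, conf_text)
    if count == 0:
        raise ValueError("no -Xmx line found in the given text")
    return new_text


# Sample text standing in for the relevant part of conf/bootstrap.conf.
sample = "java.arg.2=-Xms512m\njava.arg.3=-Xmx512m\n"

if __name__ == "__main__":
    print(set_max_heap(sample, "2g"))
```

NiFi still needs a restart afterwards for the new heap size to take effect.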
09-20-2016
01:01 PM
1 Kudo
I wonder if you're looking for MiNiFi? https://nifi.apache.org/minifi/index.html
09-14-2016
05:05 PM
2 Kudos
Hi Oliver. One would need to configure SSL context and add the self-signed certificate to the keystore used by it. Take a look at https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.ssl.StandardSSLContextService/index.html The components which support SSL will have a controller service property to reference. You would configure all SSL details and keystores in there, to be used by other processors.
09-09-2016
01:10 PM
1 Kudo
Sam, the statements should be in the flowfile content/payload. Please see https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hive.PutHiveQL/index.html Also, it's not limited to INSERT statements; any Hive DDL will be executed by this processor as well. This becomes very useful for creating new partitions in a table, etc.
09-06-2016
12:50 PM
Hi, for now you would need to use the HDF stack which includes Ranger as well. Note, it's not yet GA, but everything points to a release very soon. Going forward, the focus will be on max reuse of components like Ranger and Ambari.
09-02-2016
11:27 AM
2 Kudos
Hi, I'm not yet sure if that can be done in JOLT spec. However, how about approaching it from the flow-based programming standpoint? The steps should be:
1. Transform with JOLT - do all, no filtering
2. EvaluateJsonPath into an attribute - extract values that will be used to make a decision
3. RouteOnAttribute - use an expression against the above set of attributes to implement the filtering logic. E.g. discard or re-route records which don't match the criteria.
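To make the split of responsibilities concrete, here is a plain-Python analogy of those three steps. This is not NiFi or JOLT code; the field names (amt, amount, status) and the routing rule are hypothetical, picked only to show transform first, then extract, then route.

```python
import json


def jolt_transform(raw: str) -> dict:
    """Stand-in for the JOLT step: reshape every record, with no filtering."""
    record = json.loads(raw)
    # Hypothetical reshaping: rename "amt" to "amount".
    if "amt" in record:
        record["amount"] = record.pop("amt")
    return record


def evaluate_json_path(record: dict) -> dict:
    """Stand-in for EvaluateJsonPath: copy decision values into attributes."""
    return {
        "status": str(record.get("status", "")),
        "amount": str(record.get("amount", "")),
    }


def route_on_attribute(attributes: dict) -> str:
    """Stand-in for RouteOnAttribute: return the relationship a record goes to."""
    amount = float(attributes["amount"] or 0)
    if attributes["status"] == "active" and amount > 100:
        return "matched"
    return "unmatched"


if __name__ == "__main__":
    record = jolt_transform('{"status": "active", "amt": 250}')
    print(route_on_attribute(evaluate_json_path(record)))
```

The point is that the filtering decision lives in the routing step, so the transform spec stays simple.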
08-31-2016
08:47 PM
Hi Vamsi, I'm trying to understand if the problem is with the maven build structure and module dependencies or with using those in NiFi UI (configuration)? In general it sounds like you're not doing anything unusual and it should be possible, but wanted to understand what you were trying to do first.
08-27-2016
08:43 PM
2 Kudos
Randy, there is a directory within NiFi under conf IIRC. However, this will not work as easily in a cluster. The right way is to download templates via the REST API; see what the UI is doing under the hood.
08-25-2016
01:54 PM
2 Kudos
My favorite for this task would be a standard 'tr' unix command https://en.wikipedia.org/wiki/Tr_(Unix) Simpler than sed and you can invoke it using e.g. NiFi's ExecuteStreamCommand processor.
08-25-2016
01:52 PM
There's a better way, but I'm not sure it's as easy to bring this data to Splunk. NiFi exposes statistics and metrics for all connections via its REST API. For starters, take a look at what URL the Summary page invokes (you can read more in the REST API reference). If you want to monitor these, you could, for example, run a CLI script to fetch and parse them.
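A rough Python sketch of such a script follows. The URL and the JSON field names are my assumptions based on the NiFi 1.x status endpoint, and it presumes an unsecured instance on localhost:8080; verify both against what the Summary page actually calls in your version.

```python
import json
from urllib.request import urlopen

# Assumed NiFi 1.x status endpoint on an unsecured local instance.
STATUS_URL = (
    "http://localhost:8080/nifi-api/flow/process-groups/root/status"
    "?recursive=true"
)


def queued_by_connection(status: dict) -> list:
    """Return (id, name, flowFilesQueued) for every connection, nested groups included.

    The field names follow the NiFi 1.x status JSON as I understand it (assumption).
    """
    rows = []

    def walk(snapshot: dict) -> None:
        for entry in snapshot.get("connectionStatusSnapshots", []):
            conn = entry["connectionStatusSnapshot"]
            rows.append((conn["id"], conn["name"], conn["flowFilesQueued"]))
        for child in snapshot.get("processGroupStatusSnapshots", []):
            walk(child["processGroupStatusSnapshot"])

    walk(status["processGroupStatus"]["aggregateSnapshot"])
    return rows


def fetch_status(url: str = STATUS_URL) -> dict:
    """Fetch and parse the status JSON from a running NiFi instance."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def main() -> None:
    """Print one line per connection, easy to pick up with a log forwarder."""
    for conn_id, name, queued in queued_by_connection(fetch_status()):
        print(f"{conn_id}\t{name}\t{queued}")
```

Calling main() from cron and forwarding the output is one way to get these numbers into an external monitoring tool.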
08-18-2016
03:10 PM
How do files arrive in HDP? Take a look at Apache NiFi (and HDF) for managing your data movement too.
08-17-2016
08:06 PM
2 Kudos
Hi, start by simply running it with defaults. Very often the client won't even have enough data generated for processing to warrant any changes from defaults. Next, if you see connections backlogging, change the number of concurrent instances for a specific processor (but bump by 1 only and re-test). Rinse and repeat. If you still need more, in NiFi Flow UI, in the global settings, you can increase the thread pool size available to the instance (might need to restart in this case, don't remember right now).
08-11-2016
01:30 PM
1 Kudo
How did you develop the custom processor? Have you used NiFi's maven-generated project structure? If yes, the build phase will have generated the NAR file which includes all dependencies and you drop it in NiFi's lib dir next. For external configuration artifacts for your processor, look into having a known location accessible by every node (e.g. http or nas), so the processor can read those. There's work under way around Variable Registry to provide a much better story as well.
08-07-2016
02:03 AM
1 Kudo
In general it doesn't sound like a problem at all. NiFi keeps a lot of housekeeping threads around for history, stats, journaling, expiration, etc. The RAM rise and fall is fine; this is the GC at work. You would have a problem if this zigzag pattern steadily crept up to new highs; otherwise I wouldn't read too much into it.
08-03-2016
02:14 PM
One would need to traverse from the root group down to your processor to find it (or use the Search API to get the processor ref). Either way, you will need some additional logic in your script if you want to avoid hardcoding the processor UUID. On another note, what are you trying to achieve? Can you do it by using the built-in cron scheduling in processors? NiFi flows are generally designed to always run, not to be turned on and off like a workflow with dependencies.
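The traversal could look roughly like the Python sketch below. The two paths, and the "processors", "processGroups" and "component" keys, are what I believe the NiFi 1.x REST API returns; treat them as assumptions and confirm against the REST API reference for your version. The fetch argument is a hypothetical hook, so the lookup logic stays independent of how you authenticate.

```python
import json
from urllib.request import urlopen


def find_processor(fetch, group_id: str, name: str):
    """Depth-first search for a processor called name, starting at group_id.

    fetch(path) must return parsed JSON for a path under /nifi-api.
    Returns the processor id, or None when nothing matches.
    """
    listing = fetch(f"/process-groups/{group_id}/processors")
    for entry in listing.get("processors", []):
        if entry["component"]["name"] == name:
            return entry["id"]
    children = fetch(f"/process-groups/{group_id}/process-groups")
    for child in children.get("processGroups", []):
        found = find_processor(fetch, child["id"], name)
        if found is not None:
            return found
    return None


def http_fetch(path: str, base: str = "http://localhost:8080/nifi-api") -> dict:
    """Fetch JSON from an unsecured local NiFi instance (assumed base URL)."""
    with urlopen(base + path, timeout=10) as resp:
        return json.load(resp)
```

For example, find_processor(http_fetch, "root", "PutHDFS") would return the id of the first processor named PutHDFS, assuming "root" is accepted as an alias for the root group.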
08-02-2016
03:05 PM
3 Kudos
Yes, use a ReplaceText processor and NiFi's Expression Language syntax to reference your attributes and construct the content line. Tip: switch the evaluation mode for ReplaceText to 'Always Replace' as an optimization for your use case. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ReplaceText/index.html https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
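To picture what the Replacement Value does, here is a small Python analogy using string.Template, which happens to share the ${name} placeholder syntax. It only mimics plain attribute references; NiFi's Expression Language has functions and chaining that this does not cover. The attribute values are made up for illustration.

```python
from string import Template

# Hypothetical flowfile attributes.
attributes = {"filename": "orders.csv", "uuid": "1234", "path": "/landing/"}

# Plays the role of the ReplaceText Replacement Value.
replacement_value = Template("${path}${filename},${uuid}")

# With 'Always Replace', the whole content becomes the evaluated line.
content_line = replacement_value.substitute(attributes)

if __name__ == "__main__":
    print(content_line)
```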
07-05-2016
12:26 PM
This service isn't official, and it looks like it's trying to download a very old NiFi version. Until an official Ambari service is released, I recommend you simply download and install it yourself; it's trivial (just change the port from the default 8080 in conf/nifi.properties to avoid conflicts).
07-03-2016
12:34 AM
It does sound like CSV files are created as an intermediary step for Phoenix bulk load. For HBase-Hive you had multiple good suggestions. For Kafka->Phoenix use NiFi and you can remove those additional steps.
06-30-2016
10:57 PM
JMS is designed to process individual messages. While there are some broker-specific settings to prefetch more, those are what they are, just proprietary switches.
06-29-2016
07:08 PM
Ok, I'm sure this setup is not currently supported by the SFTP processors. I'd suggest you file a JIRA issue here and see how much action we get: https://issues.apache.org/jira/browse/NIFI
06-25-2016
01:31 PM
1 Kudo
If you look into the PutHDFS processor properties, you will see it prompts for the locations of the core-site.xml and hdfs-site.xml files. Copy those over to a NiFi node (or create a light client-side version only, e.g. download the client configs zip from Ambari). NiFi already contains all Java libraries required to access HDFS.