Member since
09-29-2015
871
Posts
723
Kudos Received
255
Solutions
My Accepted Solutions
Views | Posted |
---|---|
3348 | 12-03-2018 02:26 PM |
2302 | 10-16-2018 01:37 PM |
3615 | 10-03-2018 06:34 PM |
2392 | 09-05-2018 07:44 PM |
1814 | 09-05-2018 07:31 PM |
03-22-2016
08:30 PM
2 Kudos
I just tried this and realized I misspoke about setting "nifi.remote.input.secure" to false on the https instance and not needing the certs on the http instance. The reason is that the http instance still needs to connect to the https instance initially to ask for the value of "nifi.remote.input.socket.port". So even though the resulting site-to-site connection would be unsecured, the initial connection still has to be secure.

Here is my best attempt to recreate the steps I just followed that ended up working.

I created two copies of nifi-0.6.0-SNAPSHOT in a directory and called one nifi-https and one nifi-http.

On nifi-https I configured the following properties in nifi.properties (everything else left as defaults):

nifi.remote.input.socket.host=hostname.from.my.cert
nifi.remote.input.socket.port=8899
nifi.remote.input.secure=true
nifi.web.http.port=
nifi.web.https.port=8443
nifi.security.keystore=mycert.p12
nifi.security.keystoreType=PKCS12
nifi.security.keystorePasswd=mypassword
nifi.security.truststore=mytruststore.jks
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=mypassword

That makes nifi-https a secured instance with the web UI running on 8443 and a secure site-to-site connection available on 8899.

I then went to https://localhost:8443/nifi in my browser and was prompted to request an account. At this point I edited nifi.properties again to turn on regular http access by setting nifi.web.http.port=8080, restarted NiFi, went to http://localhost:8080/nifi and approved my account, then removed http access and restarted again, after which I was able to access the UI over https. I then created an Output Port called "Test" with a GenerateFlowFile sending data to it. At this point nifi-https is fully set up.

On nifi-http I configured the following properties in nifi.properties (everything else left as defaults):

nifi.security.keystore=mycert.p12
nifi.security.keystoreType=PKCS12
nifi.security.keystorePasswd=mypassword
nifi.security.truststore=mytruststore.jks
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=mypassword
That makes nifi-http a regular unsecured instance running on port 8080, but it now has the cert and truststore to make outbound secure connections.

I then went to http://localhost:8080/nifi and created a Remote Process Group (RPG) with a URL of https://hostname.from.my.cert:8443/nifi. It is important that the hostname in this URL matches the value of "nifi.remote.input.socket.host" from the nifi-https instance.

Next I right-clicked on the RPG and chose Enable Transmission, at which point I got a message that an account was requested. This happened because nifi-http is using mycert.p12 to connect to nifi-https, but nifi-https does not have an approved account for mycert.p12. So I went to nifi-https (https://localhost:8443/nifi), opened the accounts section, approved the account for mycert.p12, and chose a role of "NiFi".

We also need to give the mycert.p12 user access to the "Test" Output Port. So on the https instance I stopped "Test", right-clicked and chose Configure, and from the Access Controls tab started typing the DN from mycert.p12, added that user to the Allowed Users list, hit Apply, and started the port again.

Then I went back to nifi-http, right-clicked on the RPG, and chose Refresh, which caused it to retrieve the available Output Ports from nifi-https. I then connected the "Test" Output Port from the RPG to LogAttribute, started everything, and it was able to pull FlowFiles from nifi-https.
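The hostname requirement above is easy to get wrong, so here is a minimal Python sketch of a pre-flight check. It is not part of NiFi; the helper names `parse_properties` and `rpg_host_matches` are hypothetical, and the properties text simply mirrors the values used in this post.

```python
from urllib.parse import urlparse


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def rpg_host_matches(rpg_url, https_props):
    """Return True when the RPG URL host equals nifi.remote.input.socket.host."""
    expected = https_props.get("nifi.remote.input.socket.host", "")
    actual = urlparse(rpg_url).hostname or ""
    return bool(expected) and actual.lower() == expected.lower()


# Values copied from the nifi-https configuration described in this post.
NIFI_HTTPS_PROPERTIES = """
nifi.remote.input.socket.host=hostname.from.my.cert
nifi.remote.input.socket.port=8899
nifi.remote.input.secure=true
nifi.web.https.port=8443
"""

props = parse_properties(NIFI_HTTPS_PROPERTIES)
good = rpg_host_matches("https://hostname.from.my.cert:8443/nifi", props)
bad = rpg_host_matches("https://localhost:8443/nifi", props)
print(good, bad)
```

Running a check like this against the RPG URL before enabling transmission can catch a mismatch such as using localhost in the URL while the cert hostname is configured on the https instance.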
03-19-2016
02:51 PM
2 Kudos
For this question you have to first take NiFi out of the picture and think about how you would index HTML with Solr. HTML is not typically one of the standard input formats like JSON, XML, and CSV, but Solr has an "extracting request handler" which is capable of handling HTML, see this page: https://wiki.apache.org/solr/ExtractingRequestHandler To use that from NiFi you need to set the "Content Stream Path" to "/update/extract", set the "Content Type" to "text/html", and add a user defined property for "literal.id" and set it to some id (you can use the FlowFile uuid by setting it to ${uuid}).
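To make the request shape concrete outside of NiFi, here is a rough Python sketch that builds (but does not send) the equivalent HTTP request. The Solr base URL, the collection name `mycollection`, and the helper name `build_extract_request` are assumptions for illustration; only the "/update/extract" path, the "text/html" content type, and the "literal.id" parameter come from the answer above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_extract_request(solr_base_url, doc_id, html_bytes):
    """Build a POST to Solr's extracting request handler for one HTML document."""
    # literal.id supplies the document id, as with the user-defined property in NiFi.
    query = urlencode({"literal.id": doc_id})
    url = f"{solr_base_url.rstrip('/')}/update/extract?{query}"
    return Request(
        url,
        data=html_bytes,
        headers={"Content-Type": "text/html"},
        method="POST",
    )


# Hypothetical Solr location and document id.
req = build_extract_request(
    "http://localhost:8983/solr/mycollection",
    "doc-1",
    b"<html><body>Hello</body></html>",
)
print(req.get_method(), req.get_full_url())
```

Passing `req` to `urllib.request.urlopen` would send it, assuming a Solr instance with the extracting handler enabled is listening at that address.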
03-18-2016
07:26 PM
2 Kudos
Hello, there are a couple of factors at play here.

Site-to-Site uses the same SSL configuration that is used to configure SSL for the UI. This is provided through nifi.properties:

nifi.security.keystore=
nifi.security.keystoreType=
nifi.security.keystorePasswd=
nifi.security.keyPasswd=
nifi.security.truststore=
nifi.security.truststoreType=
nifi.security.truststorePasswd=

So you should be able to have an http instance, meaning the UI is not configured with a secure https port, but still configure the keystore/truststore properties above, and it will use those to connect to the secure NiFi instance.

Secondly, on your https instance, if you set "nifi.remote.input.secure" to false then you should also be able to make a connection from your http instance to your https instance without configuring the above properties, but the connection will be unsecured in this case.
02-19-2016
10:00 PM
1 Kudo
The documentation for InvokeHTTP says that dynamic properties are sent as headers:

Name | Value | Description |
---|---|---|
Header Name | Attribute Expression Language | Send request header with a key matching the Dynamic Property Key and a value created by evaluating the Attribute Expression Language set in the value of the Dynamic Property. Supports Expression Language: true |

So you should be able to add a property Content-Type with a value of application/json.

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.InvokeHTTP/index.html
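As an analogy for what the dynamic property does, here is a small Python sketch that treats a dictionary of name/value pairs as request headers. This is not InvokeHTTP's implementation; the target URL, the payload, and the helper name `build_json_post` are made up for illustration, and the request is only built, not sent.

```python
import json
from urllib.request import Request


def build_json_post(url, payload, extra_headers):
    """Build a POST whose headers come from a name/value mapping."""
    body = json.dumps(payload).encode("utf-8")
    req = Request(url, data=body, method="POST")
    # Each entry plays the role of one dynamic property: name -> header value.
    for name, value in extra_headers.items():
        req.add_header(name, value)
    return req


# Equivalent of adding a property "Content-Type" with value "application/json".
dynamic_properties = {"Content-Type": "application/json"}
req = build_json_post(
    "http://localhost:9999/example",  # hypothetical endpoint
    {"hello": "world"},
    dynamic_properties,
)
print(req.header_items())
```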
02-10-2016
07:27 PM
26 Kudos
In an Apache NiFi cluster, every node runs the same dataflow and data is divided between the nodes. In order to leverage the full processing power of the cluster, the next logical question is - "how do I distribute data across the cluster?".
The answer depends on the source of data. Generally, there are sources that can push data, and sources that provide data to be pulled. This post will describe
some of the common patterns for dealing with these scenarios.
Background
A NiFi cluster is made up of a NiFi Cluster Manager (NCM) and one or more nodes. The NCM does not perform any processing of data, but manages the cluster and provides the single point of access to the UI for the cluster. In addition, one of the processing nodes can be designated as a Primary Node. Processors can then be scheduled to run on the Primary Node only, via an option on the scheduling tab of the processor which is only available in a cluster.
When connecting two NiFi instances, the connection is made with a Remote Process Group (RPG) which connects to an Input Port or Output Port on the other instance. In the diagrams below, NCM will refer to the cluster manager, nodes refer to the nodes processing data, and RPG refers to Remote Process Groups.
Pushing
When a data source can push its data to NiFi, there will
generally be a processor listening for incoming data. In a cluster, this
processor would be running on each node. In order to get the data distributed
across all of the listeners, a load balancer can be placed in front of the
cluster, as shown in the following example:
The data sources can make their requests against the
URL of the load balancer, which redirects them to the nodes of the cluster.
Other processors that could be used with this pattern are HandleHttpRequest,
ListenSyslog, and ListenUDP.
Pulling
If the data source can ensure that each pull operation will
pull a unique piece of data, then each node in the NiFi cluster can pull
independently. An example of this would be a NiFi cluster with each node
running a GetKafka processor:
Since each GetKafka processor can be treated as a single client through the Client Name and Group ID properties, each GetKafka processor will pull different data.
A different pulling scenario involves performing a listing
operation on the primary node and distributing the results across the cluster via site-to-site to pull the data in parallel. This typically involves "List" and
"Fetch" processors, where the List processor produces instructions, or tasks, for the Fetch processor to act on. An example of this scenario is shown in the following diagram with ListHDFS
and FetchHDFS:
ListHDFS is scheduled to run on the primary node and performs a
directory listing, finding new files since the last time it executed. The
results of the listing are then sent out of ListHDFS as FlowFiles, where each
FlowFile contains one file name to pull from HDFS. These FlowFiles are then
sent to a Remote Process Group connected to an Input Port within the same
cluster. This causes each node in the cluster to receive a portion of the files
to fetch. Each FetchHDFS processor can then fetch the files from HDFS in
parallel.
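The List/Fetch idea can be illustrated with a toy Python simulation: one listing is produced in a single place, then each entry is handed to a node as a fetch task. The strict round-robin assignment, the node names, and the file paths are assumptions for illustration only; site-to-site decides the actual distribution itself and this is not how NiFi is implemented.

```python
from itertools import cycle


def distribute(tasks, nodes):
    """Assign each listed task to a node in round-robin order."""
    assignments = {node: [] for node in nodes}
    for task, node in zip(tasks, cycle(nodes)):
        assignments[node].append(task)
    return assignments


# The "List" step runs once (on the primary node) and yields file names.
listing = [f"/data/file-{i}.txt" for i in range(1, 7)]

# The "Fetch" step runs on every node, each handling its share of the listing.
assignments = distribute(listing, ["node1", "node2", "node3"])
for node, files in assignments.items():
    print(node, files)
```

The point of the pattern is that the listing happens only once, so no two nodes fetch the same file, while the more expensive fetch work is spread over the whole cluster.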
Site-To-Site
If the source of data is another NiFi instance (cluster or
standalone), then Site-To-Site can be used to transfer the data. Site-To-Site
supports a push or pull mechanism, and takes care of evenly pushing to,
or pulling from, a cluster.
In the push scenario, the destination NiFi has one or more
Input Ports waiting to receive data. The source NiFi brings data to a Remote
Process Group connected to an Input Port on the destination NiFi.
In the pull scenario, the destination NiFi has a Remote
Process Group connected to an Output Port on the source NiFi. The source NiFi
brings data to the Output Port, and the data is automatically pulled by the
destination NiFi.
NOTE: These site-to-site examples showed a standalone NiFi communicating with a cluster, but it could be cluster to cluster.
02-10-2016
06:29 PM
3 Kudos
Typically when referring to site-to-site we are referring to a Remote Process Group on one NiFi instance communicating with an Input Port or Output Port on another NiFi instance. This is a TCP-based protocol which is internal to NiFi, and makes a direct connection between the two instances. It can optionally be secured with TLS/SSL.

The side of the connection that is receiving data, or providing data to be pulled, must configure the following properties in nifi.properties:

# Site to Site properties
nifi.remote.input.socket.host=
nifi.remote.input.socket.port=
nifi.remote.input.secure=true
02-01-2016
02:30 AM
2 Kudos
I think the schema needs to be a valid URI, which would require the file protocol like this:

file:///C:/Avro/schema1.avsc

Additionally, the schema field also allows a schema to be pasted directly into the value of the field if you want to avoid pointing at a file.
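If it helps, Python's standard library can produce this kind of file URI from a Windows path, which is a quick way to double-check the number of slashes. This is only a convenience sketch; the path is the example from above.

```python
from pathlib import PureWindowsPath

# Convert an absolute Windows path into a file URI.
schema_uri = PureWindowsPath("C:/Avro/schema1.avsc").as_uri()
print(schema_uri)
```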
12-21-2015
08:47 PM
1 Kudo
This turns out to be specific to using "unsigned int" which is essentially a Long, but we are generating an Avro schema that expects an "int". Some changes in 0.4.0 that fixed other issues with ExecuteSQL appear to have introduced this. I captured the issue with this JIRA: https://issues.apache.org/jira/browse/NIFI-1319
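To illustrate why the mismatch matters, here is a small Python sketch of the value ranges involved, assuming a 32-bit unsigned column. It is not the ExecuteSQL code; `fits_avro_int` is a hypothetical helper that checks whether a value fits in a signed 32-bit Avro "int".

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1   # largest value a signed 32-bit "int" can hold
UINT32_MAX = 2**32 - 1  # largest value a 32-bit "unsigned int" can hold


def fits_avro_int(value):
    """Return True when value fits in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


print(fits_avro_int(12345))       # small values fit either way
print(fits_avro_int(UINT32_MAX))  # the top of the unsigned range does not
```

Values in the upper half of the unsigned range need a 64-bit type, which is why the column behaves like a Long while the generated schema expects an "int".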
12-21-2015
06:57 PM
3 Kudos
Hi @Jobin George, I was trying to recreate this error. I have NiFi 0.4.0, MySQL 5.6.26, and mysql-connector-java-5.1.38-bin.jar. I created the same table as you and inserted three rows:

CREATE TABLE SALARIES (
  ID int NOT NULL AUTO_INCREMENT,
  ZIPCODE int,
  SALARY double,
  AGE int,
  GENDER varchar(255),
  PRIMARY KEY (ID)
);
INSERT INTO SALARIES (ZIPCODE, SALARY, AGE, GENDER) VALUES (12345, 100, 30, 'MALE');
INSERT INTO SALARIES (ZIPCODE, SALARY, AGE, GENDER) VALUES (12345, 200, 31, 'MALE');
INSERT INTO SALARIES (SALARY, AGE, GENDER) VALUES (10, 20, 'MALE');

In NiFi I have ExecuteSQL with "select * from salaries;" -> ConvertAvroToJson -> LogAttribute and I see this in the logs:

[{"ID": 1, "ZIPCODE": 12345, "SALARY": 100.0, "AGE": 30, "GENDER": "MALE"},{"ID": 2, "ZIPCODE": 12345, "SALARY": 200.0, "AGE": 31, "GENDER": "MALE"},{"ID": 3, "ZIPCODE": null, "SALARY": 10.0, "AGE": 20, "GENDER": "MALE"}]

If I change the query to "select id from salaries;" I see:

[{"ID": 1},{"ID": 2},{"ID": 3}]

Is there anything that jumps out at you as being different between your setup and mine? Different versions of MySQL? Something specific in your data?