Member since
04-05-2016
130
Posts
93
Kudos Received
29
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3747 | 06-05-2018 01:00 AM
 | 5111 | 04-10-2018 08:23 AM
 | 5614 | 07-18-2017 02:16 AM
 | 2889 | 07-11-2017 01:02 PM
 | 3319 | 07-10-2017 02:10 AM
03-22-2017
12:28 AM
@Kiran Dasari A t2.micro instance has only 1 vCPU core, but the RHEL OpenJDK apparently assumes a multi-core machine by default. There's a GitHub issue (for Elasticsearch, not NiFi) suggesting the use of '-XX:-AssumeMP' to disable this assumption. You can add JVM arguments in <NIFI_HOME>/conf/bootstrap.conf. I tried this option with a t2.micro. Without it, I got the same error you reported. With the option added, NiFi proceeded further through startup, but I also needed to reduce the JVM heap size to get NiFi up and running. I changed bootstrap.conf as follows:
# JVM memory settings: from 512m to 256m
java.arg.2=-Xms256m
java.arg.3=-Xmx256m
# Add this to disable AssumeMP
# https://github.com/elastic/elasticsearch/issues/22245
java.arg.15=-XX:-AssumeMP
NiFi has been running without issue for 30 minutes. I hope that's long enough to confirm that this configuration can address your issue, too. Thanks, Koji
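To see whether this applies to your own instance, here is a quick diagnostic sketch. The AssumeMP flag name comes from the Elasticsearch issue above; whether your particular JDK build exposes it is an assumption, so the check is written to be non-fatal:

```shell
# How many vCPU cores does this instance expose? (a t2.micro reports 1)
nproc
# If a JDK is on the PATH, print the effective value of AssumeMP.
# Flag availability depends on the JDK build, so no output here is possible.
java -XX:+PrintFlagsFinal -version 2>/dev/null | grep AssumeMP || true
```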
03-10-2017
12:47 AM
1 Kudo
How to connect GetKafka to Kafka through Stunnel

Stunnel is a proxy that can secure otherwise-insecure network transmission by wrapping it in SSL. This article contains examples and illustrations describing how it works and how to configure it. Most of it is derived from this informative Git comment; I wouldn't have been able to set it up without it. Thank you for sharing such a detailed example.
How does it work? Let's see how it can be applied to NiFi's GetKafka.
I used two servers for this experiment: 0.server.aws.mine and 1.server.aws.mine. A single Zookeeper and a single Kafka broker run on 0.server.
A GetKafka NiFi processor on 1.server consumes messages through Stunnel. The Kafka broker joins the Kafka cluster and declares its address as 127.0.0.1:9092. If Zookeeper is on a different server (recommended) and you need to secure that connection via Stunnel as well, apply the same method as the one used between GetKafka and Zookeeper.

GetKafka's Zookeeper Connection String is set to 127.0.0.1:2181, which the local Stunnel is listening on. The local Stunnel on 1.server then proxies the request to 0.server:2181 over SSL. At 0.server, the request is proxied again by the Stunnel running there, and finally arrives at Zookeeper. Since the Kafka broker on 0.server declares its address as 127.0.0.1:9092, GetKafka (the Kafka client) sends requests to 127.0.0.1:9092, and those requests are eventually transferred to the broker through the Stunnel pair.

Here are the relevant configurations in 1.server's stunnel.conf (the entire file is available here):

client = yes
[zookeeper]
accept = 127.0.0.1:2181
connect = 0.server.aws.mine:2181
[kafka]
accept = 127.0.0.1:9092
connect = 0.server.aws.mine:9092
And this is for 0.server (the entire file is available here):

client = no
[zookeeper]
accept = 0.server.aws.mine:2181
connect = 127.0.0.1:2181
[kafka]
accept = 0.server.aws.mine:9092
connect = 127.0.0.1:9092
Kafka server.properties:

host.name=127.0.0.1
zookeeper.connect=127.0.0.1:2181
Zookeeper zookeeper.properties:

clientPort=2181
clientPortAddress=127.0.0.1
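Once both Stunnel instances and Zookeeper are up, a quick way to verify the Zookeeper leg of the tunnel from 1.server is Zookeeper's four-letter-word health check (this requires nc; the request goes through the local Stunnel endpoint and should come back through the pair):

$ echo ruok | nc 127.0.0.1 2181
imok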
How to authorize client access? Each Stunnel server has to have its own pem file containing a private key and a certificate. A CA certificate file (or directory) is also needed to authorize client access. I used tls-toolkit.sh, available in the NiFi toolkit, to generate the required files. The toolkit generates three files for each server: keystore.jks, truststore.jks and nifi.properties. The server's key and cert can be extracted from keystore.jks. To do so, convert keystore.jks into a keystore.p12 file with the following commands (credit goes to this Stackoverflow answer):

# It doesn't matter which server you run the toolkit on.
$ ./bin/tls-toolkit.sh standalone -n [0-1].server.aws.mine -C 'CN=server,OU=mine'
# Password for keystore.jks can be found in generated nifi.properties 'nifi.security.keystorePasswd'.
$ keytool -importkeystore -srckeystore keystore.jks -destkeystore keystore.p12 -srcstoretype jks -deststoretype pkcs12
Then extract key and cert from the p12 file:
$ openssl pkcs12 -in keystore.p12 -nokeys -out cert.pem
$ openssl pkcs12 -in keystore.p12 -nodes -nocerts -out key.pem
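Before deploying, it's worth confirming what actually ended up in cert.pem. A small sketch (the file name follows the commands above; the guard just avoids an error if the file is missing):

```shell
# Print the subject DN and validity window of the extracted certificate.
if [ -f cert.pem ]; then
  openssl x509 -in cert.pem -noout -subject -dates
else
  echo "cert.pem not found"
fi
```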
Concatenate key and cert to create stunnel.pem, and deploy stunnel.pem to servers:
$ cat key.pem cert.pem >> stunnel.pem

I used cert.pem as the CAFile for Stunnel on 0.server. In stunnel.conf on 0.server, the following settings are needed to enable client cert verification:

verify = 3
CAFile = /etc/stunnel/certs
Refer to the Stunnel manual for further description of these configurations. I confirmed that GetKafka running on 1.server can consume messages through Stunnel. If I used a cert that is not configured in the certs file on 0.server, GetKafka got a timeout exception as follows:

2017-03-09 06:50:48,690 WARN [Timer-Driven Process Thread-5] o.apache.nifi.processors.kafka.GetKafka GetKafka[id=b0a21b5d-015a-1000-fbba-2648095ae625] Executor did not stop in 30 sec. Terminated.
2017-03-09 06:50:48,691 WARN [Timer-Driven Process Thread-5] o.apache.nifi.processors.kafka.GetKafka GetKafka[id=b0a21b5d-015a-1000-fbba-2648095ae625] Timed out after 60000 milliseconds while waiting to get connection
java.util.concurrent.TimeoutException: null
at java.util.concurrent.FutureTask.get(FutureTask.java:205) [na:1.8.0_121]
at org.apache.nifi.processors.kafka.GetKafka.onTrigger(GetKafka.java:348) ~[nifi-kafka-0-8-processors-1.1.2.jar:1.1.2]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-1.1.2.jar:1.1.2]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.2.jar:1.1.2]
Stunnel commands

# Install
sudo yum -y install stunnel
# Edit config
sudo vi /etc/stunnel/stunnel.conf
# Start
sudo stunnel
# Stop
sudo kill `cat /var/run/stunnel.pid`
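The stop command above relies on the pid file, so a small status helper can be handy. This is a sketch; the default pid file path is an assumption, so match it to the `pid` setting in your stunnel.conf:

```shell
# Return success if the process named in the pid file is alive.
stunnel_status() {
  pidfile="${1:-/var/run/stunnel.pid}"
  [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null
}

stunnel_status && echo "stunnel is running" || echo "stunnel is not running"
```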
Conclusion

Although Stunnel works with GetKafka and Kafka 0.8.x, I recommend using a newer version of Kafka and the ConsumeKafka NiFi processor with SSL if possible. As written in the Git comment, this workaround is complicated and does not scale well in terms of the administration tasks it requires.
03-05-2017
11:43 PM
Glad to hear you figured it out. Thanks!
03-05-2017
11:34 PM
1 Kudo
Hello @John Preston Yes, it's possible to return custom HTTP headers from HandleHttpResponse via dynamic properties. In the HandleHttpResponse processor's configuration tab, add a dynamic property by clicking the 'plus' sign, enter a name (which becomes the name of the custom HTTP header), and specify a NiFi Expression Language (EL) expression as its value. You can find more about EL in the NiFi Expression Language Guide. Test by sending an HTTP request via a cURL command:

$ curl -i localhost:8081
HTTP/1.1 200 OK
Date: Sun, 05 Mar 2017 23:26:00 GMT
NiFi FileName: 100866877045484
Transfer-Encoding: chunked
Server: Jetty(9.3.9.v20160517)
{"Result": "OK"}
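For reference, the dynamic property behind the 'NiFi FileName' header in the response above could look like the following (a sketch; the attribute used in the value is an assumption):

Property name : NiFi FileName
Property value: ${filename}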
Thanks, Koji
02-27-2017
11:41 PM
1 Kudo
Hello @Faruk Berksoz In the screenshot, TailFile's Task/Time is shown as 30 (times) in the last 5 min (300 secs). If you scheduled TailFile to run every 10 sec, the stats look correct. It seems TailFile is scheduled correctly, but no new lines have been found, so no FlowFiles are produced by TailFile. When new lines are added to the file being watched by TailFile, they will be picked up and passed to PublishKafka. If you're sure that new lines are being appended but you don't see any data ingested into NiFi, then please elaborate on the issue. Regards, Koji
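To check the TailFile side in isolation, you can append lines to the watched file by hand and confirm TailFile picks them up. A minimal sketch, assuming /tmp/app.log is the file TailFile is pointed at (the path is a placeholder):

```shell
LOGFILE=/tmp/app.log
# Append two new lines, the way an application writing logs would:
echo "first new line"  >> "$LOGFILE"
echo "second new line" >> "$LOGFILE"
# TailFile should emit a FlowFile for each; confirm the file grew:
tail -n 2 "$LOGFILE"
```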
02-20-2017
11:27 PM
Thanks for the update, glad to hear you managed to make it work!
02-20-2017
08:58 AM
Hi @Oliver Meyn As you suspected, NiFi Site-to-Site requires direct peer-to-peer communication. Your previous log suggests that the client was able to retrieve the remote cluster topology (server1.local and server2.local) by sending a request through the proxy, but it can't talk to those nodes directly. I haven't tried putting a reverse proxy such as HAProxy in front of NiFi Site-to-Site, because Site-to-Site handles load distribution by itself. If your client needs to go through a proxy server because of a restricted firewall or network, then I'd recommend using forward proxies with the HTTP transport protocol, as Bryan answered. Having said that, setting 'nifi.remote.input.host=proxy.local' in nifi.properties on each node in the remote cluster might work, since it introduces every node as 'proxy.local' for Site-to-Site communication, so further communication goes through the proxy. But I can't guarantee that this works. Thanks, Koji
02-16-2017
12:13 AM
@Ali Mohammadi Shanghoshabad I'm not aware of other processors for working with Solr; basically, if your local NiFi can't access the Solr instance in the sandbox VM, no other processor will work either. To access a NiFi running on the VM, you need to edit your local hosts file with the VM's IP address; please double-check that you are using the VM's IP address. The tutorial says:
After it boots up, find the IP address of the VM and add an entry into your machines hosts file. For example: 192.168.191.241 sandbox.hortonworks.com sandbox
02-15-2017
11:59 PM
Hi @AViradia, No, I haven't. But I just found this Stackoverflow question. If you're using ConsumeKafka, updating the processor to use a new consumer group id would be a simple workaround to reprocess from the beginning. http://stackoverflow.com/questions/37741936/how-to-delete-kafka-consumer-created-via-new-consumer-api
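As a command-line illustration of the same idea, a console consumer started with a brand-new group id re-reads the topic from the beginning. The topic and group names below are placeholders; in ConsumeKafka the equivalent is a new 'Group ID' with 'Offset Reset' set to 'earliest':

$ ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
    --topic my-topic --from-beginning \
    --consumer-property group.id=reprocess-group-1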
02-15-2017
01:08 AM
Hi @Ali Mohammadi Shanghoshabad, Thanks for the updates and details. I understand the situation. You are running NiFi locally, trying to connect to a Zookeeper at localhost:2181 that is actually running on the sandbox VM, with port 2181 exposed by port forwarding. After NiFi connects to Zk, Zk returns a Solr endpoint URL of "http://172.17.0.2:8983/solr/tweets_shard1_replica1/". But 172.17.0.2:8983 is not directly accessible from NiFi running on your local machine. The tutorial you're referring to is designed for a NiFi running inside the sandbox, which can be accessed at http://sandbox.hortonworks.com:9090/nifi. You also need to add an entry for sandbox.hortonworks.com to the hosts file on your local machine, as written in the tutorial. A possible solution would be to alter the VM network configuration, but that can be difficult since the sandbox is basically designed on the assumption that everything runs within itself.