
Setting up NiFi to work with public GetHTTP via HTTPS

Explorer

I am setting up NiFi to be able to fetch data from public websites using HTTPS.

This is a very painful process because of the keystore. The host OS keeps all of its certificate authority (CA) certificates in one file:

/usr/share/pki/ca-trust-source/ca-bundle.trust.crt

It's a massive file containing all of the public CA certificates. How do I import that one very large file into the keystore?

I have our own internal self-signed CA working with the keystore and NiFi, but not the standard public CAs.

6 REPLIES

Cloudera Employee

You should be able to import the Java ca-bundle into NiFi's truststore using:

keytool -importkeystore -srckeystore /usr/share/pki/ca-trust-source/ca-bundle.trust.crt -destkeystore nifi/conf/truststore.jks

You can also import specific CA certificates one at a time to limit the number of remote servers that NiFi trusts.

Explorer

keytool error: java.io.IOException: Invalid keystore format

The format of /usr/share/pki/ca-trust-source/ca-bundle.trust.crt is not compatible with any keystore format. It is some OpenSSL-specific format, not a Java keystore: a mash of 200 keys in one file.


I think it requires some manipulation with openssl first. But even when I use openssl, it only grabs 1 of the 200 keys.
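One way around the one-certificate limitation is to split the bundle into one file per certificate and import each file in turn. The sketch below assumes `awk`, `openssl`, and `keytool` are on the PATH. The function names, the `cert-NNN` aliases, and the `/tmp/ca-split` directory are made up for illustration. It also assumes that `openssl x509` re-emits a `TRUSTED CERTIFICATE` block as a plain `CERTIFICATE` block that keytool can read, which is worth confirming on a single file first.

```shell
#!/bin/sh
# Split a concatenated PEM bundle into one file per certificate.
# Lines outside the BEGIN/END markers (comments, labels) are dropped.
split_pem_bundle() {
  bundle="$1"
  outdir="$2"
  mkdir -p "$outdir"
  awk -v dir="$outdir" '
    /-----BEGIN [A-Z ]*CERTIFICATE-----/ { n++; out = sprintf("%s/cert-%03d.pem", dir, n) }
    out != "" { print > out }
    /-----END [A-Z ]*CERTIFICATE-----/ { close(out); out = "" }
  ' "$bundle"
}

# Import every split certificate into a truststore as a trusted CA.
import_split_certs() {
  dir="$1"
  store="$2"
  pass="$3"
  for pem in "$dir"/cert-*.pem; do
    [ -f "$pem" ] || continue
    alias_name=$(basename "$pem" .pem)
    # Normalise to a plain CERTIFICATE block before handing it to keytool.
    openssl x509 -in "$pem" -out "$dir/$alias_name.plain.crt" || continue
    keytool -importcert -noprompt -trustcacerts \
      -alias "$alias_name" -file "$dir/$alias_name.plain.crt" \
      -keystore "$store" -storepass "$pass"
  done
}

# Usage (paths and password are examples only):
#   split_pem_bundle /usr/share/pki/ca-trust-source/ca-bundle.trust.crt /tmp/ca-split
#   import_split_certs /tmp/ca-split nifi/conf/truststore.jks changeit
```

Importing everything trusts every CA in the bundle, so it may be preferable to copy only the needed `cert-NNN.pem` files into a separate directory before running the import step.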

Cloudera Employee

Hi @Erik Anderson

You have a few options.

You can use the standard JDK cacerts file, which is already in JKS format. Simply configure the SSLContextService to use the Java default cacerts file as the truststore. Its location varies by operating system. My guess is you're using Red Hat, so it may be at /etc/pki/java/cacerts. On my Mac it was /Library/Java/JavaVirtualMachines/jdk1.8.0_172.jdk/Contents/Home/jre/lib/security/cacerts.
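As a rough sketch of how to track down that file, the helper below checks a few common locations and prints the first one that exists. The function name `find_cacerts` is hypothetical, and the candidate list is an assumption based on the paths mentioned above plus the usual `$JAVA_HOME` layouts; adjust it for your JDK.

```shell
#!/bin/sh
# Print the first cacerts file found among a few common locations.
find_cacerts() {
  for candidate in \
    "${JAVA_HOME:-}/lib/security/cacerts" \
    "${JAVA_HOME:-}/jre/lib/security/cacerts" \
    /etc/pki/java/cacerts
  do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage: list the first few entries to confirm the file is readable.
# "changeit" is the usual default password; yours may differ.
#   keytool -list -keystore "$(find_cacerts)" -storepass changeit | head
```

The printed path would then go into the Truststore Filename property of the StandardSSLContextService, with the Truststore Type set to JKS and the matching password.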

I wasn't able to get hold of the same ca-bundle.trust.crt you are using to test it out. If it's a PEM format, you can try using this utility to convert a multi-part PEM into a JKS: https://github.com/use-sparingly/keyutil.

I did the following:

wget https://curl.haxx.se/ca/cacert.pem
wget https://github.com/use-sparingly/keyutil/releases/download/0.4.0/keyutil-0.4.0.jar
java -jar keyutil-0.4.0.jar -i --new-keystore myTrustStore.jks --password changeit --import-pem-file cacert.pem

I then pointed my SSLContextService at myTrustStore.jks and was able to pull down content from HTTPS sites. Note that I did not verify which CAs are contained in haxx.se/ca/cacert.pem, so check that before using it on your own server.
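A minimal way to do that verification is to compare the number of certificates in the PEM file with what ended up in the truststore, and to read through the certificate owners. The two helper names below are hypothetical; the file names and password are the ones from the commands above.

```shell
#!/bin/sh
# Count the certificate blocks in a PEM bundle.
count_pem_certs() {
  grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Print one "Owner:" line per certificate stored in a keystore.
list_truststore_owners() {
  keytool -list -v -keystore "$1" -storepass "$2" | grep '^Owner:'
}

# Usage:
#   count_pem_certs cacert.pem
#   list_truststore_owners myTrustStore.jks changeit | wc -l
#   list_truststore_owners myTrustStore.jks changeit | less
```

If the two counts differ, some certificates were probably skipped or rejected during the import and are worth a closer look.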

You can also try using InvokeHTTP which has more advanced functionality than GetHTTP.

Let me know if this helps.

Explorer

I seem to have hit a brick wall.

It seems the NiFi HTTP processors (InvokeHTTP and GetHTTP) are not happy with wildcard certificates. That is, if a CA issues a certificate with a wildcard common name like the one below:

* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
* subject: CN=*.weathersource.com,OU=PositiveSSL Wildcard,OU=Domain Control Validated
* start date: Aug 15 00:00:00 2018 GMT
* expire date: Aug 24 23:59:59 2020 GMT
* common name: *.weathersource.com

The processors won't work. Here are the errors that are thrown:

2018-10-19 16:59:29,190 ERROR [Timer-Driven Process Thread-1] o.a.nifi.processors.standard.InvokeHTTP InvokeHTTP[id=8cfcd96c-0166-1000-ea61-261c1a449c0e] Yielding processor due to exception encountered as a source processor: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure: {}
javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure
    at com.ibm.jsse2.k.a(k.java:15)
    at com.ibm.jsse2.k.a(k.java:23)
    at com.ibm.jsse2.av.b(av.java:343)
    at com.ibm.jsse2.av.a(av.java:981)
    at com.ibm.jsse2.av.i(av.java:869)
    at com.ibm.jsse2.av.a(av.java:19)
    at com.ibm.jsse2.av.startHandshake(av.java:672)
    at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281)
    at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251)
    at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151)
    at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195)
    at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
    at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)
    at okhttp3.RealCall.execute(RealCall.java:69)
    at org.apache.nifi.processors.standard.InvokeHTTP.onTrigger(InvokeHTTP.java:791)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:522)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:319)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:191)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.lang.Thread.run(Thread.java:811)

NiFi should have used more of the host OS's environment to handle these things. Industry-standard tools like curl have no issues. Am I missing something? Do I need to write custom processors for simple HTTP GETs and POSTs because NiFi isn't friendly with enterprise environments?

IMO, NiFi is trying to be its own monolithic environment, but it should use more of the underlying host OS to get around these corporate-specific problems.

Erik

Cloudera Employee

Hi Erik,

I was able to use the v1.7.0 and v1.8.0 NiFi versions of InvokeHTTP to access wildcard CNs, and I was not able to reproduce your error. My test configuration was a self-signed certificate with a wildcard CN of CN=*.natohorton.com in a JKS, and a simple 'Hello World' HTTPS page hosted with Flask. I then pointed the StandardSSLContextService in NiFi to this JKS as the truststore and pointed InvokeHTTP to https://test.natohorton.com:5000. InvokeHTTP was able to retrieve the page.
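For anyone who wants to set up a similar test, here is a minimal sketch of creating a self-signed wildcard certificate and a JKS truststore that contains it. This is not necessarily how the certificate above was produced. The function names, the `server-key.pem` and `server-cert.pem` file names, the `test-wildcard` alias, and the 30-day lifetime are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Build the OpenSSL subject string for a wildcard certificate.
wildcard_subject() {
  printf '/CN=*.%s\n' "$1"
}

# Create a self-signed wildcard certificate (server-cert.pem, server-key.pem)
# and a JKS truststore that trusts it.
make_wildcard_truststore() {
  domain="$1"
  store="$2"
  pass="$3"
  openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -subj "$(wildcard_subject "$domain")" \
    -keyout server-key.pem -out server-cert.pem &&
  keytool -importcert -noprompt -alias test-wildcard \
    -file server-cert.pem -keystore "$store" \
    -storetype JKS -storepass "$pass"
}

# Usage (the password must be at least 6 characters):
#   make_wildcard_truststore natohorton.com test-truststore.jks changeit
```

The HTTPS test server would then be started with `server-cert.pem` and `server-key.pem`, and the StandardSSLContextService pointed at the generated truststore.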

Are you able to provide more details about your error for me to replicate?

As for relying more on the OS, my impression is that NiFi is designed to be cross-platform and portable, and to avoid compatibility problems when the underlying operating system changes. Calling Java interfaces directly should also perform better than frequently launching external processes. NiFi was intended to handle large data volumes efficiently, to be extensible, and to run on multiple operating systems, while still offering a simple interface for users with a wide range of technical skills. That combination of requirements likely steered the design away from relying on operating system utilities like curl.

Someone else might be able to comment more about those design decisions.

Thanks,
Nathan

Explorer

A newer version of the JRE fixed the above problem. It all seems to be working now.
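Since a JRE upgrade resolved it, one plausible reading is that the wildcard certificate was not the cause, and that the older JRE and the server simply had no TLS protocol version or cipher suite in common, which is a typical trigger for a handshake_failure alert. A rough way to check this is to see what the server negotiates with openssl and compare that with what the JRE running NiFi supports. The helper name `summarize_handshake` and the hostname in the usage lines are hypothetical.

```shell
#!/bin/sh
# Reduce `openssl s_client` output (read from stdin) to the negotiated
# protocol and cipher, taken from its "New, <protocol>, Cipher is <cipher>" line.
summarize_handshake() {
  sed -n 's/^New, \([^,]*\), Cipher is \(.*\)$/protocol=\1 cipher=\2/p'
}

# Usage (replace the hostname with the server NiFi is calling):
#   openssl s_client -connect api.example.com:443 -servername api.example.com \
#     </dev/null 2>/dev/null | summarize_handshake
#   java -version   # run with the same JRE that NiFi uses
```

Note that openssl prints cipher names in its own style, for example ECDHE-RSA-AES256-GCM-SHA384 for what Java calls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384.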