Connecting to DataSift HTTPS API using NiFi GetHTTP

Expert Contributor

Hi all

Is it possible to use the GetHTTP processor in NiFi to connect to the DataSift streaming API and receive live streaming data? I have used GetHTTP for an HTTP API, but for HTTPS we need an SSL context plus a username and password. Any ideas on how to connect to an HTTPS URL with NiFi?

1 ACCEPTED SOLUTION

Rising Star

You will need to create and configure an SSLContextService for the processor to use so that it can establish trust with the certificate being presented by the DataSift service. curl works because it ties into the default system truststore for you.

To get an experience similar to curl on the command line, you will need to configure the truststore properties of your SSL Context Service instance with:

  • Truststore Filename: the cacerts file from your Java installation
    • If $JAVA_HOME is set on your system, it should point you in the right direction. If not, the location of cacerts varies by environment, but is approximately the following for each OS:
      • OS X: /Library/Java/JavaVirtualMachines/jdk<version>.jdk/Contents/Home/jre/lib/security/cacerts
      • Windows: C:\Program Files\Java\jdk<version>\jre\lib\security\cacerts
      • Linux: /usr/lib/jvm/java-<version>/jre/lib/security/cacerts. You can additionally use $(readlink -f $(which java)) to resolve where the java binary actually lives.
  • Truststore Type: JKS
  • Truststore Password: the default password, "changeit", if you are using the default Java truststore (cacerts)
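
If you would rather not hunt for the file by hand, the lookup described above can be sketched in a few lines of Python. This is only an illustration, not part of NiFi: the function names are made up for this example, and the second candidate path (lib/security/cacerts directly under the Java home, as found in a JRE home or in newer JDK layouts) is an addition beyond the paths listed above.

```python
import os
from pathlib import Path


def cacerts_candidates(java_home):
    """Return the places cacerts usually lives under a Java home directory."""
    home = Path(java_home)
    return [
        home / "jre" / "lib" / "security" / "cacerts",  # JDK layout listed above
        home / "lib" / "security" / "cacerts",          # assumption: JRE home or newer JDK layout
    ]


def find_cacerts(java_home=None):
    """Return the first existing cacerts path, or None if nothing is found."""
    java_home = java_home or os.environ.get("JAVA_HOME")
    if not java_home:
        return None
    for candidate in cacerts_candidates(java_home):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_cacerts()
    print(path if path else "cacerts not found; is JAVA_HOME set?")
```

Whatever path this prints is the value to paste into Truststore Filename.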

Once this controller service is created and enabled, the associated GetHTTP processor will need to be updated to reference it.


10 REPLIES


Hi @Aldrin Piri, @omer alvi, I am using GetHTTP over HTTPS to ingest data from a Facebook page and send it to PutHDFS. The roadblock is that once the data from that page has been ingested and I stop the processor, that particular URL is not hit again when I restart it. What modifications can I make to my URL so that data ingestion runs continuously?

This is a sample of the URL I am currently using:

https://graph.facebook.com/v2.11/542678889114683/?fields=name,likes,posts&access_token="my_access_token"&limit=100