Running NiFi outside a secured Hadoop Cluster
Labels: Apache NiFi
Created ‎10-19-2016 05:06 PM
I have a NiFi host doing ETL processing outside the Hadoop cluster. The cluster is secured using Knox/Ranger, and the only open ports are SSH to the Hadoop edge nodes and the Kafka queue. What are the best options for writing data into either HBase or Hive? The ideas I have are:
- Deploy a NiFi instance inside the cluster and use site-to-site (requires opening a firewall port)
- From NiFi, write to the Kafka queue, and inside the cluster write a Java process that pulls from the queue and writes the data to the target (HBase or Hive)
- Any other suggestions?
Created ‎10-19-2016 05:53 PM
Option 1 seems fine if you are able to open the firewall port.
In option 2, rather than writing a Java process, you could run a NiFi instance inside the secure cluster, use ConsumeKafka to consume the messages, and then use the appropriate follow-on processors (PutHDFS, PutHiveQL, PutHBaseJSON, etc.). You still use Kafka as the gateway into the cluster, but you don't have to write any custom code.
Created ‎10-19-2016 06:03 PM
I like Bryan's suggestion. That's a good model for IoT as well, with remote nodes messaging in. You could send messages between the NiFi outside the cluster and a NiFi inside the secure cluster via MQTT, JMS, Kafka, or Site-to-Site. Then there is just one port and one controlled set of IPs communicating with each other: an IoT or security gateway.
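For anyone weighing the custom-process route from option 2 against running NiFi inside the cluster, here is a rough sketch of what that consumer loop amounts to. It is only an illustration under stated assumptions: a standard-library `queue.Queue` stands in for the Kafka topic, a plain dict stands in for the HBase table, and the message layout (a JSON object with an `id` field used as the row key) is hypothetical. A real process would use a Kafka client and an HBase or Hive client instead, and would also need retries, offset handling, and Kerberos setup, which is much of what ConsumeKafka and the Put* processors already take care of.

```python
import json
import queue


def drain_to_store(source, store, max_messages=100):
    """Pull up to max_messages JSON messages from source and upsert them into store.

    source: queue.Queue of JSON strings (stand-in for a Kafka topic).
    store:  dict mapping row key -> dict of columns (stand-in for an HBase table).
    Returns the number of records written.
    """
    written = 0
    for _ in range(max_messages):
        try:
            raw = source.get_nowait()
        except queue.Empty:
            break  # nothing left to consume in this pass

        try:
            record = json.loads(raw)
        except ValueError:
            continue  # skip messages that are not valid JSON

        # Hypothetical message layout: a JSON object with an "id" field.
        if not isinstance(record, dict) or "id" not in record:
            continue

        row_key = str(record.pop("id"))
        # Merge the remaining fields into the row, like an upsert.
        store.setdefault(row_key, {}).update(record)
        written += 1
    return written


if __name__ == "__main__":
    inbound = queue.Queue()
    inbound.put('{"id": "sensor-1", "temp": 21.5}')
    inbound.put("not json")
    inbound.put('{"id": "sensor-1", "humidity": 40}')

    table = {}
    count = drain_to_store(inbound, table)
    print(count, table)
```

Either way the shape is the same: the outside NiFi only ever talks to the one open port (Kafka), and whatever drains the topic lives inside the secured cluster.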
