Created on 07-28-2016 03:41 PM - edited 08-17-2019 11:09 AM
HDF Overview
Overview
Hortonworks DataFlow (HDF), powered by Apache NiFi, Kafka and Storm, collects, curates, analyzes and delivers real-time data from the Internet of Anything (IoAT) to data stores both on-premises and in the cloud. This is a quick guide to installing Apache NiFi on an AWS EC2 instance. Please use it as a supplement to the official Hortonworks HDF documentation.
Prerequisites
Before you install Apache NiFi on AWS, make sure that:
- You have an AWS account. (https://aws.amazon.com/)
- You have an Amazon key pair to access the EC2 instance that runs the HDP platform. (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pai...
Installation Steps
The screenshots in this section detail the setup and configuration of Apache NiFi on an EC2 instance.
Refer to the NiFi Admin Guide for the system requirements. This document covers installation on a Red Hat Linux (64-bit) EC2 instance.
Log in to AWS and launch an EC2 instance with the OS of your choice (make sure the selected OS is supported by NiFi). This exercise uses the Red Hat Enterprise Linux 7.2 image (HDF EC2 Instance).
Keep the private key of your key pair safe. Under the Network and Security configuration, open the HTTP ports (e.g. 8081 and 8082, shown below) that are used to access the NiFi web interface and by the site-to-site protocol to exchange data between multiple NiFi instances.
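If you prefer the command line to the console, the same ports can be opened with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and configured; `SG_ID` and `ALLOWED_CIDR` are hypothetical placeholders for your own security group and client address range. It runs in dry-run mode by default and only prints the commands.

```shell
#!/bin/sh
# Hypothetical values: replace with your own security group and address range.
SG_ID=${SG_ID:-sg-0123456789abcdef0}
ALLOWED_CIDR=${ALLOWED_CIDR:-203.0.113.0/24}   # documentation-only range
DRY_RUN=${DRY_RUN:-1}                          # set to 0 to really call aws

# open_port PORT: allow inbound TCP traffic on PORT from ALLOWED_CIDR.
open_port() {
    port=$1
    set -- aws ec2 authorize-security-group-ingress \
        --group-id "$SG_ID" --protocol tcp --port "$port" --cidr "$ALLOWED_CIDR"
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"      # print the command instead of running it
    else
        "$@"
    fi
}

# The two example ports used in this article.
for p in 8081 8082; do
    open_port "$p"
done
```

Restricting the source range to your own network, rather than opening the ports to everyone, is usually the safer choice for an unsecured NiFi instance.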
- Download HDF from the HDF Download Page. You can either download it directly onto your EC2 instance or upload the zip file from your local machine using scp,
e.g. scp -i HDF.pem HDF-1.2.0.1-1.zip ec2-user@<public-dns-hostname>:/home/ec2-user
where HDF.pem is the private key.
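One common stumbling block with the upload step is key file permissions: ssh and scp ignore a private key that other users can read. A small sketch of a helper (the name `prepare_key` is made up for illustration) that tightens the permissions before you run scp:

```shell
#!/bin/sh
# prepare_key KEYFILE: make the private key readable by its owner only.
prepare_key() {
    [ -f "$1" ] || { echo "key not found: $1" >&2; return 1; }
    chmod 400 "$1"
}

# Example, with the same placeholders as the scp line above:
#   prepare_key HDF.pem
#   scp -i HDF.pem HDF-1.2.0.1-1.zip ec2-user@<public-dns-hostname>:/home/ec2-user
#   ssh -i HDF.pem ec2-user@<public-dns-hostname>
```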
- Make sure the latest Java and unzip are installed on the EC2 instance:
sudo yum install unzip
sudo yum install java
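To confirm that both packages actually landed on the PATH, a quick check such as the following can help. This is only a sketch; `check_tools` is an illustrative helper name, not part of HDF or NiFi.

```shell
#!/bin/sh
# check_tools NAME...: report which of the named commands are on the PATH.
# Returns non-zero if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# On the EC2 instance, after the yum installs above
# (yum accepts -y to skip the confirmation prompt):
#   check_tools unzip java && java -version
```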
- Unzip the archive into the desired installation directory.
- Make the desired edits to the nifi.properties file under <install_dir>/nifi/conf.
e.g. update the site-to-site properties to include the following:
nifi.remote.input.socket.host=<public_dns_hostname>
nifi.remote.input.socket.port=8082
nifi.remote.input.secure=false
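If you set up more than one instance, editing the file by hand gets tedious. The sketch below shows one way to script the edit; `set_prop` is a hypothetical helper written for this example, and it assumes plain `key=value` lines as in nifi.properties. It replaces the line for an existing key and appends the line if the key is absent.

```shell
#!/bin/sh
# set_prop FILE KEY VALUE: set KEY=VALUE in a properties file.
# Note: awk interprets backslash escapes in VALUE, so keep values simple
# (hostnames, ports, true/false).
set_prop() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    awk -F= -v k="$key" -v v="$value" '
        $1 == k { print k "=" v; found = 1; next }   # replace existing key
        { print }                                    # keep other lines as-is
        END { if (!found) print k "=" v }            # append if key was absent
    ' "$file" > "$tmp" && cat "$tmp" > "$file"       # cat keeps file permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example, using the properties from this article:
#   conf=<install_dir>/nifi/conf/nifi.properties
#   set_prop "$conf" nifi.remote.input.socket.host <public_dns_hostname>
#   set_prop "$conf" nifi.remote.input.socket.port 8082
#   set_prop "$conf" nifi.remote.input.secure false
```

Restart NiFi after changing nifi.properties so that the new values are picked up.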
- From the <install_dir>/nifi/bin directory, run any of the following commands as ./nifi.sh <command>:
- start: starts NiFi in the background
- stop: stops NiFi that is running in the background
- status: provides the current status of NiFi
- run: runs NiFi in the foreground and waits for a Ctrl-C to initiate shutdown of NiFi
- install: installs NiFi as a service that can then be controlled via
  - service nifi start
  - service nifi stop
  - service nifi status
- The following screenshots display NiFi running on the EC2 instance with a sample dataflow.
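The nifi.sh commands listed in the steps above can be wrapped in a small function so that you do not have to change into the bin directory each time. This is a sketch only: `nifi_ctl` is an illustrative name and the default `NIFI_HOME` is a hypothetical path that you should point at your own <install_dir>/nifi.

```shell
#!/bin/sh
NIFI_HOME=${NIFI_HOME:-/opt/hdf/nifi}   # hypothetical install location

# nifi_ctl COMMAND: run bin/nifi.sh with one of the commands from this article.
nifi_ctl() {
    case "$1" in
        start|stop|status|run|install) ;;
        *)
            echo "usage: nifi_ctl {start|stop|status|run|install}" >&2
            return 2
            ;;
    esac
    if [ ! -x "$NIFI_HOME/bin/nifi.sh" ]; then
        echo "nifi.sh not found under $NIFI_HOME/bin" >&2
        return 1
    fi
    "$NIFI_HOME/bin/nifi.sh" "$1"
}

# Example:
#   nifi_ctl start
#   nifi_ctl status
```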
Benefits
- Running a NiFi instance in AWS provides an easy-to-use, flexible and cost-effective dataflow management solution in the cloud.
- NiFi is a reliable, secure and scalable solution that also benefits from AWS's mature infrastructure.
- Using the NiFi site-to-site protocol eliminates the need to run software in the DMZ when exchanging data between on-premises systems and the cloud.
Document References
NiFi System Admin Guide:
Created on 09-08-2016 10:21 PM
Hi Milind - great overview.
Any recommendations around the type of instance we should use?
Created on 07-10-2017 05:11 PM
Is this article still valid for HDF version 3.0 which was released recently? Are there easier ways of deploying to Amazon?