Member since: 10-09-2015 | Posts: 86 | Kudos Received: 179 | Solutions: 8
01-25-2017
06:49 PM
@Yumin Dong Which version of HDF/NiFi are you using? If it is the latest one, I hope you downloaded the latest version of the dependencies. Let me know. Jobin
01-25-2017
04:28 AM
6 Kudos
Introduction

By integrating with LDAP, username/password authentication can be enabled in NiFi. This tutorial provides step-by-step instructions to set up NiFi LDAP authentication via Ambari (using the Knox demo LDAP server).

Prerequisite

1) This assumes you already have HDF-2.x installed on your VM/server, with Ambari and NiFi up and running without security. If not, I would recommend the "Ease of Deployment" section of this article to install it. [You can also follow this article for an automated installation of an HDF cluster, or refer to hortonworks.com for detailed steps.]

Setting up the Demo LDAP Server

1) As HDF and HDP cannot co-exist on a single node, let's download the Knox zip file from Apache, which makes it easy to set up an LDAP server for this tutorial. Execute the steps below after establishing SSH connectivity to the VM/server (the name of my host is node1):

# ssh node1
# mkdir /opt/knox/
# cd /opt/knox/
# wget http://mirror.cogentco.com/pub/apache/knox/0.11.0/knox-0.11.0.zip
# unzip knox-0.11.0.zip
# /opt/knox/knox-0.11.0/bin/ldap.sh start

2) Make sure the LDAP server is started and running on port 33389 on your server:

# lsof -i:33389
OR
# netstat -anp | grep 33389

3) The credentials below are part of the Knox demo LDAP server we just started. We can use any of these users to log in to NiFi after the integration:

tom/tom-password
admin/admin-password
sam/sam-password
guest/guest-password

Configuring NiFi for LDAP Authentication via Ambari

1. Log in to the Ambari UI, click on the NiFi service, then click on the Config tab, expand the "Advanced nifi-ambari-ssl-config" section, and update the configuration as below:

Initial Admin Identity : uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
Enable SSL? : {click check box}
Key password : hadoop
Keystore password : hadoop
Keystore type : JKS

2. Enter the below as the truststore and DN configurations:

Truststore password : hadoop
Truststore type : JKS
NiFi CA DN prefix : CN=
NiFi CA DN suffix : , OU=NIFI

3. Provide the configuration below for the node identities and keystore details:

NiFi CA Force Regenerate? : {click check box}
NiFi CA Token : hadoop
Node Identities :
<property name="Node Identity 1">CN=node1, OU=NIFI</property>
Tip: If I have a 3-node cluster with node1, node2 and node3 as part of it, the above configuration looks like this:

<property name="Node Identity 1">CN=node1, OU=NIFI</property>
<property name="Node Identity 2">CN=node2, OU=NIFI</property>
<property name="Node Identity 3">CN=node3, OU=NIFI</property>

4. In the Ambari UI, choose the NiFi service and select the Config tab. We have to update two sets of properties. In the "Advanced nifi-properties" section, set nifi.security.user.login.identity.provider to ldap-provider:

nifi.security.user.login.identity.provider=ldap-provider
5. Now in the "Advanced nifi-login-identity-providers-env" section, update the "Template for login-identity-providers.xml" property with the configuration below, placed just above </loginIdentityProviders>:

<provider>
<identifier>ldap-provider</identifier>
<class>org.apache.nifi.ldap.LdapProvider</class>
<property name="Authentication Strategy">SIMPLE</property>
<property name="Manager DN">uid=admin,ou=people,dc=hadoop,dc=apache,dc=org</property>
<property name="Manager Password">admin-password</property>
<property name="TLS - Keystore">/usr/hdf/current/nifi/conf/keystore.jks</property>
<property name="TLS - Keystore Password">hadoop</property>
<property name="TLS - Keystore Type">JKS</property>
<property name="TLS - Truststore">/usr/hdf/current/nifi/conf/truststore.jks</property>
<property name="TLS - Truststore Password">hadoop</property>
<property name="TLS - Truststore Type">JKS</property>
<property name="TLS - Client Auth"></property>
<property name="TLS - Protocol">TLS</property>
<property name="TLS - Shutdown Gracefully"></property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url">ldap://node1:33389</property>
<property name="User Search Base">ou=people,dc=hadoop,dc=apache,dc=org</property>
<property name="User Search Filter">uid={0}</property>
<property name="Authentication Expiration">12 hours</property>
</provider>

6. Once all properties are updated, click Save, and when prompted, click Restart.

7. Once restarted, try connecting to the NiFi URL. You should see the login screen; enter the credentials below for the admin user (the configured Initial Admin Identity) and click LOG IN:

https://node1:9091/nifi/ --> in my case the host is node1

admin/admin-password

8. You should now be able to log in as the admin user for NiFi and see the UI.

Adding a User and Providing Access to the UI

1) Let us go ahead and create a user jobin in LDAP so that we can give him access to the NiFi UI.

2) Edit the users.ldif file in the knox/conf directory with the entry below and restart the server:

# vi /opt/knox/knox-0.11.0/conf/users.ldif

Add the below entry to the end of the file:

# entry for sample user jobin
dn: uid=jobin,ou=people,dc=hadoop,dc=apache,dc=org
objectclass:top
objectclass:person
objectclass:organizationalPerson
objectclass:inetOrgPerson
cn: jobin
sn: jobin
uid: jobin
userPassword:jobin-password

3) Once added, let's stop and start the LDAP server:

# /opt/knox/knox-0.11.0/bin/ldap.sh stop
# /opt/knox/knox-0.11.0/bin/ldap.sh start

4) While logged in as admin on the NiFi UI, let us add a user jobin by clicking the '+ user' button in the 'users' menu at the top right, with the ID below:

uid=jobin,ou=people,dc=hadoop,dc=apache,dc=org

Enter the above value and click OK.

5) Now close the users window and open the 'policies' window from the management menu at the top right corner, below the 'users' menu. Click the '+ user' button at the top right, enter jobin in the pop-up, select the user, and click OK.

6) The policy will be listed once it is added.

7) Now you may log out as admin and use the credentials below to log in as the 'jobin' user:

jobin/jobin-password

8) You should be able to log in and view the UI, but you won't have the privilege to add anything to the canvas (as jobin is given only read access). You may log back in as admin and grant the required access.
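The user-addition steps above can also be scripted. Here is a minimal sketch that appends a similar LDIF entry for a hypothetical user alice to a scratch copy of users.ldif (the /tmp path and the username are made up; point LDIF at /opt/knox/knox-0.11.0/conf/users.ldif to use it for real, then restart the LDAP server):

```shell
#!/bin/sh
# Sketch: append an LDIF entry for a new demo user.
# The file path and the username "alice" are examples, not from the tutorial.
LDIF=/tmp/users.ldif
USER=alice

cat >> "$LDIF" <<EOF
# entry for sample user $USER
dn: uid=$USER,ou=people,dc=hadoop,dc=apache,dc=org
objectclass: top
objectclass: person
objectclass: organizationalPerson
objectclass: inetOrgPerson
cn: $USER
sn: $USER
uid: $USER
userPassword: $USER-password
EOF

# Show the entry we just added
grep "uid=$USER" "$LDIF"
```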
This completes the tutorial. You have successfully:

- Installed and configured HDF 2.0 on your server.
- Downloaded and started the Knox demo LDAP server.
- Configured NiFi to use the Knox LDAP server to authenticate users, where the NiFi initial admin is from LDAP.
- Restarted NiFi and verified access for the admin user in the NiFi UI.
- Created a new user jobin in LDAP, added him to the NiFi user list, and gave him read access.
- Verified access for user jobin.

Thanks, Jobin George
01-25-2017
12:12 AM
4 Kudos
Introduction
Using NiFi, data can be exposed in such a way that a receiver can pull from it, by adding an Output Port to the root process group. For Storm, we will use this same mechanism: the Site-to-Site protocol to pull data from NiFi's Output Ports. In this tutorial we learn to capture the NiFi app log from the Sandbox, parse it using Java regex, and ingest it into Phoenix via Storm, or directly using the NiFi PutSQL processor.
Prerequisites

1) This assumes you already have the latest version of NiFi-1.x/HDF-2.x downloaded as an archive onto your HW Sandbox version 2.5 (HDF and HDP cannot be managed by Ambari on the same nodes as of now). If not, execute the below after SSH connectivity to the sandbox is established:

# cd /opt/
# wget http://public-repo-1.hortonworks.com.s3.amazonaws.com/HDF/centos6/2.x/updates/2.0.1.0/HDF-2.0.1.0-centos6-tars-tarball.tar.gz
# tar -xvf HDF-2.0.1.0-12.tar.gz

2) Storm and Zeppelin are installed on your VM and started.

3) HBase is installed with the Phoenix Query Server.

4) Make sure Maven is installed; if not already, execute the steps below:

# curl -o /etc/yum.repos.d/epel-apache-maven.repo https://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo
# yum -y install apache-maven
# mvn -version
Configuring and Creating Tables in HBase via Phoenix

1) Make sure the HBase components as well as the Phoenix Query Server are started.

2) Make sure HBase is up and running and out of maintenance mode, and that the properties below are set (if not, set them and restart the services):

- Enable Phoenix --> Enabled
- Enable Authorization --> Off

3) Connect to the Phoenix shell (or use Zeppelin):

# /usr/hdp/current/phoenix-client/bin/sqlline.py sandbox.hortonworks.com:2181:/hbase-unsecure

4) Execute the below in the Phoenix shell to create the tables in HBase:

CREATE TABLE NIFI_LOG( UUID VARCHAR NOT NULL, EVENT_DATE DATE, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));

CREATE TABLE NIFI_DIRECT( UUID VARCHAR NOT NULL, EVENT_DATE VARCHAR, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));
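One Phoenix-specific detail worth noting here: Phoenix has no INSERT statement; rows are written with UPSERT. That is why, later in the flow, a ReplaceText processor rewrites the SQL generated by ConvertJSONToSQL. The rewrite itself is trivial; as a shell illustration on a made-up statement:

```shell
#!/bin/sh
# Sketch: the INSERT -> UPSERT rewrite Phoenix requires.
# The SQL statement here is made up for illustration.
SQL='INSERT INTO NIFI_DIRECT (UUID, CONTENT) VALUES (?, ?)'
printf '%s\n' "$SQL" | sed 's/^INSERT/UPSERT/'
# -> UPSERT INTO NIFI_DIRECT (UUID, CONTENT) VALUES (?, ?)
```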
Configuring and Starting NiFi

1) Open nifi.properties to update the configuration:

# vi /opt/HDF-2.0.1.0/nifi/conf/nifi.properties

2) Change the NiFi HTTP port to 9090, as the default 8080 will conflict with the Ambari web UI:

# web properties #
nifi.web.http.port=9090

3) Configure the NiFi instance for site-to-site by changing the configuration below: add a port, say 8055, and set "nifi.remote.input.secure" to "false":

# Site to Site properties #
nifi.remote.input.socket.port=8055
nifi.remote.input.secure=false

4) Now start NiFi on your Sandbox (restart if it is already running, for the configuration change to take effect):

# /opt/HDF-2.0.1.0/nifi/bin/nifi.sh start

5) Make sure NiFi is up and running by connecting to its web UI from your browser:

http://your-vm-ip:9090/nifi/
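The property edits above can also be scripted. Below is a minimal sketch that applies the same changes with sed against a scratch copy of the file (the /tmp path and the stub file contents are illustrative, so the snippet is runnable anywhere; point CONF at /opt/HDF-2.0.1.0/nifi/conf/nifi.properties to use it for real):

```shell
#!/bin/sh
# Sketch: script the nifi.properties edits from steps 2-3.
# /tmp/nifi.properties is a scratch stand-in for the real config file.
CONF=/tmp/nifi.properties
cat > "$CONF" <<'EOF'
nifi.web.http.port=8080
nifi.remote.input.socket.port=
nifi.remote.input.secure=true
EOF

sed -e 's/^nifi\.web\.http\.port=.*/nifi.web.http.port=9090/' \
    -e 's/^nifi\.remote\.input\.socket\.port=.*/nifi.remote.input.socket.port=8055/' \
    -e 's/^nifi\.remote\.input\.secure=.*/nifi.remote.input.secure=false/' \
    "$CONF" > "$CONF.tmp" && mv "$CONF.tmp" "$CONF"

cat "$CONF"
```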
Building a Flow in NiFi to Fetch and Parse nifi-app.log

1) Let us build a small flow on the NiFi canvas to read the app log generated by NiFi itself and feed it to Storm.

2) Drop a "TailFile" processor onto the canvas to read lines added to "/opt/HDF-2.0.1.0/nifi/logs/nifi-user.log". Auto-terminate the Failure relationship.

3) Drop a "SplitText" processor onto the canvas to split the log file into separate lines. Auto-terminate the Original and Failure relationships for now. Connect the TailFile processor to the SplitText processor for the Success relationship.

4) Drop an "ExtractText" processor onto the canvas to extract portions of the log content into attributes, as below. Connect the SplitText processor to the ExtractText processor for the splits relationship.

- BULLETIN_LEVEL : ([A-Z]{4,5})
- CONTENT : (^.*)
- EVENT_DATE : ([^,]*)
- EVENT_TYPE : (?<=\[)(.*?)(?=\])

5) Drop an Output Port onto the canvas and name it "OUT". Once added, connect "ExtractText" to the port for the matched relationship. The flow would look similar to the one below.

6) Start the flow on NiFi and notice that data is queued in the connection before the output port "OUT".
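Before wiring the patterns into ExtractText, you can sanity-check them from the command line. The log line below is a made-up sample in the nifi-app.log layout; the EVENT_TYPE lookarounds are approximated with sed, since plain grep -E has no lookbehind:

```shell
#!/bin/sh
# Sketch: test the ExtractText regexes against a sample (made-up) log line.
LINE='2017-01-25 04:28:00,123 INFO [Timer-Driven Process Thread-4] o.a.n.c.StandardFlowService message'

printf '%s\n' "$LINE" | grep -oE '^[^,]*'           # EVENT_DATE     -> 2017-01-25 04:28:00
printf '%s\n' "$LINE" | grep -oE '[A-Z]{4,5}'       # BULLETIN_LEVEL -> INFO
printf '%s\n' "$LINE" | sed 's/.*\[\(.*\)\].*/\1/'  # EVENT_TYPE     -> Timer-Driven Process Thread-4
```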
Building the Storm Application Jar with Maven

1) To begin with, let's clone the artifacts; feel free to inspect the dependencies and NiFiStormStreaming.java:

# cd /opt/
# git clone https://github.com/jobinthompu/NiFi-Storm-Integration.git

2) Feel free to inspect pom.xml to verify the dependencies:

# cd /opt/NiFi-Storm-Integration
# vi pom.xml

3) Let's build the Storm jar with the artifacts (this might take several minutes):

# mvn package

4) Once the build is SUCCESSFUL, make sure NiFiStormTopology-Uber.jar is generated in the target folder:

# ls -l /opt/NiFi-Storm-Integration/target/NiFiStormTopology-Uber.jar

5) Now let us go ahead and submit the topology to Storm (make sure the NiFi flow created above is running before submitting the topology):

# cd /opt/NiFi-Storm-Integration
# storm jar target/NiFiStormTopology-Uber.jar NiFi.NiFiStormStreaming &

6) Let's verify that the topology is submitted, on the Storm View in Ambari as well as the Storm UI:

Ambari UI: http://your-vm-ip:8080
Storm UI: http://your-vm-ip:8744/index.html

7) Let's go back to the NiFi web UI; if everything worked fine, the data that was pending on the port OUT will be gone, as it was consumed by Storm.

8) Now let's connect to Phoenix and check out the data populated in the tables. You can use either the Phoenix sqlline command line or Zeppelin:

a) Via Phoenix sqlline:

# /usr/hdp/current/phoenix-client/bin/sqlline.py localhost:2181:/hbase-unsecure

SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='ERROR' ORDER BY EVENT_DATE;

b) Via Zeppelin, for better visualization:

Zeppelin UI: http://your-vm-ip:9995/

9) Now you can change the code as needed, rebuild the jar, and re-submit the topologies.

Extending the NiFi Flow to Ingest Data Directly to Phoenix Using the PutSQL Processor

1) Let's go ahead and kill the Storm topology from the command line (or from the Ambari Storm View or the Storm UI):

# storm kill NiFi-Storm-Phoenix

2) Log back in to the NiFi UI currently running the flow, and stop the entire flow.

3) Drop a RouteOnAttribute processor onto the canvas for the matched relation from the ExtractText processor, configure it with the properties below, and auto-terminate the unmatched relation:

DEBUG : ${BULLETIN_LEVEL:equals('DEBUG')}
ERROR : ${BULLETIN_LEVEL:equals('ERROR')}
INFO : ${BULLETIN_LEVEL:equals('INFO')}
WARN : ${BULLETIN_LEVEL:equals('WARN')}

4) Drop an AttributesToJSON processor onto the canvas with the configuration below and connect RouteOnAttribute's DEBUG, ERROR, INFO, and WARN relations to it:

Attributes List : uuid,EVENT_DATE,BULLETIN_LEVEL,EVENT_TYPE,CONTENT
Destination : flowfile-content

5) Create and enable a DBCPConnectionPool named "Phoenix-Storm" with the configuration below:

Database Connection URL : jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure
Database Driver Class Name : org.apache.phoenix.jdbc.PhoenixDriver
Database Driver Location(s) : /usr/hdp/current/phoenix-client/phoenix-client.jar

6) Drop a ConvertJSONToSQL processor onto the canvas with the configuration below; connect AttributesToJSON's success relation to it, and auto-terminate the Failure relation for now, after connecting it to the Phoenix-Storm DB controller service.

7) Drop a ReplaceText processor onto the canvas to update INSERT statements to UPSERT for Phoenix, with the configuration below; connect the sql relation of ConvertJSONToSQL to it, and auto-terminate the original and Failure relations.

8) Finally, add a PutSQL processor with the configurations below, connect it to ReplaceText's success relation, and auto-terminate all of its relations.

9) The final flow, including both ingestion via Storm and direct ingestion to Phoenix using PutSQL, is now complete.

10) Now go ahead and start the flow to ingest data to both tables: via Storm and directly from NiFi.

11) Log back in to Zeppelin to see whether data is populated in the NIFI_DIRECT table:

%jdbc(phoenix)
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='INFO' ORDER BY EVENT_DATE

- Too lazy to create the flow? Download my flow template here.

This completes the tutorial. You have successfully:

- Installed and configured HDF 2.0 on your HDP-2.5 Sandbox.
- Created a data flow to pull logs, parse them, and make them available on a site-to-site enabled NiFi port.
- Created a Storm topology to consume data from NiFi via Site-to-Site and ingest it into HBase via Phoenix.
- Directly ingested data into Phoenix with the PutSQL processor in NiFi, without using Storm.
- Viewed the ingested data from the Phoenix command line and Zeppelin.

References: bbende's nifi-storm GitHub repo

Thanks, Jobin George
12-14-2016
12:06 AM
Great! Maybe you can include the back-pressure coloring feature and a screenshot as well.
08-12-2016
06:22 PM
Hi @Deep Doradla, can you verify that you have the SSL certificate details provided in step 7, and that all the tags in step 7 above are in place? Thanks, Jobin
08-03-2016
03:38 AM
@Timothy Spann, sorry for the late reply; I didn't get the update as I was not tagged. Attaching it here: yarn-application-monitor.xml. Thanks, Jobin George
07-02-2016
01:19 PM
9 Kudos
Introduction
- Here is a small demo of how NiFi can help you monitor and alert on YARN application failures.
- Here you can view the screen recording that demonstrates how it works!

Prerequisite

- Make sure you have your HDP cluster/Sandbox up and running.
- NiFi_0.6.1/HDF_1.2 is available, up and running.

Steps:

1. Assuming you have the NiFi UI available, let's drop a GetHTTP processor to pull data from the YARN REST API. Configure the processor with the URL given below, which pulls all applications in the KILLED and FAILED states (node1 is my Resource Manager):

http://node1:8088/ws/v1/cluster/apps?states=KILLED,FAILED

Let's schedule the processor to run only every 10 seconds so that you don't query too often.

2. As the REST call outputs the application details in JSON format, let's use a SplitJson processor to separate the individual application details. Provide the "JsonPath Expression" value as "$.apps.app" in the configuration.
3. Connect GetHTTP to SplitJson for the success relation and auto-terminate the rest.

4. Let's add an EvaluateJsonPath processor to extract the required fields and add them to flow-file attributes. Configure it as shown below.

5. Connect SplitJson to EvaluateJsonPath for the success relation.

6. Create and start two controller services, DistributedMapCacheClientService and DistributedMapCacheServer, so that we keep track of all the applications and don't send out duplicate alerts for the same application.

7. Add a PutDistributedMapCache processor to update the cache with the latest apps that failed or were killed. Configure it as below, adding the distributed cache service.
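The role the two cache services play is essentially duplicate suppression: remember every application ID we have already alerted on, and only alert for new ones. A rough shell analogy (the file path and application IDs are made up):

```shell
#!/bin/sh
# Sketch: dedup logic analogous to PutDistributedMapCache + the cache services.
SEEN=/tmp/seen_apps.txt
: > "$SEEN"   # start with an empty cache

alert_if_new() {
  if grep -qx "$1" "$SEEN"; then
    echo "skip $1"     # already alerted, suppress
  else
    echo "ALERT $1"    # new failed/killed app -> would trigger PutEmail
    echo "$1" >> "$SEEN"
  fi
}

alert_if_new application_1467_0001
alert_if_new application_1467_0001   # duplicate, suppressed
alert_if_new application_1467_0002
```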
8. Let's auto-terminate the Failure relationship and connect the Success relationship to a PutEmail processor, which will send out an email for any newly failed/killed application.

9. Make sure you have formatted the email body and subject to include all the information about the failed job.

10. Auto-terminate the Success and Failure relationships of the PutEmail processor. Once you start the flow, you will get alerts for each killed/failed YARN application, like the one shown below.

Note: You can also configure your GetHTTP processor to query YARN to find long-running applications.

Thanks, Jobin George
07-01-2016
07:09 AM
Hi @Saisubramaniam Gopalakrishnan, I haven't tried it, but you can add an input port to the root NiFi canvas and try communicating from Spark (using the NiFi site-to-site client), then push the data to other processors as required. Thanks, Jobin
05-18-2016
10:51 PM
10 Kudos
Prerequisite

1. You have HDF-1.2 installed on your server.

2. Make sure KDC is installed on your server and is started. I will try to describe the steps briefly in this tutorial; below is the link to the detailed steps from the HDP documentation: HDP-Documentation-for-Kerberos

Installing and Configuring KDC:

1. Let's install a new version of the KDC server:

# yum -y install krb5-server krb5-libs krb5-auth-dialog krb5-workstation

2. Using a text editor, open the KDC server configuration file, located by default here:

# vi /etc/krb5.conf

[realms]
EXAMPLE.COM = {
kdc = node1
admin_server = node1
}

Add your host name; mine is node1.

3. Use the utility kdb5_util to create the Kerberos database; when asked, let's set the password 'hadoop':

# kdb5_util create -s

4. Let's start the KDC server and the KDC admin server, and set them to auto-start on boot:

# /etc/rc.d/init.d/krb5kdc start
# /etc/rc.d/init.d/kadmin start
# chkconfig krb5kdc on
# chkconfig kadmin on

5. Let's add a service principal for the server and export the keytab from the KDC:

# kadmin.local
# addprinc -randkey nifi/HDF
# ktadd -k /opt/nifi-HDF.keytab nifi/HDF
# q

6. Make sure "/opt/nifi-HDF.keytab" is generated and is available.

7. Let's create some login identities and set the password as 'hadoop', which we will use to log in on the UI:

# kadmin.local -q "addprinc jobin/node1"
# kinit jobin/node1@EXAMPLE.COM
# kadmin.local -q "addprinc george/node1"
# kinit george/node1@EXAMPLE.COM
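If you want to script the krb5.conf edit from step 2, a minimal sketch looks like this (writing to a /tmp scratch file rather than the real /etc/krb5.conf, so it is safe to run anywhere):

```shell
#!/bin/sh
# Sketch: generate the [realms] section from step 2 into a scratch krb5.conf.
# /tmp/krb5.conf stands in for /etc/krb5.conf; KDC host node1 as in the tutorial.
cat > /tmp/krb5.conf <<'EOF'
[realms]
EXAMPLE.COM = {
  kdc = node1
  admin_server = node1
}
EOF

grep -A3 'EXAMPLE.COM' /tmp/krb5.conf
```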
Configuring NiFi:

1. NiFi will only respond to Kerberos SPNEGO negotiation over an HTTPS connection, as unsecured requests are never authenticated. For that you will need to enable 2-way SSL. I have already created Certification Authorities and client certificates at www.tinycert.org; if you are too lazy to create them, try mine 🙂 [attached as certificates.zip].

- Use cert-browser.pfx to load into the browser to be the NiFi administrator 'DEMO'.
- Upload the other two certificates to your server under '/root/scripts/' and execute the commands below. While executing the last command, enter 'hadoop' as the password and 'yes' when asked if the certificate can be trusted.

# cd /root/scripts/
# mv cert.pfx cert.p12
# openssl x509 -outform der -in cacert.pem -out cacert.der
# keytool -import -keystore cacert.jks -file cacert.der

2. My keystore is saved as '/root/scripts/cert.p12', the truststore is saved as '/root/scripts/cacert.jks', and the password is set as hadoop.

3. Below are the configuration updates you have to make in the nifi.properties file on node1:

# vi /opt/nifi-1.1.0.0-10/conf/nifi.properties

4. Once it is open in the editor, update the properties below to the given values [updating the HTTPS port and certificate details]:

nifi.web.http.host=
nifi.web.http.port=
nifi.web.https.host=node1
nifi.web.https.port=9090
nifi.security.keystore=/root/scripts/cert.p12
nifi.security.keystoreType=PKCS12
nifi.security.keystorePasswd=hadoop
nifi.security.keyPasswd=hadoop
nifi.security.truststore=/root/scripts/cacert.jks
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=hadoop

Now let's put the Kerberos details into nifi.properties, in the "kerberos" section:

# kerberos #
nifi.kerberos.krb5.file=/etc/krb5.conf
nifi.kerberos.service.principal=nifi/HDF@EXAMPLE.COM
nifi.kerberos.keytab.location=/opt/nifi-HDF.keytab
nifi.kerberos.authentication.expiration=12 hours

Also make sure you update these two properties as below:

nifi.security.user.login.identity.provider=kerberos-provider
nifi.login.identity.provider.configuration.file=./conf/login-identity-providers.xml

5. Now configure the authorized users in the 'authorized-users.xml' file; the user configuration is based on the certificate. Configure it exactly as below, matching the certificate I attached in step 1:

# vi /opt/nifi-1.1.0.0-10/conf/authorized-users.xml

<user dn="CN=Demo, OU=Demo,O=Hortonworks, L=San Jose, ST=California, C=US">
<role name="ROLE_ADMIN"/>
</user>

6. The above configuration is to log in as the NiFi administrator; all other users can be pulled from Kerberos after this administrator assigns them roles on request.

7. Configure ./conf/login-identity-providers.xml as below, with reference to the Kerberos configuration [make sure you have removed the XML comment tags]:

<provider>
<identifier>kerberos-provider</identifier>
<class>org.apache.nifi.kerberos.KerberosProvider</class>
<property name="Default Realm">EXAMPLE.COM</property>
<property name="Kerberos Config File">/etc/krb5.conf</property>
<property name="Authentication Expiration">12 hours</property>
</provider>

8. Once configured, restart the NiFi server:

# /opt/HDF-1.2.0.0/nifi/bin/nifi.sh restart

9. Now open, say, the 'Chrome' browser, load the client certificate [cert-browser.pfx] associated with the ADMIN user, and log in to the secure HTTPS URL of NiFi running on node1:

https://node1:9090/nifi

10. When asked, confirm the security exception and proceed. You are now securely logged in as the Demo user with admin privileges. You can now grant access to any user requesting it.

11. Open another browser, say 'Safari', to establish another session:

https://node1:9090/nifi

It will pop up a login screen; enter any of the credentials for the identities we just created in step 7 of "Installing and Configuring KDC":

Username: jobin/node1 Password: hadoop
Username: george/node1 Password: hadoop

12. Enter the password as hadoop and hit Login, then enter a justification. The screen will show that the request is pending with the admin, who already has access using certificates.

13. Now go back to the Chrome browser, where the 'Demo' user is the NiFi administrator, and assign a role to jobin.

14. Now you can see that the user is active.

15. Go back to the old session as jobin in Safari, refresh the browser, and you will be logged in as 'jobin' with the privileges assigned by the NiFi administrator. You can test it for the other user, 'george', as well.

Now you have authenticated two users, jobin and george, to access the NiFi user interface. Hope this will be useful!

Thanks, Jobin George