Member since
10-09-2015
86
Posts
179
Kudos Received
8
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 17690 | 12-29-2016 05:19 PM
 | 943 | 12-17-2016 06:05 PM
 | 9289 | 08-24-2016 03:08 PM
 | 1056 | 07-14-2016 02:35 AM
 | 2475 | 07-08-2016 04:29 PM
07-26-2017
02:35 PM
Thanks @Matt Clarke
07-19-2017
07:56 PM
Good to know! Thanks for the update, @Wynner.
07-19-2017
06:20 PM
@Wynner, is this a mandatory manual step for NiFi node addition when done via Ambari? I am sure this worked fine without copying flow.xml.gz in HDF-2.x. Or is work going on to fix it?
07-19-2017
06:03 PM
Hello @mqureshi, it was a fresh node; there was no flow.xml.gz file present at first. When I started the instance it created a 480 B flow.xml.gz, which later grew to 593 B. I tried deleting it and starting/restarting NiFi on the fresh node, and again a 480 B file was created instead of the 11880 B cluster flow.xml.gz (copying that 11880 B cluster flow should have been done by the coordinator node, but it looks like it is expected to be done by Ambari, I guess). node4 is the new node and node1 is an existing node.
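For anyone hitting the same thing: a minimal sketch of the manual workaround discussed here is to copy the cluster's flow.xml.gz onto the fresh node before starting NiFi on it. The path below assumes a default Ambari-managed HDF layout (check nifi.flow.configuration.file in nifi.properties for the exact location on your install), node1/node4 are just the hostnames from my setup, and the command is run on node4 while NiFi is stopped there:
# scp root@node1:/var/lib/nifi/conf/flow.xml.gz /var/lib/nifi/conf/flow.xml.gz
Then start NiFi on node4 (e.g. from Ambari); it should now join the cluster with a matching flow fingerprint.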
07-19-2017
05:04 PM
Hello, while trying to add nodes to my existing NiFi cluster with Ambari, I am getting the below error on the 3.0 version of HDF. It used to work fine in the HDF-2.x version. The issue is related to flow differences, but the error message indicates it is an issue with the AmbariReportingTaskAmbariReport fingerprint. Ambari used to take care of this in HDF-2.x. Am I missing something here? Are there any additional steps I should be doing in 3.0? I cross-checked it against the below article for HDF-2.0 by @Matt Clarke; except that my cluster is unsecured, there are no other differences: https://community.hortonworks.com/articles/80284/hdf-2x-adding-a-new-nifi-node-to-an-existing-secur.html Any pointers will be appreciated!
2017-07-19 12:39:03,469 ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:936)
at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:515)
at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:800)
at org.apache.nifi.NiFi.<init>(NiFi.java:160)
at org.apache.nifi.NiFi.main(NiFi.java:267)
Caused by: org.apache.nifi.controller.UninheritableFlowException: Proposed configuration is not inheritable by the flow controller because of flow differences: Found difference in Flows:
Local Fingerprint: 7c84501d-d10c-407c-b9f3-1d80e38fe36a3b80ba0f-a6c0-48db-b721-4dbc04cef28eorg.apache.nifi.reporting.ambari.AmbariReportingTaskAmbariReport
Cluster Fingerprint: 7c84501d-d10c-407c-b9f3-1d80e38fe36a428574ae-015d-1000-0000-0000454185d7org.apache.nifi.processors.standard.TailFileNO_VALUEorg.apache.n
at org.apache.nifi.controller.StandardFlowSynchronizer.sync(StandardFlowSynchronizer.java:259)
at org.apache.nifi.controller.FlowController.synchronize(FlowController.java:1576)
at org.apache.nifi.persistence.StandardXMLFlowConfigurationDAO.load(StandardXMLFlowConfigurationDAO.java:84)
at org.apache.nifi.controller.StandardFlowService.loadFromBytes(StandardFlowService.java:722)
at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:911)
... 4 common frames omitted
Thanks
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
06-09-2017
07:22 PM
Hello @Arsalan Siddiqi, here you can find Spark integration with HDF-2.x (still only NiFi 1.1); you can figure out the dependencies from there [first step under the "Configuring and Restarting Spark" section]: https://community.hortonworks.com/content/kbentry/84631/hdf-21-nifi-site-to-site-direct-streaming-to-spark.html Thanks
05-15-2017
01:37 AM
9 Kudos
Introduction
- Here is a small demo of how to use MiNiFi + NiFi on a Raspberry Pi to detect motion and send alerts to your phone via SMS; replying to the SMS triggers a sound alarm [basically to shoo away an intruder]. You can do this from wherever you have cell-phone reception.
- Here you can view the screen recording session that demonstrates how it works!
Prerequisite
A Raspberry Pi 3 with a PIR motion sensor and a speaker connected to it. Details on how to connect the PIR motion sensor to the Raspberry Pi can be found in the URL under References.
Assuming you already have the latest versions of HDF/NiFi and MiNiFi downloaded on your Mac and Pi. If not:
Get Latest version of MiNiFi :
# wget http://apache.claz.org/nifi/minifi/0.1.0/minifi-0.1.0-bin.tar.gz
Get Latest version of MiNiFi ToolKit:
# wget http://apache.claz.org/nifi/minifi/0.1.0/minifi-toolkit-0.1.0-bin.tar.gz
Get Latest version of NiFi:
# wget http://apache.claz.org/nifi/1.2.0/nifi-1.2.0-bin.tar.gz
Untar the files and start NiFi on your local machine and MiNiFi on your Raspberry Pi, as shown in the sketch below.
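For reference, a minimal sketch of that step, assuming the archives above were extracted in place (run the NiFi commands on your local machine and the MiNiFi ones on the Pi):
# tar -xvf nifi-1.2.0-bin.tar.gz
# ./nifi-1.2.0/bin/nifi.sh start
# tar -xvf minifi-toolkit-0.1.0-bin.tar.gz
# tar -xvf minifi-0.1.0-bin.tar.gz
# ./minifi-0.1.0/bin/minifi.sh start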
Steps:
Flow on MiNiFi
Download the flow
Pi-MiNiFi-FLow.xml and convert it to the YAML format that MiNiFi uses (before you deploy, make sure the RPG points to your local NiFi URL rather than the one I have in there):
# /root/minifi-toolkit-0.1.0/bin/config.sh transform Pi-MiNiFi-FLow.xml minifi-0.1.0/conf/config.yml
Flow Looks like below:
Flow Explained:
a) Poll for sensor output and send it to NiFi
A GenerateFlowFile processor triggers every 5 seconds; an ExecuteStreamCommand processor then executes the python script pirtest.py shown below, and the result is sent to the NiFi instance running on my local machine via a Remote Process Group.
pirtest.py script looks like below:
import RPi.GPIO as GPIO
import time
import os

GPIO.setwarnings(False)
GPIO.setmode(GPIO.BOARD)
GPIO.setup(11, GPIO.IN)        # Read output from PIR motion sensor on board pin 11
i = GPIO.input(11)
if i == 1:                     # When output from motion sensor is HIGH
    print "Intruder detected", i
b) Play Panic Alarm on SMS trigger from NiFi
A ListenHTTP processor listens for any incoming flowfile; when one arrives, the next processor, ExecuteStreamCommand, executes the python script alarm.py given below, which plays a panic alarm sound. I used mpg123 to play the alarm sound; you can install it on your Raspberry Pi using the below command:
# sudo apt-get install mpg123
alarm.py script looks like below:
import os
os.system('mpg123 /root/alarm.mp3')
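To test the alarm path without waiting for an SMS reply, you can post anything to the ListenHTTP endpoint on the Pi yourself. The port here (9999) is only an assumption - use whatever Listening Port you configured on the processor - and contentListener is the processor's default base path:
# curl -X POST -d 'test-alarm' http://<raspberry-pi-host>:9999/contentListener
If the flow is running, the ExecuteStreamCommand processor should fire alarm.py and you should hear the alarm.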
Flow on NiFi
Download the flow MiNiFi-MotionSensor-SMS-Alert+Response.xml and deploy it after updating your custom hostnames and other details.
It looks like below:
Flow Explained:
a) Receiving the sensor alert from MiNiFi and sending an SMS
An InputPort receives the motion-sensor output from MiNiFi; a RouteOnAttribute processor verifies the output and sends it to a ControlRate processor only if motion is detected. The ControlRate processor ensures your phone is not flooded with alerts. A PutEmail processor is configured to send an SMS to my phone.
b) Checking for an ALARM request and sending a signal to MiNiFi
A ConsumeIMAP processor checks for new ALARM requests in a specified folder in my mailbox and, when one is received, triggers a flowfile. A RouteOnContent processor verifies the new mail and routes it based on sender and content, feeding it to a PostHTTP processor that connects to ListenHTTP on MiNiFi, triggering the alarm.
Now it's time to try it out!
References:
Raspberry Pi and PIR Motion Sensor article I followed
MiNiFi
NiFi
My GitHub Article
Thanks,
Jobin George
- Find more articles tagged with:
- Data Ingestion & Streaming
- How-ToTutorial
- IoT
- minifi
- monitoring
- NiFi
05-09-2017
06:39 AM
@Bharadwaj Bhimavarapu Not as of today; you need multiple MonitorDiskUsage tasks to do so. But I already brought this up with the product team, and hopefully we will have that feature soon.
05-05-2017
07:33 PM
2 Kudos
Introduction
Recently I was asked how to monitor and alert when the MonitorDiskUsage or MonitorMemory Reporting Task generates a WARNING because a predefined threshold is exceeded. I am trying to capture the steps to implement the same.
Prerequisites
1. To test this, make sure an HDF-2.x version of NiFi is up and running.
2. You already have a MonitorDiskUsage/MonitorMemory Reporting Task created and configured; set a lower threshold to force-generate the alert while testing.
3. Make a note of the Reporting Task uuid to be monitored.
Creating a Flow to Monitor the Reporting Task
1. Drop a GenerateFlowFile processor set to execute every 5 minutes with a 1-byte file to trigger the flow.
2. Drop an UpdateAttribute processor to the canvas with the below configuration:
NIFI_HOST : <Your NiFi FQDN>
NIFI_PORT : <Your NiFi http port>
REPORTING-TASK-UUID : <Your MonitorDiskUsage or MonitorMemory reporting task uuid>
3. Connect the Success relationship of GenerateFlowFile to UpdateAttribute.
4. Drop an InvokeHTTP processor to the canvas and configure it as below:
HTTP Method : GET
Remote URL : http://${NIFI_HOST}:${NIFI_PORT}/nifi-api/reporting-tasks/${REPORTING-TASK-UUID}
5. Connect the Success relation of UpdateAttribute to InvokeHTTP and auto-terminate all relationships of the InvokeHTTP processor except the Response relationship.
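Before building the rest of the flow, it can help to eyeball the raw JSON this call returns; a quick way from the command line (host, port and uuid are placeholders, and python -m json.tool is only used for pretty-printing):
# curl -s http://<NIFI_HOST>:<NIFI_PORT>/nifi-api/reporting-tasks/<REPORTING-TASK-UUID> | python -m json.tool
Once the reporting task has generated a warning, the bulletin fields extracted in the steps below (level, message, sourceName, timestamp) show up inside the bulletins array of this response.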
6. Drop an EvaluateJsonPath processor to the canvas with the below configuration:
7. Auto-terminate all relationships of the EvaluateJsonPath processor except the Matched relationship and connect the InvokeHTTP processor's Response relation to it.
8. Drop a SplitJson processor to the canvas with the below configuration. [The reason for splitting the JSON is that the REST call to the reporting task returns a duplicated JSON array in the result, which contains the details we need.]
9. Connect the Matched relationship of EvaluateJsonPath to the SplitJson processor.
10. Drop another EvaluateJsonPath processor to the canvas with the below configuration:
LEVEL : $.bulletin.level
MESSAGE : $.bulletin.message
SOURCE-NAME : $.bulletin.sourceName
TIMESTAMP : $.bulletin.timestamp
11. Add a connection with the split relationship from SplitJson to the second EvaluateJsonPath processor. Auto-terminate the other relationships.
12. Add a ControlRate processor with the below configuration so that only one alert is sent for multiple JSON arrays; it passes only 1 flowfile per minute.
13. Add a connection from the EvaluateJsonPath processor to the ControlRate processor with FlowFile Expiration set to 30 sec. Auto-terminate the other relations.
14. Finally, drop a PutEmail processor to the canvas with the below configuration to send your alerts; update it with your SMTP details:
SMTP Hostname : west.xxxx.yourServer.net
SMTP Port : 587
SMTP Username : jgeorge@hortonworks.com
SMTP Password : Its_myPassw0rd_updateY0urs
SMTP TLS : true
From : jgeorge@hortonworks.com
To : jgeorge@hortonworks.com
Subject : ${SOURCE-NAME} ALERT
The message content should look something like below to grab all the values:
Message : ${MESSAGE}
LEVEL : ${LEVEL}
TIMESTAMP : ${TIMESTAMP}
SOURCE-NAME : ${SOURCE-NAME}
15. Auto-terminate all relationships of the PutEmail processor and connect the Success relationship of the ControlRate processor to it.
16. Now you have your flow ready and can start it to monitor and send an email alert for the UI notification thrown by NiFi when disk or memory utilization exceeds the given threshold. The flow would look like below:
Starting the flow and Reporting Task and testing it
1. Now let's start the MonitorDiskUsage Reporting Task with the below configuration, using a lower threshold to force-generate an alert for testing. I am monitoring the disk on my Mac, and its disk is more than 50% utilized, so I will get this WARNING in the NiFi UI.
2. As soon as this warning shows up and the flow created above is running, you will get an alert in your inbox stating the same, similar to below:
3. This concludes the tutorial for monitoring your Reporting Tasks with NiFi itself.
4. Too lazy to create the flow? Download my template here.
References
NiFi REST API
NiFi Expression Language
Thanks, Jobin George
- Find more articles tagged with:
- Data Ingestion & Streaming
- error
- FAQ
- hdf-2.0.0
- memory
- monitoring
- NiFi
05-03-2017
01:09 PM
Thanks Josh. Removing and re-installing Zeppelin did the trick. I missed accepting the answer.
05-02-2017
02:41 PM
@Laurence Da Luz That might be a little outdated. I have an updated article for HDF-2.x on configuring LDAP via Ambari: https://community.hortonworks.com/articles/79627/hdf-20-ldap-user-authentication-with-ambari.html
03-18-2017
07:08 PM
Hi @Sunile Manjee, I have a Storm example in the below article: https://community.hortonworks.com/articles/79597/nifi-site-to-site-direct-streaming-to-storm.html But I used a Storm spout. Please find the sample code here: https://github.com/jobinthompu/NiFi-Storm-Integration/blob/master/src/main/java/NiFi/NiFiStormStreaming.java See if this helps. Thanks, Jobin
03-14-2017
06:33 PM
1 Kudo
Hi @Sunile Manjee, since you are using NiFi to launch jobs, why not use NiFi itself to monitor them? 😛 I tried to flex NiFi to monitor YARN jobs by querying the ResourceManager, documented it, and attached my flow XML in the comments; check it out: https://community.hortonworks.com/content/kbentry/42995/yarn-application-monitoring-with-nifi.html In the demo I used it to monitor Failed and Killed jobs only; you can change the query to ask for, say, all jobs user smanjee submitted and have it alert you as soon as each completes/fails/is killed. Thanks, Jobin
02-24-2017
09:46 PM
@Matt Clarke G1GC was set as the garbage collector, and the issue was fixed in the next version of HDF. This one went unnoticed; accepting the answer. Thanks, Jobin George
02-21-2017
06:13 PM
1 Kudo
Hi @Josh Elser, I was wondering why there would be an invalid version of the HBase jar in the classpath of a Sandbox; I haven't changed any configs. Also, what is the best approach to fix this? Thanks, Jobin
02-21-2017
06:33 AM
3 Kudos
Introduction
Using NiFi, data can be exposed in such a way that a receiver can pull from it by adding an Output Port to the root process group.
For Spark, we will use this same mechanism - we will use the Site-to-Site protocol to pull data from NiFi's Output Ports. In this tutorial we learn to capture the NiFi app log from the Sandbox, parse it using Java regex, and ingest it to Phoenix via Spark, or directly using the NiFi PutSQL processor.
Prerequisites
1) Assuming you already have the latest version of NiFi-1.x/HDF-2.x downloaded as a zip file (HDF and HDP cannot be managed by Ambari on the same nodes as of now) onto your HW Sandbox version 2.5; else execute the below after ssh connectivity to the sandbox is established:
# mkdir /opt/HDF-2.1.1
# cd /opt/HDF-2.1.1
# wget http://public-repo-1.hortonworks.com/HDF/2.1.1.0/nifi-1.1.0.2.1.1.0-2-bin.tar.gz
# tar -xvf nifi-1.1.0.2.1.1.0-2-bin.tar.gz
2) Spark, Zeppelin, YARN and HDFS are installed on your VM and started.
3) HBase is installed with the Phoenix Query Server.
4) Download the compatible version [in our case 1.1.0] of "nifi-spark-receiver" and "nifi-site-to-site-client" to the Sandbox in a specific location:
# mkdir /opt/spark-receiver
# cd /opt/spark-receiver
# wget http://central.maven.org/maven2/org/apache/nifi/nifi-spark-receiver/1.1.0/nifi-spark-receiver-1.1.0.jar
# wget http://central.maven.org/maven2/org/apache/nifi/nifi-site-to-site-client/1.1.0/nifi-site-to-site-client-1.1.0.jar
5) Make sure Git is installed on the VM:
# yum install git -y
Configuring and Creating Tables in HBase via Phoenix
1) Make sure the HBase components as well as the Phoenix Query Server are started.
2) Make sure HBase is up and running and out of maintenance mode, and that the below properties are set (if not, set them and restart the services):
- Enable Phoenix --> Enabled
- Enable Authorization --> Off
3) Connect to the Phoenix shell (or use Zeppelin):
# /usr/hdp/current/phoenix-client/bin/sqlline.py sandbox.hortonworks.com:2181:/hbase-unsecure
4) Execute the below in the Phoenix shell to create tables in HBase:
CREATE TABLE NIFI_LOG( UUID VARCHAR NOT NULL, EVENT_DATE VARCHAR, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));
CREATE TABLE NIFI_DIRECT( UUID VARCHAR NOT NULL, EVENT_DATE VARCHAR, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID)); Configuring and Restarting Spark 1) Login to Ambari UI and Navigate to Services --> Spark --> Configs --> Custom spark-defaults and add 2 below properties with given values: spark.driver.extraClassPath = /opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/lib/nifi-framework-api-1.1.0.2.1.1.0-2.jar:/opt/spark-receiver/nifi-site-to-site-client-1.1.0.jar:/opt/spark-receiver/nifi-spark-receiver-1.1.0.jar:/opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/lib/nifi-api-1.1.0.2.1.1.0-2.jar:/opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/lib/bootstrap/nifi-utils-1.1.0.2.1.1.0-2.jar:/opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/work/nar/framework/nifi-framework-nar-1.1.0.2.1.1.0-2.nar-unpacked/META-INF/bundled-dependencies/nifi-client-dto-1.1.0.2.1.1.0-2.jar:/opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/work/nar/framework/nifi-framework-nar-1.1.0.2.1.1.0-2.nar-unpacked/META-INF/bundled-dependencies/httpcore-nio-4.4.5.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar
spark.driver.allowMultipleContexts = true
2) Once the properties are added, restart Spark.
Configuring and Starting NiFi
1) Open nifi.properties to update the configuration:
# vi /opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/conf/nifi.properties
2) Change the NiFi http port to 9090, as the default 8080 will conflict with the Ambari web UI:
# web properties #
nifi.web.http.port=9090
3) Configure the NiFi instance to run site-to-site by changing the below configuration: add a port, say 8055, and set "nifi.remote.input.secure" to "false":
# Site to Site properties #
nifi.remote.input.socket.port=8055
nifi.remote.input.secure=false
4) Now start [restart if already running, for the configuration change to take effect] NiFi on your Sandbox:
# /opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/bin/nifi.sh start
5) Make sure NiFi is up and running by connecting to its Web UI from your browser: http://your-vm-ip:9090/nifi/
Building a Flow in NiFi to fetch and parse nifi-app.log
1) Let us build a small flow on the NiFi canvas to read the app log generated by NiFi itself and feed it to Spark:
2) Drop a "TailFile" Processor to canvas to read lines added to "/opt/HDF-2.1.1/nifi-1.1.0.2.1.1.0-2/logs/nifi-app.log". Auto Terminate relationship Failure. 3) Drop a "SplitText" Processor to canvas to split the log file into separate lines. Auto terminate Original and Failure Relationship for now. Connect TailFile processor to SplitText Processor for Success Relationship. 4) Drop a "ExtractText" Processor to canvas to extract portions of the log content to attributes as below. Connect SplitText processor to ExtractText Processor for splits relationship. - BULLETIN_LEVEL:([A-Z]{4,5})
- CONTENT:(^.*)
- EVENT_DATE:([^,]*)
- EVENT_TYPE:(?<=\[)(.*?)(?=\]) 5) Drop an OutputPort to the canvas and Name it "spark", Once added, connect "ExtractText" to the port for matched relationship. The Flow would look similar as below: 6) Start the flow on NiFi and notice data is stuck in the connection before the output port "spark" Building Spark application 1) To begin with, lets clone the git repo below: # cd /opt/
# git clone https://github.com/jobinthompu/NiFi-Spark-Feeding-Data-to-Spark-Streaming.git
2) Feel free to inspect the Spark application code:
# vi /opt/NiFi-Spark-Feeding-Data-to-Spark-Streaming/src/main/Spark+NiFi+Phoenix.sh
3) Now let us go ahead and submit the Spark application to YARN, or run it locally via spark-shell:
# spark-shell --master yarn --deploy-mode client -i /opt/NiFi-Spark-Feeding-Data-to-Spark-Streaming/src/main/Spark+NiFi+Phoenix.sh
OR
# spark-shell -i /opt/NiFi-Spark-Feeding-Data-to-Spark-Streaming/src/main/Spark+NiFi+Phoenix.sh
4) Make sure the application is submitted and that it prints out statistics.
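If you prefer the command line over the YARN UI mentioned in the next step, the same check can be done with the yarn client on the sandbox (assuming it is on the path):
# yarn application -list -appStates RUNNING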
5) Let's go ahead and verify that the application is submitted and started in YARN (you can drill down and see the Application Master's Spark UI as well):
YARN UI: http://your-vm-ip:8088
Or, if you submitted the application locally, you can verify it by accessing the spark-shell application UI: http://sandbox.hortonworks.com:4040/executors/
6) Let's go back to the NiFi Web UI; if everything worked fine, the data which was pending on the port 'spark' will be gone, as it was consumed by Spark.
7) Now let's connect to Phoenix and check out the data populated in the tables; you can use either the Phoenix sqlline command line or Zeppelin:
- via Phoenix sqlline
# /usr/hdp/current/phoenix-client/bin/sqlline.py localhost:2181:/hbase-unsecure
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_LOG WHERE BULLETIN_LEVEL='INFO' ORDER BY EVENT_DATE LIMIT 20;
- via Zeppelin for better visualization
Zeppelin UI: http://your-vm-ip:9995/
Extending the NiFi Flow to ingest data directly to Phoenix using the PutSQL processor
1) Let's go ahead and kill the Spark application by pressing ctrl+c on the command line.
2) Log back into the NiFi UI currently running the flow and stop the entire flow.
3) Drop a RouteOnAttribute processor to the canvas for the Matched relation from the ExtractText processor, configure it with the below properties, and auto-terminate the unmatched relation.
DEBUG : ${BULLETIN_LEVEL:equals('DEBUG')}
ERROR : ${BULLETIN_LEVEL:equals('ERROR')}
INFO : ${BULLETIN_LEVEL:equals('INFO')}
WARN : ${BULLETIN_LEVEL:equals('WARN')}
4) Drop an AttributesToJSON processor to the canvas with the below configuration and connect RouteOnAttribute's DEBUG, ERROR, INFO, and WARN relations to it.
Attributes List : uuid,EVENT_DATE,BULLETIN_LEVEL,EVENT_TYPE,CONTENT
Destination : flowfile-content
5) Create and enable a DBCPConnectionPool controller service named "Phoenix-Spark" with the below configuration:
Database Connection URL : jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure
Database Driver Class Name : org.apache.phoenix.jdbc.PhoenixDriver
Database Driver Location(s) : /usr/hdp/current/phoenix-client/phoenix-client.jar
6) Drop a ConvertJSONToSQL processor to the canvas with the below configuration, connect AttributesToJSON's success relation to it, and auto-terminate the Failure relation for now after connecting it to the Phoenix-Spark DBCP controller service.
7) Drop a ReplaceText processor to the canvas to update INSERT statements to UPSERT for Phoenix with the below configuration, connect the sql relation of ConvertJSONToSQL to it, and auto-terminate the original and Failure relations.
8) Finally, add a PutSQL processor with the below configuration, connect it to ReplaceText's success relation, and auto-terminate all of its relations.
9) The final flow, including both ingestion via Spark and direct ingestion to Phoenix using PutSQL, is complete; it should look similar to below:
10) Now go ahead and start the flow to ingest data to both tables, via Spark and directly from NiFi.
11) Log back into Zeppelin to see if data is populated in the NIFI_DIRECT table:
%jdbc(phoenix)
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='INFO' ORDER BY EVENT_DATE
Too lazy to create the flow? Download my flow template here.
This completes the tutorial. You have successfully:
- Installed and configured HDF 2.1 on your HDP-2.5 Sandbox.
- Created a data flow to pull logs, parse them, and make them available on a Site-to-Site enabled NiFi port.
- Created a Spark application to consume data from NiFi via Site-to-Site and ingest it to HBase via Phoenix.
- Directly ingested data to Phoenix with the PutSQL processor in NiFi without using Spark.
- Viewed the ingested data from the Phoenix command line and Zeppelin.
References:
Mark Payne's - NiFi-Spark
My GitHub Article
Thanks, Jobin George
- Find more articles tagged with:
- Data Ingestion & Streaming
- hdf
- How-ToTutorial
- NiFi
- nifi-streaming
- Phoenix
- Spark
- streaming
02-19-2017
04:48 PM
2 Kudos
Hi, while trying to execute Phoenix queries via Zeppelin on the HDP-2.5 Sandbox, I am getting the below NoSuchMethodError. I enabled Phoenix and turned auth to simple in the HBase conf. Please let me know if I am missing something here (Ranger is disabled and the cluster is not Kerberized):
04:50:05.006 [pool-3-thread-2] ERROR org.apache.zeppelin.jdbc.security.JDBCSecurityImpl - Invalid auth.type detected with value , defaulting auth.type to SIMPLE
org.apache.phoenix.exception.PhoenixIOException: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.protobuf.generated.MasterProtos$ModifyTableRequest$Builder.setNonceGroup(J)Lorg/apache/hadoop/hbase/protobuf/generated/MasterProtos$ModifyTableRequest$Builder;
at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:111) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1063) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1369) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2116) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:828) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:183) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:338) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:326) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:324) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1326) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2436) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl$13.call(ConnectionQueryServicesImpl.java:2248) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:78) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2248) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:233) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:135) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:202) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at java.sql.DriverManager.getConnection(DriverManager.java:664) ~[?:1.8.0_111]
at java.sql.DriverManager.getConnection(DriverManager.java:208) ~[?:1.8.0_111]
at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:257) ~[zeppelin-jdbc-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.jdbc.JDBCInterpreter.getStatement(JDBCInterpreter.java:275) ~[zeppelin-jdbc-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:336) [zeppelin-jdbc-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:442) [zeppelin-jdbc-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94) [zeppelin-interpreter-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341) [zeppelin-interpreter-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.scheduler.Job.run(Job.java:176) [zeppelin-interpreter-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162) [zeppelin-interpreter-0.6.0.2.5.0.0-1245.jar:0.6.0.2.5.0.0-1245]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_111]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_111]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.protobuf.generated.MasterProtos$ModifyTableRequest$Builder.setNonceGroup(J)Lorg/apache/hadoop/hbase/protobuf/generated/MasterProtos$ModifyTableRequest$Builder;
at org.apache.hadoop.hbase.client.RpcRetryingCaller.translateException(RpcRetryingCaller.java:229) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:140) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4036) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin.modifyTable(HBaseAdmin.java:2548) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin.modifyTable(HBaseAdmin.java:2561) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.modifyTable(ConnectionQueryServicesImpl.java:1086) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1058) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
... 33 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.protobuf.generated.MasterProtos$ModifyTableRequest$Builder.setNonceGroup(J)Lorg/apache/hadoop/hbase/protobuf/generated/MasterProtos$ModifyTableRequest$Builder;
at org.apache.hadoop.hbase.protobuf.RequestConverter.buildModifyTableRequest(RequestConverter.java:1298) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin$25.call(HBaseAdmin.java:2551) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin$25.call(HBaseAdmin.java:2548) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4036) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin.modifyTable(HBaseAdmin.java:2548) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.hadoop.hbase.client.HBaseAdmin.modifyTable(HBaseAdmin.java:2561) ~[hbase-client-1.1.3.jar:1.1.3]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.modifyTable(ConnectionQueryServicesImpl.java:1086) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1058) ~[phoenix-core-4.7.0-HBase-1.1.jar:4.7.0-HBase-1.1]
... 33 more
Thanks, Jobin George
02-16-2017
08:06 PM
1 Kudo
Hi @Andrew Grande, thanks for the response. I have seen this when threads are set to more than one for some processors, and these settings have worked so far exactly when a funnel is not involved; that's why it is confusing.
02-16-2017
07:46 PM
3 Kudos
Hi, I have a small flow created to test priority, but while configuring it I noticed that the Back Pressure (Object Threshold) setting is sometimes not honored per connection when I have a funnel in it.
Below is a flow I created: every connection except the final one has default Back Pressure settings (even Concurrent Tasks is the default 1), and the last connection out of the funnel has the Object Threshold set to 10. When I start all the processors together except the last LogAttribute processor in the flow, I see the Back Pressure threshold exceeded in at least 2 connections each time I test. Am I missing something, or is there a logical explanation?
Attaching my flow: back-pressure.xml
Version: [HDF-2.0.1] [NiFi-1.0.0.2.0.1.0-12]
Thanks, Jobin George
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
02-16-2017
03:54 AM
1 Kudo
Hi Harsh, I can't make out much from the screenshot, as the main error cause is not visible. At first glance, it looks more like an environment issue.
02-15-2017
07:29 AM
4 Kudos
Introduction
Recently a customer asked me how to change the destination of a connection which still contains data while the destination is stopped. Using the NiFi REST API we can change the flow; in this article I am trying to capture the steps to update the destination of a connection using the REST API. The requirement around this was to push incoming data to different flows on a timely basis.
Prerequisites
1) To test this, make sure an HDF-2.x version of NiFi is up and running.
2) A minimum of 3 processors are on the canvas, connected like below:
3) Note the destination's uuid [in my case, the 'PutNext' processor's uuid].
4) Note the uuid of the connection that has to be re-pointed to the 'PutNext' processor.
'GET'ing the details of the connection
1) Execute the below command on your terminal with the uuid of the connection:
curl -i -X GET http://localhost:8080/nifi-api/connections/dcbee9dd-0159-1000-45a7-8306c28f2786
2) Now copy the result of the GET curl command and update the below fields with the uuid of the 'PutNext' processor:
"destination":{"id":
"destinationId":
"destinationId":
3) Remove the below json element from the copied result:
"permissions":{"canRead":true,"canWrite":true},
Updating the Destination with a PUT REST call
1) Once updated, run the PUT REST API call from the command line using curl as below:
curl -i -X PUT -H 'Content-Type: application/json' -d '{ **MY UPDATED JSON**}' http://localhost:8080/nifi-api/connections/dcbee9dd-0159-1000-45a7-8306c28f2786
My sample command is below:
curl -i -X PUT -H 'Content-Type: application/json' -d '{
"revision": {
"clientId": "dd1c2f03-0159-1000-845b-d5c732a49869",
"version": 15
},
"id": "dcbee9dd-0159-1000-45a7-8306c28f2786",
"uri": "http://localhost:8080/nifi-api/connections/dcbee9dd-0159-1000-45a7-8306c28f2786",
"component": {
"id": "dcbee9dd-0159-1000-45a7-8306c28f2786",
"parentGroupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"source": {
"id": "dcbea89f-0159-1000-278c-cc38bab689bf",
"type": "PROCESSOR",
"groupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"name": "GenerateFlowFile",
"running": false,
"comments": ""
},
"destination": {
"id": "dcbebd86-0159-1000-7559-d77d1e05c910",
"type": "PROCESSOR",
"groupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"name": "PutFile",
"running": false,
"comments": ""
},
"name": "",
"labelIndex": 1,
"zIndex": 0,
"selectedRelationships": ["success"],
"availableRelationships": ["success"],
"backPressureObjectThreshold": 10000,
"backPressureDataSizeThreshold": "1 GB",
"flowFileExpiration": "0 sec",
"prioritizers": [],
"bends": []
},
"status": {
"id": "dcbee9dd-0159-1000-45a7-8306c28f2786",
"groupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"name": "success",
"statsLastRefreshed": "09:02:22 EST",
"sourceId": "dcbea89f-0159-1000-278c-cc38bab689bf",
"sourceName": "GenerateFlowFile",
"destinationId": "dcbebd86-0159-1000-7559-d77d1e05c910",
"destinationName": "PutFile",
"aggregateSnapshot": {
"id": "dcbee9dd-0159-1000-45a7-8306c28f2786",
"groupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"name": "success",
"sourceName": "GenerateFlowFile",
"destinationName": "PutFile",
"flowFilesIn": 0,
"bytesIn": 0,
"input": "0 (0 bytes)",
"flowFilesOut": 0,
"bytesOut": 0,
"output": "0 (0 bytes)",
"flowFilesQueued": 18,
"bytesQueued": 18,
"queued": "18 (18 bytes)",
"queuedSize": "18 bytes",
"queuedCount": "18"
}
},
"bends": [],
"labelIndex": 1,
"zIndex": 0,
"sourceId": "dcbea89f-0159-1000-278c-cc38bab689bf",
"sourceGroupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"sourceType": "PROCESSOR",
"destinationId": "dcbebd86-0159-1000-7559-d77d1e05c910",
"destinationGroupId": "cbe6e53b-0158-1000-e36a-f9d26bb1b510",
"destinationType": "PROCESSOR"
}' http://localhost:8080/nifi-api/connections/dcbee9dd-0159-1000-45a7-8306c28f2786
2) Once you execute the above, if the update is successful, you will get the below result:
HTTP/1.1 200 OK
Date: Fri, 27 Jan 2017 14:59:37 GMT
Cache-Control: private, no-cache, no-store, no-transform
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(9.3.9.v20160517)
The result will be similar to below:
3) Now log back into the NiFi UI and make sure the change is done.
References:
NiFi API
My GitHub
Thanks, Jobin George
- Find more articles tagged with:
- api
- connection
- Data Ingestion & Streaming
- hdf
- How-ToTutorial
- NiFi
- update
02-15-2017
06:50 AM
7 Kudos
Introduction
Recently I was asked how to monitor and alert on the flowfile count in a connection queue when it exceeds a predefined threshold. I am trying to capture the steps to implement the same.
Prerequisites
1) To test this, make sure an HDF-2.x version of NiFi is up and running.
2) You already have a connection with data queued in it (say more than 20 flowfiles). Else, you can create one like below:
3) Make a note of the connection name and uuid to be monitored.
Creating a Flow to Monitor the Connection Queue Count
1) Drop a GenerateFlowFile processor to the canvas and set its "Run Schedule" to 60 sec so we don't execute the flow too often.
2) Drop an UpdateAttribute processor, connect GenerateFlowFile's success relation to it, and add the below properties to it (the connection uuid noted above, a threshold of say 20, and your NiFi host and port):
CONNECTION_UUID : dcbee9dd-0159-1000-45a7-8306c28f2786
COUNT : 20
NIFI_HOST : localhost
NIFI_PORT : 8080
3) Drop an InvokeHTTP processor to the canvas, connect UpdateAttribute's success relation to it, auto-terminate all other relations, and update its 2 properties as below:
HTTP Method : GET
Remote URL : http://${NIFI_HOST}:${NIFI_PORT}/nifi-api/connections/${CONNECTION_UUID}
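As a quick sanity check, you can hit the same endpoint yourself and confirm the queued count the flow will extract (the uuid is the example connection from step 2; python -m json.tool is only used for pretty-printing):
# curl -s http://localhost:8080/nifi-api/connections/dcbee9dd-0159-1000-45a7-8306c28f2786 | python -m json.tool | grep -i flowFilesQueued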
4) Drop an EvaluateJsonPath processor to extract values from the JSON with the below properties, connect the Response relation of InvokeHTTP to it, and auto-terminate its failure and unmatched relations.
QUEUE_NAME : $.status.name
QUEUE_SIZE : $.status.aggregateSnapshot.flowFilesQueued
5) Drop a RouteOnAttribute processor to the canvas with the below config, connect EvaluateJsonPath's matched relation to it, and auto-terminate its unmatched relation.
Queue_Size_Exceeded : ${QUEUE_SIZE:gt(${COUNT})}
6) Lastly, add a PutEmail processor, connect RouteOnAttribute's matched relation to it, and auto-terminate all its relations. Below are my properties; you have to update them with your SMTP details:
SMTP Hostname : west.xxxx.yourServer.net
SMTP Port : 587
SMTP Username : jgeorge@hortonworks.com
SMTP Password : Its_myPassw0rd_updateY0urs
SMTP TLS : true
From : jgeorge@hortonworks.com
To : jgeorge@hortonworks.com
Subject : Queue Size Exceeded Threshold
The message content should look something like below to grab all the values:
Message : Queue Size Exceeded Threshold for ${CONNECTION_UUID}
Connection Name : ${QUEUE_NAME}
Threshold Set : ${COUNT}
Current FlowFile Count : ${QUEUE_SIZE}
7) Now the flow is complete, and it should look similar to below:
Starting the flow and testing it
1) Let's make sure at least 21 flowfiles are pending in the connection named 'DataToFileSystem' which was created in the Prerequisites.
2) Now let's start the flow, and you should receive a mail alert from NiFi stating that the count exceeded the threshold set, which is 20 in our case. My sample alert looks like below:
3) This concludes the tutorial for monitoring your connection queue count with NiFi.
4) Too lazy to create the flow? Download my template here.
References
NiFi REST API
NiFi Expression Language
Thanks, Jobin George
- Find more articles tagged with:
- alerts
- api
- Data Ingestion & Streaming
- hdf
- How-ToTutorial
- monitoring
- NiFi
02-13-2017
07:25 AM
4 Kudos
Hi @Avijeet Dash, see if this helps: one easy way of loading key-value pairs in NiFi is the NiFi custom properties registry. This is a comma-separated list of file location paths for one or more custom property files. For example, I can load a file named nifi_registry with key-value pairs separated using '=' (say it contains OS=MAC), and can then reference ${OS} to substitute its value MAC using the NiFi Expression Language (after restarting NiFi). It is configured in the nifi.properties file using nifi.variable.registry.properties. You can read about it here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#custom_properties Thanks, Jobin
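A minimal sketch of what that looks like, with an assumed file location (/opt/nifi_registry is just an example path):
# cat /opt/nifi_registry
OS=MAC
Then point nifi.properties at it and restart NiFi:
nifi.variable.registry.properties=/opt/nifi_registry
After the restart, ${OS} in the Expression Language resolves to MAC.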
02-04-2017
04:52 PM
Hi @Saurabh Verma, - It looks like the 'NiFi Certificate Authority' is still using the Ambari-provided Java rather than what you provide in the custom configs as well as in the "Template for nifi-env.sh" [but if you try starting just NiFi individually, rather than the whole service, with the update provided above, NiFi should come up - just not NiFi-CA]. - Do you think updating the Java version for Ambari is an option? (hoping it doesn't break anything else). If so, please follow the link below, choose option 3 [Custom JDK] in step 3, and enter your new Java home location: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_ambari_reference_guide/content/ch_changing_the_jdk_version_on_an_existing_cluster.html Let me know if this helps.
02-03-2017
11:18 PM
2 Kudos
Hi @Saurabh Verma, if I understand you correctly, the below should help: if you have a Java version conflict and already have the latest version installed, please try updating the java property with the full path of your Java 8 binary under Services -> NiFi -> Advanced nifi-bootstrap-env in Ambari. Update it with your full Java 8 path, then restart the service and see if you still have issues. If not using Ambari, update the below section in the ./conf/bootstrap.conf file:
# Java command to use when running NiFi
java=java
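For example (the path below is only illustrative; use your actual Java 8 location):
java=/usr/java/jdk1.8.0_111/bin/java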
01-31-2017
06:37 PM
5 Kudos
Introduction
When the NiFi bootstrap starts or stops NiFi, or detects that it has died unexpectedly, it is able to notify configured recipients. Currently, the only mechanism supplied is to send an e-mail notification.
Prerequisite
1) Assuming you already have HDF-2.x installed and Ambari and NiFi are up and running. If not, I would recommend the "Ease of Deployment" section of this article to install it [you can also follow this article for automated installation of an HDF cluster, or refer to hortonworks.com for detailed steps].
Configuring NiFi property files in Ambari
1) To set up email notifications we have to update only two configuration files: bootstrap.conf and bootstrap-notification-services.xml.
2) We have to update the appropriate properties in Ambari to configure it. First, edit the Template for bootstrap.conf and uncomment the below lines in the properties file:
nifi.start.notification.services=email-notification
nifi.stop.notification.services=email-notification
nifi.dead.notification.services=email-notification
3) Edit the Template for bootstrap-notification-services.xml and make sure your SMTP settings are updated and uncommented. A sample configuration is given below:
<service>
<id>email-notification</id>
<class>org.apache.nifi.bootstrap.notification.email.EmailNotificationService</class>
<property name="SMTP Hostname">west.xxxx.server.net</property>
<property name="SMTP Port">587</property>
<property name="SMTP Username">jgeorge@hortonworks.com</property>
<property name="SMTP Password">Th1sisn0tmypassw0rd</property>
<property name="SMTP TLS">true</property>
<property name="From">jgeorge@hortonworks</property>
<property name="To">jgeorge@hortonworks.com</property>
</service>
4) Save the config changes in Ambari after uncommenting the <service> property, confirm when asked, and restart the service.
Testing NiFi Notification Services
1) Once restarted, you will see both stopped and started alerts in your inbox with details.
Stopped Email Alert:
Started Email Alert:
2) Try stopping and killing the NiFi process [make sure you don't kill the bootstrap process, which monitors NiFi and in turn restarts the NiFi process].
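A minimal way to simulate the 'died' case (class names are from a default HDF install; double-check the pid before killing anything): the NiFi JVM runs the class org.apache.nifi.NiFi, while the bootstrap is a separate java process (org.apache.nifi.bootstrap.RunNiFi) that should be left running. Find the NiFi pid and kill it:
# ps -ef | grep org.apache.nifi.NiFi | grep -v grep
# kill -9 <nifi-pid>
The bootstrap should detect the death, send the 'died' notification, and restart NiFi.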
Died Email Alert:
References
NiFi notification_services
Thanks, Jobin George
- Find more articles tagged with:
- Ambari
- Data Ingestion & Streaming
- hdf
- How-ToTutorial
- monitoring
- NiFi
- notifications
01-25-2017
09:48 PM
4 Kudos
Introduction
ControllerStatusReportingTask logs the 5-minute stats that are shown in the NiFi Summary page for processors and connections. By default, when configured and started, it logs directly to nifi-app.log. This can be changed in the NiFi logging configuration to log to a different file; here I try to describe the steps to log it to a separate log file with Ambari.
Prerequisite
1) Assuming you already have HDF-2.x installed on your VM/server and Ambari and NiFi are up and running. If not, I would recommend the "Ease of Deployment" section of this article to install it [you can also follow this article for automated installation of an HDF cluster, or refer to hortonworks.com for detailed steps].
Configuring the "Advanced nifi-node-logback-env" section in Ambari
1. Navigate to your browser window, type in the URL for Ambari as below, and log in to the Ambari UI [the UI is accessible at port 8080]: http://<YOUR_IP>:8080/
2. Once logged in, click on the NiFi service option on the left side, click on "Configs", expand the "Advanced nifi-node-logback-env" section, edit the "logback.xml" template, and add the below lines just before the last line (</configuration>):
<appender name="5MINUTES_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
<file>${org.apache.nifi.bootstrap.config.log.dir}/5minutesStatistics.log</file>
<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
<fileNamePattern>${org.apache.nifi.bootstrap.config.log.dir}/5minutesStatistics_%d.log</fileNamePattern>
<maxHistory>5</maxHistory>
</rollingPolicy>
<encoder>
<pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
</encoder>
</appender>
<logger name="org.apache.nifi.controller.ControllerStatusReportingTask" level="INFO" additivity="false">
<appender-ref ref="5MINUTES_FILE" />
</logger>
3. The above lines create a new file named 5minutesStatistics.log to save all the 5-minute statistics the Reporting Task generates. Click save and enter details of what configuration was changed.
4. Once saved, Ambari will suggest a restart of the NiFi service; click restart (it might take up to 2 minutes for the restart to complete and the NiFi UI to come online).
Configuring ControllerStatusReportingTask in NiFi
1. Once NiFi is restarted, navigate to the NiFi User Interface on any node and click on the 'Controller Settings' tab in the top right corner. A window like below will pop up. Select the "Reporting Tasks" tab, click on the '+' in the right corner, and when a selection is requested click on 'ControllerStatusReportingTask'.
2. Once selected, click add; the Reporting Task will be in the stopped state. Now you may click start.
3. Once started, the 5-minute statistics will start logging to 5minutesStatistics.log.
Verifying the Log created on the Server
1. ControllerStatusReportingTask will log the 5-minute stats that are shown in the NiFi Summary page for processors and connections to 5minutesStatistics.log. Let's verify the same:
# tail -f /var/log/nifi/5minutesStatistics.log
2. ControllerStatusReportingTask started logging the 5-minute processor status to the log we specified, and it will be rolled as per the configuration we provided.
3. Once verified, stop the Reporting Task.
Hope this helps. Thanks, Jobin George
- Find more articles tagged with:
- Ambari
- Cloudbreak
- Data Ingestion & Streaming
- hdf
- How-ToTutorial
- monitoring
- NiFi
01-25-2017
06:49 PM
@Yumin Dong Which version of HDF/NiFi are you using? If the latest one, I hope you downloaded the latest version of the dependencies. Let me know. Jobin
01-25-2017
04:28 AM
6 Kudos
Introduction
By integrating with LDAP, username/password authentication can be enabled in NiFi. This tutorial provides step-by-step instructions to set up NiFi LDAP authentication via Ambari (using the Knox demo LDAP server).
Prerequisite
1) Assuming you already have HDF-2.x installed on your VM/server and Ambari and NiFi are up and running without security. If not, I would recommend the "Ease of Deployment" section of this article to install it [you can also follow this article for automated installation of an HDF cluster, or refer to hortonworks.com for detailed steps].
Setting up the Demo LDAP Server
1) As HDF and HDP cannot co-exist on a single node, let's download the Knox zip file from Apache for this tutorial to easily set up an LDAP server. Execute the below steps after establishing ssh connectivity to the VM/server (the name of my host is node1):
# ssh node1
# mkdir /opt/knox/
# cd /opt/knox/
# wget http://mirror.cogentco.com/pub/apache/knox/0.11.0/knox-0.11.0.zip
# unzip knox-0.11.0.zip
# /opt/knox/knox-0.11.0/bin/ldap.sh start
2) Make sure the LDAP server is started and running on port 33389 on your server:
# lsof -i:33389
OR
# netstat -anp | grep 33389
3) The below credentials are part of the Knox demo LDAP we just started. We can use any of these users to log in to NiFi after integration.
tom/tom-password
admin/admin-password
sam/sam-password
guest/guest-password
Configuring NiFi for LDAP Authentication via Ambari
1. Log in to the Ambari UI at the server URL, click on the NiFi service, then click on the Config tab, expand the "Advanced nifi-ambari-ssl-config" section, and update the configuration as below:
Initial Admin Identity : uid=admin,ou=people,dc=hadoop,dc=apache,dc=org
Enable SSL? : {click check box}
Key password : hadoop
Keystore password : hadoop
Keystore type : JKS 2. Enter Below as the Truststore and DN configurations : Truststore password : hadoop
Truststore type : JKS
NiFi CA DN prefix : CN=
NiFi CA DN suffix : , OU=NIFI 3. Provide the configuration as below for Node Identities and keystore details: NiFi CA Force Regenerate? : {click check box}
NiFi CA Token : hadoop
Node Identities :
<property name="Node Identity 1">CN=node1, OU=NIFI</property>
Tip: Say I have a 3-node cluster with node1, node2 and node3 as part of it; the above configuration then looks like below:
<property name="Node Identity 1">CN=node1, OU=NIFI</property>
<property name="Node Identity 2">CN=node2, OU=NIFI</property>
<property name="Node Identity 3">CN=node3, OU=NIFI</property> 4. In the Ambari UI, choose NiFi service and select config tab. We have to update two set of properties, in the “Advanced nifi-properties ” section update nifi.security.user.login.identity.provider as ldap-provider nifi.security.user.login.identity.provider=ldap-provider
5. Now in the “Advanced nifi-login-identity-providers-env ” section, update the “Template for loginidentity- providers.xml “ property with below configurations just above </loginIdentityProviders> <provider>
<identifier>ldap-provider</identifier>
<class>org.apache.nifi.ldap.LdapProvider</class>
<property name="Authentication Strategy">SIMPLE</property>
<property name="Manager DN">uid=admin,ou=people,dc=hadoop,dc=apache,dc=org</property>
<property name="Manager Password">admin-password</property>
<property name="TLS - Keystore">/usr/hdf/current/nifi/conf/keystore.jks</property>
<property name="TLS - Keystore Password">hadoop</property>
<property name="TLS - Keystore Type">JKS</property>
<property name="TLS - Truststore">/usr/hdf/current/nifi/conf/truststore.jks</property>
<property name="TLS - Truststore Password">hadoop</property>
<property name="TLS - Truststore Type">JKS</property>
<property name="TLS - Client Auth"></property>
<property name="TLS - Protocol">TLS</property>
<property name="TLS - Shutdown Gracefully"></property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url">ldap://node1:33389</property>
<property name="User Search Base">ou=people,dc=hadoop,dc=apache,dc=org</property>
<property name="User Search Filter">uid={0}</property>
<property name="Authentication Expiration">12 hours</property>
</provider>
6. Once all properties are updated, click save, and when prompted, click restart.
7. Once restarted, you can try connecting to the NiFi URL. You should see the below screen; enter the credentials below for the admin user (the configured Initial Admin Identity) and click LOG IN:
https://node1:9091/nifi/ --> in my case the host is node1
admin/admin-password
8. You should be able to log in as the admin user for NiFi and should see the below UI:
Adding a User and Providing Access to the UI
1) Let us go ahead and create a user jobin in LDAP so that we can give him access to the NiFi UI.
2) Edit the users.ldif file in the knox conf directory with the below entry and restart the server:
# vi /opt/knox/knox-0.11.0/conf/users.ldif
Add the below entry to the end of the file:
# entry for sample user jobin
dn: uid=jobin,ou=people,dc=hadoop,dc=apache,dc=org
objectclass:top
objectclass:person
objectclass:organizationalPerson
objectclass:inetOrgPerson
cn: jobin
sn: jobin
uid: jobin
userPassword:jobin-password
3) Once added, let's stop and start the LDAP server:
# /opt/knox/knox-0.11.0/bin/ldap.sh stop
# /opt/knox/knox-0.11.0/bin/ldap.sh start
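As an optional check before going back to the NiFi UI, you can verify the new entry directly against the demo LDAP server (assuming ldapsearch from openldap-clients is available on the host):
# ldapsearch -x -H ldap://node1:33389 -D "uid=admin,ou=people,dc=hadoop,dc=apache,dc=org" -w admin-password -b "ou=people,dc=hadoop,dc=apache,dc=org" "uid=jobin"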
4) While logged in as admin on the NiFi UI, let us add a user jobin with the below ID by clicking the '+ user' button in the top-right 'users' menu, like below:
uid=jobin,ou=people,dc=hadoop,dc=apache,dc=org
Enter the above value and click OK.
5. Now close the users window and open the 'policies' window from the management menu in the top right corner, below the 'users' menu. Click the "+ user" button in the top right corner; in the pop-up, enter jobin, select the user, and click OK.
6. Once the policy is added, it will look like below:
7. Now you may log out as admin and provide the below credentials to log in as the 'jobin' user:
jobin/jobin-password
8. You should be able to log in and view the UI, but you won't have the privilege to add anything to the canvas (as jobin is given only read access). You may log back in as admin and grant the required access.
This completes the tutorial. You have successfully:
- Installed and configured HDF 2.0 on your server.
- Downloaded and started the Knox demo LDAP server.
- Configured NiFi to use the Knox LDAP to authenticate users, where the NiFi Initial Admin is from LDAP.
- Restarted NiFi and verified access for the admin user in the NiFi UI.
- Created a new user jobin in LDAP, added him to the NiFi user list, and gave him read access.
- Verified access for user jobin.
Thanks, Jobin George
- Find more articles tagged with:
- Ambari
- hdf
- how-to-tutorial
- How-ToTutorial
- Knox
- LDAP
- NiFi
- Security
01-25-2017
12:12 AM
4 Kudos
Introduction
Using NiFi, data can be exposed in such a way that a receiver can pull from it by adding an Output Port to the root process group. For Storm, we will use this same mechanism - we will use the Site-to-Site protocol to pull data from NiFi's Output Ports. In this tutorial we learn to capture the NiFi app log from the Sandbox, parse it using Java regex, and ingest it to Phoenix via Storm, or directly using the NiFi PutSQL processor.
Prerequisites
1) Assuming you already have the latest version of NiFi-1.x/HDF-2.x downloaded as a zip file (HDF and HDP cannot be managed by Ambari on the same nodes as of now) onto your HW Sandbox version 2.5; else execute the below after ssh connectivity to the sandbox is established:
# cd /opt/
# wget http://public-repo-1.hortonworks.com.s3.amazonaws.com/HDF/centos6/2.x/updates/2.0.1.0/HDF-2.0.1.0-centos6-tars-tarball.tar.gz
# tar -xvf HDF-2.0.1.0-12.tar.gz
2) Storm and Zeppelin are installed on your VM and started.
3) HBase is installed with the Phoenix Query Server.
4) Make sure Maven is installed; if not already, execute the below steps:
# curl -o /etc/yum.repos.d/epel-apache-maven.repo https://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo
# yum -y install apache-maven
# mvn -version
Configuring and Creating Tables in HBase via Phoenix
1) Make sure the HBase components as well as the Phoenix Query Server are started.
2) Make sure HBase is up and running and out of maintenance mode, and that the below properties are set (if not, set them and restart the services):
- Enable Phoenix --> Enabled
- Enable Authorization --> Off
3) Connect to the Phoenix shell (or use Zeppelin):
# /usr/hdp/current/phoenix-client/bin/sqlline.py sandbox.hortonworks.com:2181:/hbase-unsecure
4) Execute the below in the Phoenix shell to create tables in HBase:
CREATE TABLE NIFI_LOG( UUID VARCHAR NOT NULL, EVENT_DATE DATE, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));
CREATE TABLE NIFI_DIRECT( UUID VARCHAR NOT NULL, EVENT_DATE VARCHAR, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));
Configuring and Starting NiFi
1) Open nifi.properties to update the configuration:
# vi /opt/HDF-2.0.1.0/nifi/conf/nifi.properties
2) Change the NiFi http port to 9090, as the default 8080 will conflict with the Ambari web UI:
# web properties #
nifi.web.http.port=9090
3) Configure the NiFi instance to run site-to-site by changing the below configuration: add a port, say 8055, and set "nifi.remote.input.secure" to "false":
# Site to Site properties #
nifi.remote.input.socket.port=8055
nifi.remote.input.secure=false
4) Now start [restart if already running, for the configuration change to take effect] NiFi on your Sandbox:
# /opt/HDF-2.0.1.0/nifi/bin/nifi.sh start
5) Make sure NiFi is up and running by connecting to its Web UI from your browser: http://your-vm-ip:9090/nifi/
Building a Flow in NiFi to fetch and parse nifi-app.log
1) Let us build a small flow on the NiFi canvas to read the app log generated by NiFi itself and feed it to Storm:
2) Drop a "TailFile" processor to the canvas to read lines added to "/opt/HDF-2.0.1.0/nifi/logs/nifi-user.log". Auto-terminate the Failure relationship.
3) Drop a "SplitText" processor to the canvas to split the log file into separate lines. Auto-terminate the Original and Failure relationships for now. Connect the TailFile processor to the SplitText processor for the Success relationship.
4) Drop an "ExtractText" processor to the canvas to extract portions of the log content to attributes as below. Connect the SplitText processor to the ExtractText processor for the splits relationship.
- BULLETIN_LEVEL : ([A-Z]{4,5})
- CONTENT : (^.*)
- EVENT_DATE : ([^,]*)
- EVENT_TYPE : (?<=\[)(.*?)(?=\])
5) Drop an OutputPort to the canvas and name it "OUT". Once added, connect "ExtractText" to the port for the matched relationship. The flow would look similar to below:
6) Start the flow on NiFi and notice that data is stuck in the connection before the output port "OUT".
Building the Storm application jar with Maven
1) To begin with, let's clone the artifacts; feel free to inspect the dependencies and NiFiStormStreaming.java:
# cd /opt/
# git clone https://github.com/jobinthompu/NiFi-Storm-Integration.git
2) Feel free to inspect pom.xml to verify the dependencies:
# cd /opt/NiFi-Storm-Integration
# vi pom.xml
3) Let's build the Storm jar with the artifacts (this might take several minutes):
# mvn package
4) Once the build is SUCCESSFUL, make sure NiFiStormTopology-Uber.jar is generated in the target folder:
# ls -l /opt/NiFi-Storm-Integration/target/NiFiStormTopology-Uber.jar
5) Now let us go ahead and submit the topology in Storm (make sure the NiFi flow created above is running before submitting the topology):
# cd /opt/NiFi-Storm-Integration
# storm jar target/NiFiStormTopology-Uber.jar NiFi.NiFiStormStreaming &
6) Let's go ahead and verify that the topology is submitted, on the Storm View in Ambari as well as the Storm UI:
Ambari UI: http://your-vm-ip:8080
Storm UI: http://your-vm-ip:8744/index.html
7) Let's go back to the NiFi Web UI; if everything worked fine, the data which was pending on the port OUT will be gone, as it was consumed by Storm.
8) Now let's connect to Phoenix and check out the data populated in the tables; you can use either the Phoenix sqlline command line or Zeppelin:
a) via Phoenix sqlline
# /usr/hdp/current/phoenix-client/bin/sqlline.py localhost:2181:/hbase-unsecure
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='ERROR' ORDER BY EVENT_DATE;
b) via Zeppelin for better visualization
Zeppelin UI: http://your-vm-ip:9995/
9) Now you can change the code as needed, re-build the jar, and re-submit the topology.
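You can also confirm the topology from the command line with the storm client (assuming it is installed on the node you submitted from); it should show the NiFi-Storm-Phoenix topology while it runs:
# storm list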
Extending the NiFi Flow to ingest data directly to Phoenix using the PutSQL processor
1) Let's go ahead and kill the Storm topology from the command line (or from the Ambari Storm View or Storm UI):
# storm kill NiFi-Storm-Phoenix
2) Log back into the NiFi UI currently running the flow and stop the entire flow.
3) Drop a RouteOnAttribute processor to the canvas for the Matched relation from the ExtractText processor, configure it with the below properties, and auto-terminate the unmatched relation.
DEBUG : ${BULLETIN_LEVEL:equals('DEBUG')}
ERROR : ${BULLETIN_LEVEL:equals('ERROR')}
INFO : ${BULLETIN_LEVEL:equals('INFO')}
WARN : ${BULLETIN_LEVEL:equals('WARN')}
4) Drop an AttributesToJSON processor to the canvas with the below configuration and connect RouteOnAttribute's DEBUG, ERROR, INFO, and WARN relations to it.
Attributes List : uuid,EVENT_DATE,BULLETIN_LEVEL,EVENT_TYPE,CONTENT
Destination : flowfile-content
5) Create and enable a DBCPConnectionPool controller service named "Phoenix-Storm" with the below configuration:
Database Connection URL : jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure
Database Driver Class Name : org.apache.phoenix.jdbc.PhoenixDriver
Database Driver Location(s) : /usr/hdp/current/phoenix-client/phoenix-client.jar
6) Drop a ConvertJSONToSQL processor to the canvas with the below configuration, connect AttributesToJSON's success relation to it, and auto-terminate the Failure relation for now after connecting it to the Phoenix-Storm DBCP controller service.
7) Drop a ReplaceText processor to the canvas to update INSERT statements to UPSERT for Phoenix with the below configuration, connect the sql relation of ConvertJSONToSQL to it, and auto-terminate the original and Failure relations.
8) Finally, add a PutSQL processor with the below configuration, connect it to ReplaceText's success relation, and auto-terminate all of its relations.
9) The final flow, including both ingestion via Storm and direct ingestion to Phoenix using PutSQL, is complete; it should look similar to below:
10) Now go ahead and start the flow to ingest data to both tables, via Storm and directly from NiFi.
11) Log back into Zeppelin to see if data is populated in the NIFI_DIRECT table:
%jdbc(phoenix)
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='INFO' ORDER BY EVENT_DATE
- Too lazy to create the flow? Download my flow template here.
This completes the tutorial. You have successfully:
- Installed and configured HDF 2.0 on your HDP-2.5 Sandbox.
- Created a data flow to pull logs, parse them, and make them available on a Site-to-Site enabled NiFi port.
- Created a Storm topology to consume data from NiFi via Site-to-Site and ingest it to HBase via Phoenix.
- Directly ingested data to Phoenix with the PutSQL processor in NiFi without using Storm.
- Viewed the ingested data from the Phoenix command line and Zeppelin.
References:
bbende's - nifi-storm Github Repo
Thanks, Jobin George
- Find more articles tagged with:
- Data Ingestion & Streaming
- HBase
- hdf
- How-ToTutorial
- NiFi
- Phoenix
- Storm