Member since
10-09-2015
86
Posts
179
Kudos Received
8
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
25088 | 12-29-2016 05:19 PM | |
1834 | 12-17-2016 06:05 PM | |
14689 | 08-24-2016 03:08 PM | |
2142 | 07-14-2016 02:35 AM | |
3980 | 07-08-2016 04:29 PM |
02-13-2017
09:29 AM
4 Kudos
Hi @Avijeet Dash, what @Jobin George suggested helps share common static configurations across various parts of a NiFi flow. In addition, if you'd like to know how to Put/Get from the distributed cache, and how to enrich FlowFiles with cached values, this example might be helpful. The template file is available here: https://gist.github.com/ijokarumawak/8ba9a2a1b224603f877e960a942a6f2b Thanks, Koji
02-06-2017
09:27 PM
@Saurabh Verma How to change the JDK version for an existing cluster
1) Re-run Ambari Server Setup.
ambari-server setup
2) At the prompt to change the JDK, enter y.
Do you want to change Oracle JDK [y/n] (n)? y
3) At the prompt to choose a JDK, enter 1 to change the JDK to v1.8.
[1] - Oracle JDK 1.8
[2] - Oracle JDK 1.7
[3] - Custom JDK
4) If you choose Oracle JDK 1.8 or Oracle JDK 1.7, the JDK you choose downloads and installs automatically on the Ambari Server host. This option requires an internet connection. You must install this JDK on all hosts in the cluster at this same path.
5) If you choose Custom JDK, verify or add the custom JDK path on all hosts in the cluster. Use this option if you want to use OpenJDK or do not have an internet connection (and have pre-installed the JDK on all hosts).
6) After setup completes, you must restart each component for the new JDK to be used.
Important: You must also update your JCE security policy files on the Ambari Server and all hosts in the cluster to match the new JDK version. If you do not update the JCE to match the JDK, you may have issues starting services. Refer to the Ambari Security Guide for more information on installing the JCE.
Thanks, Matt
09-10-2018
02:12 PM
I faced the same issue of not receiving emails even though the configuration was set as described in the article above. I then added the property below to bootstrap-notification-services.xml and it worked for me. Since my SMTP server did not require username and password authentication, I deleted those properties and added this one: <property name="SMTP Auth">false</property> (Note that this property has to be added; it is not present in the default XML. It needs to be set to false because its default value is true. Reference: https://github.com/apache/nifi/blob/master/nifi-bootstrap/src/main/java/org/apache/nifi/bootstrap/notification/email/EmailNotificationService.java) Please vote up if my answer helped.
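As a rough illustration of the change described above, here is a minimal Python sketch that removes the username and password properties and sets "SMTP Auth" to false. The sample XML is a trimmed, made-up service definition, and the function name disable_smtp_auth is hypothetical; the element layout (services / service / property) is an assumption about the notification services file, so compare it against your own file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down service definition (assumed layout, made-up values).
SAMPLE = """<services>
    <service>
        <id>email-notification</id>
        <class>org.apache.nifi.bootstrap.notification.email.EmailNotificationService</class>
        <property name="SMTP Hostname">smtp.example.com</property>
        <property name="SMTP Port">25</property>
        <property name="SMTP Username">user</property>
        <property name="SMTP Password">secret</property>
    </service>
</services>"""


def disable_smtp_auth(xml_text):
    """Drop SMTP credentials and set 'SMTP Auth' to false on email services."""
    root = ET.fromstring(xml_text)
    for service in root.iter("service"):
        # Only touch services backed by the email notification class.
        if not service.findtext("class", "").endswith("EmailNotificationService"):
            continue
        # Remove the credential properties, which are not needed without auth.
        for prop in list(service.findall("property")):
            if prop.get("name") in ("SMTP Username", "SMTP Password"):
                service.remove(prop)
        # Add the 'SMTP Auth' property if it is missing, then set it to false.
        auth = next(
            (p for p in service.findall("property") if p.get("name") == "SMTP Auth"),
            None,
        )
        if auth is None:
            auth = ET.SubElement(service, "property", {"name": "SMTP Auth"})
        auth.text = "false"
    return ET.tostring(root, encoding="unicode")


updated = disable_smtp_auth(SAMPLE)
print(updated)
```

Note that ElementTree does not preserve XML comments or the original indentation, so for a real file a manual edit may be the simpler route.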
05-01-2019
03:50 PM
It is confusing what triggers this task to run. Do you have any additional info on that or know if there is any way to configure it more precisely?
11-02-2017
03:30 PM
@Jobin George Can you please suggest how to stop the NiFi UI from logging in as the anonymous user by default? I can log in to the NiFi UI with my LDAP user, but NiFi is also accessible as the anonymous user without a password, and I want to disable that. If I remove {user} from the user section of the Ranger policy, I can no longer log in to the NiFi UI with the LDAP user, and the default anonymous login also stops working. Please suggest. A brief description is available at the link below. https://community.hortonworks.com/questions/142667/how-to-give-permissions-to-users-to-access-nifi-ui.html?childToView=145984#answer-145984
01-25-2017
12:12 AM
4 Kudos
Introduction
Using NiFi, data can be exposed in such a way that a receiver can pull from it by adding an Output Port to the root process group. For Storm, we will use this same mechanism: we will use the Site-to-Site protocol to pull data from NiFi's Output Ports. In this tutorial we learn to capture the NiFi app log from the Sandbox, parse it using Java regex, and ingest it into Phoenix either via Storm or directly using the NiFi PutSQL processor.
Prerequisites
1) This tutorial assumes you have already downloaded the latest version of NiFi-1.x/HDF-2.x as a zip file onto your HW Sandbox version 2.5 (HDF and HDP cannot be managed by Ambari on the same nodes as of now). If not, execute the commands below after SSH connectivity to the sandbox is established:
# cd /opt/
# wget http://public-repo-1.hortonworks.com.s3.amazonaws.com/HDF/centos6/2.x/updates/2.0.1.0/HDF-2.0.1.0-centos6-tars-tarball.tar.gz
# tar -xvf HDF-2.0.1.0-12.tar.gz
2) Storm and Zeppelin are installed on your VM and started.
3) HBase is installed with Phoenix Query Server.
4) Make sure Maven is installed. If it is not, execute the steps below:
# curl -o /etc/yum.repos.d/epel-apache-maven.repo https://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo
# yum -y install apache-maven
# mvn -version
Configuring and Creating Table in HBase via Phoenix
1) Make sure the HBase components as well as Phoenix Query Server are started.
2) Make sure HBase is up and running and out of maintenance mode, and that the properties below are set (if not, set them and restart the services):
- Enable Phoenix --> Enabled
- Enable Authorization --> Off
3) Connect to the Phoenix shell (or use Zeppelin) to create the Phoenix tables:
# /usr/hdp/current/phoenix-client/bin/sqlline.py sandbox.hortonworks.com:2181:/hbase-unsecure
4) Execute the statements below in the Phoenix shell to create the tables in HBase:
CREATE TABLE NIFI_LOG( UUID VARCHAR NOT NULL, EVENT_DATE DATE, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));
CREATE TABLE NIFI_DIRECT( UUID VARCHAR NOT NULL, EVENT_DATE VARCHAR, BULLETIN_LEVEL VARCHAR, EVENT_TYPE VARCHAR, CONTENT VARCHAR CONSTRAINT pk PRIMARY KEY(UUID));
Configuring and Starting NiFi
1) Open nifi.properties to update the configuration:
# vi /opt/HDF-2.0.1.0/nifi/conf/nifi.properties
2) Change the NiFi HTTP port to 9090, as the default 8080 will conflict with the Ambari web UI:
# web properties #
nifi.web.http.port=9090
3) Configure the NiFi instance to run Site-to-Site by changing the configuration below: add a port, say 8055, and set "nifi.remote.input.secure" to "false":
# Site to Site properties #
nifi.remote.input.socket.port=8055
nifi.remote.input.secure=false
4) Now start NiFi on your Sandbox (restart it if it is already running, so that the configuration change takes effect):
# /opt/HDF-2.0.1.0/nifi/bin/nifi.sh start
5) Make sure NiFi is up and running by connecting to its Web UI from your browser:
http://your-vm-ip:9090/nifi/
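If you prefer to script the nifi.properties changes above rather than edit the file by hand, the idea can be sketched in Python as follows. This is only an illustration: set_properties is a hypothetical helper, and the sample text is a made-up three-property excerpt, not the full file. To use it on a real installation you would read and write the nifi.properties file yourself.

```python
def set_properties(text, updates):
    """Return properties text with the given key=value pairs applied.

    Existing keys are rewritten in place; keys that are not found are
    appended at the end. Comment lines (starting with '#') are left alone.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_property = "=" in line and not line.lstrip().startswith("#")
        if is_property and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any keys that were not present in the original text.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Made-up excerpt standing in for the real nifi.properties content.
SAMPLE = """# web properties #
nifi.web.http.port=8080
# Site to Site properties #
nifi.remote.input.socket.port=
nifi.remote.input.secure=true
"""

updated = set_properties(
    SAMPLE,
    {
        "nifi.web.http.port": "9090",
        "nifi.remote.input.socket.port": "8055",
        "nifi.remote.input.secure": "false",
    },
)
print(updated)
```

NiFi still needs a restart after the file changes, as noted in step 4.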
Building a Flow in NiFi to fetch and parse nifi-app.log
1) Let us build a small flow on the NiFi canvas to read the app log generated by NiFi itself and feed it to Storm.
2) Drop a "TailFile" processor onto the canvas to read lines added to "/opt/HDF-2.0.1.0/nifi/logs/nifi-user.log". Auto-terminate the failure relationship.
3) Drop a "SplitText" processor onto the canvas to split the log file into separate lines. Auto-terminate the original and failure relationships for now. Connect the TailFile processor to the SplitText processor for the success relationship.
4) Drop an "ExtractText" processor onto the canvas to extract portions of the log content into attributes, as below. Connect the SplitText processor to the ExtractText processor for the splits relationship.
- BULLETIN_LEVEL : ([A-Z]{4,5})
- CONTENT : (^.*)
- EVENT_DATE : ([^,]*)
- EVENT_TYPE : (?<=\[)(.*?)(?=\])
5) Drop an Output Port onto the canvas and name it "OUT". Once added, connect "ExtractText" to the port for the matched relationship. The flow would look similar to the one below:
6) Start the flow in NiFi and notice that data queues up in the connection before the output port "OUT".
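To see what the four ExtractText patterns above pull out of a log line, here is a small Python sketch. It assumes the log line follows the usual "date level [thread] logger message" layout; the sample line is made up for illustration, and extract_attributes is a hypothetical helper. The patterns only use constructs that Python's re module and Java regex share, and the sketch takes the first match of each pattern, which is an assumption about how the processor applies them.

```python
import re

# The four patterns from the ExtractText configuration in the tutorial.
PATTERNS = {
    "BULLETIN_LEVEL": r"([A-Z]{4,5})",
    "CONTENT": r"(^.*)",
    "EVENT_DATE": r"([^,]*)",
    "EVENT_TYPE": r"(?<=\[)(.*?)(?=\])",
}


def extract_attributes(line):
    """Apply each pattern to the line and keep the first capture group."""
    attrs = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, line)
        if match:
            attrs[name] = match.group(1)
    return attrs


# Made-up log line in the assumed "date level [thread] logger message" layout.
SAMPLE_LINE = (
    "2017-01-25 00:12:01,123 INFO [Timer-Driven Process Thread-4] "
    "o.a.n.processors.standard.TailFile sample message"
)

attrs = extract_attributes(SAMPLE_LINE)
for name, value in attrs.items():
    print(f"{name}: {value}")
```

Because BULLETIN_LEVEL is unanchored, it picks up the first run of four or five capital letters in the line. That works here because the level appears before any other uppercase text, but a line with a different layout could match something else.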
Building Storm application jar with maven
1) To begin with, let's clone the artifacts. Feel free to inspect the dependencies and NiFiStormStreaming.java.
# cd /opt/
# git clone https://github.com/jobinthompu/NiFi-Storm-Integration.git
2) Feel free to inspect pom.xml to verify the dependencies.
# cd /opt/NiFi-Storm-Integration
# vi pom.xml
3) Let's rebuild the Storm jar with the artifacts (this might take several minutes).
# mvn package
4) Once the build is SUCCESSFUL, make sure NiFiStormTopology-Uber.jar is generated in the target folder:
# ls -l /opt/NiFi-Storm-Integration/target/NiFiStormTopology-Uber.jar
5) Now let us go ahead and submit the topology to Storm (make sure the NiFi flow created above is running before submitting the topology).
# cd /opt/NiFi-Storm-Integration
# storm jar target/NiFiStormTopology-Uber.jar NiFi.NiFiStormStreaming &
6) Let's go ahead and verify that the topology is submitted, on the Storm View in Ambari as well as on the Storm UI:
Ambari UI: http://your-vm-ip:8080
Storm UI: http://your-vm-ip:8744/index.html
7) Go back to the NiFi Web UI. If everything worked, the data that was pending on the port OUT will be gone, as it was consumed by Storm.
8) Now let's connect to Phoenix and check the data populated in the tables. You can use either the Phoenix sqlline command line or Zeppelin.
a) via Phoenix sqlline
# /usr/hdp/current/phoenix-client/bin/sqlline.py localhost:2181:/hbase-unsecure
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='ERROR' ORDER BY EVENT_DATE;
b) via Zeppelin for better visualization
Zeppelin UI: http://your-vm-ip:9995/
9) Now you can change the code as needed, rebuild the jar, and resubmit the topology.
Extending NiFi Flow to ingest data directly to Phoenix using PutSQL processor
1) Let's go ahead and kill the Storm topology from the command line (or from the Ambari Storm View or the Storm UI).
# storm kill NiFi-Storm-Phoenix
2) Log back in to the NiFi UI that is currently running the flow, and stop the entire flow.
3) Drop a RouteOnAttribute processor onto the canvas for the matched relationship from the ExtractText processor, configure it with the properties below, and auto-terminate the unmatched relationship.
DEBUG : ${BULLETIN_LEVEL:equals('DEBUG')}
ERROR : ${BULLETIN_LEVEL:equals('ERROR')}
INFO : ${BULLETIN_LEVEL:equals('INFO')}
WARN : ${BULLETIN_LEVEL:equals('WARN')}
4) Drop an AttributesToJSON processor onto the canvas with the configuration below and connect RouteOnAttribute's DEBUG, ERROR, INFO, and WARN relationships to it.
Attributes List : uuid,EVENT_DATE,BULLETIN_LEVEL,EVENT_TYPE,CONTENT
Destination : flowfile-content
5) Create and enable a DBCPConnectionPool named "Phoenix-Storm" with the configuration below:
Database Connection URL : jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure
Database Driver Class Name : org.apache.phoenix.jdbc.PhoenixDriver
Database Driver Location(s) : /usr/hdp/current/phoenix-client/phoenix-client.jar
6) Drop a ConvertJSONToSQL processor onto the canvas with the configuration below and connect AttributesToJSON's success relationship to it. After connecting it to the Phoenix-Storm DB controller service, auto-terminate the failure relationship for now.
7) Drop a ReplaceText processor onto the canvas to change the INSERT statements to UPSERT for Phoenix, with the configuration below. Connect the sql relationship of ConvertJSONToSQL to it and auto-terminate the original and failure relationships.
8) Finally, add a PutSQL processor with the configuration below, connect it to ReplaceText's success relationship, and auto-terminate all of its relationships.
9) The final flow, including both ingestion via Storm and direct ingestion into Phoenix using PutSQL, is complete. It should look similar to the one below:
10) Now go ahead and start the flow to ingest data into both tables, via Storm and directly from NiFi.
11) Log back in to Zeppelin to see whether data is populated in the NIFI_DIRECT table.
%jdbc(phoenix)
SELECT EVENT_DATE,EVENT_TYPE,BULLETIN_LEVEL FROM NIFI_DIRECT WHERE BULLETIN_LEVEL='INFO' ORDER BY EVENT_DATE
- Too lazy to create the flow? Download my flow template here.
This completes the tutorial. You have successfully:
- Installed and configured HDF 2.0 on your HDP-2.5 Sandbox.
- Created a data flow to pull logs, parse them, and make them available on a Site-to-Site enabled NiFi port.
- Created a Storm topology to consume data from NiFi via Site-to-Site and ingest it into HBase via Phoenix.
- Directly ingested data into Phoenix with the PutSQL processor in NiFi without using Storm.
- Viewed the ingested data from the Phoenix command line and Zeppelin.
References: bbende's nifi-storm GitHub repo
Thanks, Jobin George
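The ReplaceText step in the direct-ingest flow above exists because Phoenix writes rows with UPSERT rather than INSERT. The rewrite it performs can be sketched in Python as below. This is only an illustration of the text substitution: insert_to_upsert is a hypothetical helper, and the sample statement is an assumed example of a parameterized INSERT against the NIFI_DIRECT table, not captured processor output.

```python
import re


def insert_to_upsert(sql):
    """Rewrite a leading INSERT keyword to UPSERT, leaving the rest untouched."""
    # Anchoring at the start avoids touching the word INSERT inside values.
    return re.sub(r"^\s*INSERT\b", "UPSERT", sql, count=1, flags=re.IGNORECASE)


# Assumed example of a parameterized statement for the NIFI_DIRECT table.
SAMPLE_SQL = (
    "INSERT INTO NIFI_DIRECT (UUID, EVENT_DATE, BULLETIN_LEVEL, EVENT_TYPE, CONTENT) "
    "VALUES (?, ?, ?, ?, ?)"
)

converted = insert_to_upsert(SAMPLE_SQL)
print(converted)
```

Only the leading keyword is replaced, so a log message that happens to contain the word INSERT inside the statement text would not be altered.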
04-26-2019
12:00 PM
You can use the ReplaceText processor and, as the replacement value, reference the attribute you want to have in your flow file. One thing to keep in mind is that attributes are stored in memory.
12-19-2016
04:18 AM
Thank you very much Jobin George
12-14-2016
12:06 AM
Great! Maybe you can include the back pressure coloring feature and a screenshot as well.
02-24-2017
09:46 PM
@Matt Clarke G1GC was set as the garbage collector, and the issue was fixed in the next version of HDF. This one went unnoticed; accepting the answer. Thanks, Jobin George