1973 Posts
1225 Kudos Received
124 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1939 | 04-03-2024 06:39 AM |
| | 3042 | 01-12-2024 08:19 AM |
| | 1667 | 12-07-2023 01:49 PM |
| | 2440 | 08-02-2023 07:30 AM |
| | 3396 | 03-29-2023 01:22 PM |
12-22-2016
09:10 PM
6 Kudos
Setting up a WebSocket Client and Server with Apache NiFi 1.1

I wanted to test out the new WebSocket listener in Apache NiFi 1.1, but I needed a server to serve up my HTML client, so I ran that web server with NiFi as well. My full solution is hosted and runs through Apache NiFi. This simple WebSocket server and client does the hello world of WebSockets: Echo! Whatever the client sends, we send back.

My Suggested Use Cases for WebSockets

WebSocket client to Slack interface
WebSocket client to email
WebSocket chat stored to Apache Phoenix
WebSockets to communicate from mobile web apps
WebSockets to send detailed log data from enterprise web applications directly to a log ingest platform, bypassing the local filesystem

Step 1: HandleHttpRequest accepts the HTTP calls from browsers.
Step 2: ExecuteStreamCommand returns the HTML page (GetFile or any number of other approaches could produce the HTML as a flow file instead).
Step 3: HandleHttpResponse serves our web page to browsers. A StandardHttpContextMap controller service is required to store the HTTP requests and responses and share them through the stream.
Step 4: PutFile, just to keep logs of what's going on; I save all the flow files to the local file system.
Step 5: ListenWebSocket is the actual WebSocket server listener; it is what our client will talk to.
Step 6: PutWebSocket sends the reply back to the WebSocket client.

Web Sockets Server

Web Sockets Client (Static HTML5 Page with Javascript) Hosted on NiFi

Web Socket Conversation On the Client Side

A Shell Script to Output The HTML5 Javascript WebSocket Client

➜ nifi-1.1.0.2.1.1.0-2 cat server.sh
cat /Volumes/Transcend/Apps/nifi-1.1.0.2.1.1.0-2/wsclient.html
➜ nifi-1.1.0.2.1.1.0-2 cat wsclient.html
<!DOCTYPE HTML>
<html>
<head>
<script type="text/javascript">
function WebSocketTest()
{
if ("WebSocket" in window)
{
alert("WebSocket is supported by your Browser!");
// Let us open a web socket
var ws = new WebSocket("ws://localhost:9998/echo");
ws.onopen = function()
{
// Web Socket is connected; send() only works once the connection is open
ws.send("MSG: NIFI IS AWESOME");
alert("Message is sent...");
};
ws.onmessage = function (evt)
{
var received_msg = evt.data;
alert("Message is received...");
};
ws.onclose = function()
{
// websocket is closed.
alert("Connection is closed...");
};
}
else
{
// The browser doesn't support WebSocket
alert("WebSocket NOT supported by your Browser!");
}
}
</script>
</head>
<body>
<div id="sse">
<a href="javascript:WebSocketTest()">Run WebSocket</a>
</div>
</body>
</html>

Construct a Visual Web Server to Serve up the Static HTML5 WebSocket Client Page

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.http.StandardHttpContextMap/index.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpRequest/index.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpRequest/additionalDetails.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpResponse/index.html
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.HandleHttpResponse/additionalDetails.html

Listen For WebSockets

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.websocket.ListenWebSocket/index.html

Put the Message Back

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.websocket.PutWebSocket/index.html

Reference:

http://www.infoworld.com/article/2609720/application-development/9-killer-uses-for-websockets.html
https://en.wikipedia.org/wiki/WebSocket
https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API
https://tools.ietf.org/html/rfc6455

NiFi Template (This is for Apache NiFi 1.1.x): websockets.xml
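To sanity-check the flow without opening a browser, you can hit both endpoints from the command line. This is a sketch under a couple of assumptions: that HandleHttpRequest is listening on port 9999 (the port is whatever you configured on the processor) and that the Node.js wscat tool is available for the WebSocket side.

# Fetch the static HTML client served by the HandleHttpRequest/HandleHttpResponse pair
# (port 9999 is an assumption; use whatever Listening Port you set on HandleHttpRequest)
curl http://localhost:9999/

# Install a command-line WebSocket client (requires Node.js/npm)
npm install -g wscat

# Connect to ListenWebSocket and type a line; the flow echoes it back via PutWebSocket
wscat -c ws://localhost:9998/echo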
12-22-2016
08:16 PM
4 Kudos
Running Livy on HDP 2.5

Ingest the Metrics REST API from Livy with Apache NiFi / HDF. Use GetHTTP to ingest the status of your batch from Livy.

Running Livy

The first step is to download Livy from GitHub. Installing on HDP 2.5 is simple: I found a node that wasn't too busy and put the project there. Running it is simple too:

export SPARK_HOME=/usr/hdp/current/spark-client/
export HADOOP_CONF_DIR=/etc/hadoop/conf
nohup ./livy-server &

That's it: you have a basic, unprotected Livy instance running. This is important; there's no security on it, so you should either put Knox in front of it or enable Livy's security. I wanted to submit a Scala Spark batch job, so I wrote a quick one below to have something to call.

Source Code for Example Spark 1.6.2 Batch Application:
https://github.com/tspannhw/links
https://community.hortonworks.com/repos/73816/links-spark-batch-application-example.html?shortDescriptionMaxLength=140

Step 1: GetFile picks up the file /opt/demo/sparkrun.js, whose JSON triggers Spark through Livy:

{"file": "/apps/Links.jar","className": "com.dataflowdeveloper.links.Links"}

Step 2: PostHTTP makes the call to the Livy REST API to submit the Spark job.
Step 3: PutHDFS stores the results of the call in Hadoop HDFS.

Livy Logs

16/12/21 22:50:25 INFO LivyServer: Using spark-submit version 1.6.2
16/12/21 22:50:25 WARN RequestLogHandler: !RequestLog
16/12/21 22:50:25 INFO WebServer: Starting server on http://tspanndev11.field.hortonworks.com:8998
16/12/21 22:51:20 INFO SparkProcessBuilder: Running '/usr/hdp/current/spark-client/bin/spark-submit' '--name' 'Livy' '--class' 'com.dataflowdeveloper.links.Links' 'hdfs://hadoopserver:8020/opt/demo/links.jar' '/linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json'
16/12/21 22:51:20 INFO SessionManager: Registering new session 0

The Compiled Spark JAR File Must Be Deployed to HDFS and Be Readable

hdfs dfs -put Links.jar /apps
hdfs dfs -chmod 777 /apps/Links.jar

Checking YARN for Our Application
yarn application -list
Submitting a Scala Spark Job, Normal Style

/bin/spark-submit --class "com.dataflowdeveloper.links.Links" --master yarn --deploy-mode cluster /opt/demo/Links.jar

Deploying a Scala Spark Application Built With SBT

scp target/scala-2.10/links.jar user@server:/opt/demo

Reference:

https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-livy-rest-interface
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-create-standalone-application
https://github.com/cloudera/livy
http://livy.io/quickstart.html
https://community.hortonworks.com/articles/73355/adding-a-custom-processor-to-nifi-linkprocessor.html
https://github.com/tspannhw/Logs
https://community.hortonworks.com/repos/73434/apache-spark-apache-logs-parser-example.html

Livy REST API

Livy REST API for interactive Spark sessions (JSON): http://server:8998/sessions
Livy stats via REST API (JSON): http://server:8998/healthcheck?pretty=true and http://server:8998/metrics?pretty=true
Livy batch submit status via REST API (JSON): http://server:8998/batches

To Submit to Livy from the Command Line

curl -X POST --data '{"file": "/opt/demo/links.jar","className": "com.dataflowdeveloper.links.Links","args": ["/linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json"]}' -H "Content-Type: application/json" http://server:8998/batches

NiFi Template: livy.xml
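Once a batch is submitted, the same REST API reports its progress, which is exactly what GetHTTP polls in the flow above. To check by hand from the command line (batch id 0 here stands in for whatever id the POST returned):

# Check the state of a submitted batch (substitute the id returned by the submit call)
curl http://server:8998/batches/0

# Fetch the driver log lines for that batch
curl http://server:8998/batches/0/log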
12-22-2016
07:26 PM
1 Kudo
Business Need: Build Tool Notifications + Fun

Sonic Pi is an open source, multi-platform live music synthesizer for building music via coding. It's a pretty neat environment and runs on a Mac, RPi, Windows machine, or Linux machine. I found a command-line tool that can trigger Sonic Pi to play your code, and I have wired up a dataflow from NiFi to Sonic Pi so I can trigger music on demand. My thought is to tie this into Jenkins and play different songs depending on what's going on, such as test failures, build complete, build broken, or new deployment, based on various steps in the SDLC.

Note: The machine this is running on will need Ruby, Gem, Apache NiFi, Sonic Pi, and a sound card.

Step 1: To trigger Sonic Pi, you must install the Sonic Pi CLI via gem install sonic-pi-cli.
Step 2: Run the Sonic Pi software and have it up and running. As you can see, it's a pretty awesome environment, somewhere between a nice IDE and a music tracker.
Step 3: ExecuteStreamCommand calls the command-line tool to trigger Sonic Pi (a wrapper script for this step is sketched after the example code below).

NiFi 1.1 Tool Shell Script

cat /Volumes/Transcend/Apps/nifi-1.1.0.2.1.1.0-2/nifi.rb | /usr/local/bin/sonic_pi

Example Sonic Pi Code

sample :bass_trance_c
sleep 1
sample :bass_thick_c
sleep 1
use_synth :prophet
play 30, release: 2
sleep 0.25
play 30
sleep 0.25
play 30
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30
sleep 0.25
use_synth :prophet
play 30
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
play 30
sleep 0.25
play 30
sleep 0.25
sample :bass_trance_c
sleep 1
sample :bass_thick_c
sleep 1
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :dsaw
play 30
sleep 0.25
sample :ambi_drone
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :dark_ambience
play 30, release: 2
sleep 0.25
use_synth :dark_ambience
play 30, release: 2
sleep 0.25
use_synth :dark_ambience
play 30, release: 2
sleep 0.25
use_synth :dark_ambience
play 30, release: 2
sleep 0.25
use_synth :dark_ambience
play 30, release: 2
sleep 0.25
use_synth :dark_ambience
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
use_synth :prophet
play 30, release: 2
sleep 0.25
sample :bass_trance_c
sleep 1
sample :bass_thick_c
sleep 1

Reference:
http://widdersh.in/controlling-sonic-pi-from-vim-or-anywhere-else/
https://gist.github.com/jwinder/e59be201082cca694df9
http://sonic-pi.net/

Apache NiFi Template: sonicpi.xml
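For the ExecuteStreamCommand step, one convenient pattern is a tiny wrapper script that pipes the flow file content straight into the Sonic Pi CLI, mirroring the cat | sonic_pi command above. A minimal sketch (the script name is hypothetical; it assumes sonic-pi-cli is installed and the Sonic Pi app is already running):

#!/bin/bash
# play.sh - hypothetical wrapper for NiFi's ExecuteStreamCommand: reads Sonic Pi
# code from stdin (the incoming flow file content) and hands it to the running
# Sonic Pi server via the sonic-pi-cli gem
cat - | /usr/local/bin/sonic_pi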
12-22-2016
05:58 PM
Installation Guide: http://kylin.apache.org/docs16/install/index.html

Install Kylin on a beefy node (one to start) that has the Hadoop client and all the configuration files. Start it and follow their guide to using the UI. Always watch for permissions.

See the instructions from Kylin for HDP here: http://kylin.apache.org/docs/install/hadoop_env.html

Reference: https://community.hortonworks.com/questions/55282/unified-bi-semantic-layer.html
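As a rough sketch of what that install boils down to on a Hadoop client node (the install path is an assumption; the scripts themselves ship with the Kylin binary package):

# Point KYLIN_HOME at the unpacked Kylin distribution (path is an assumption)
export KYLIN_HOME=/usr/local/apache-kylin

# Verify the Hadoop client environment and permissions, then start Kylin
$KYLIN_HOME/bin/check-env.sh
$KYLIN_HOME/bin/kylin.sh start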
12-21-2016
07:29 PM
GitHub: https://community.hortonworks.com/repos/73434/apache-spark-apache-logs-parser-example.html
12-21-2016
04:12 PM
Maybe make this your topic; try: myQueue?targetClient=1

You could also write your own processor: https://community.hortonworks.com/content/kbentry/73355/adding-a-custom-processor-to-nifi-linkprocessor.html
12-21-2016
02:28 PM
5 Kudos
Business Need

I needed to extract links from web pages using JSoup. I originally wrote a microservice for my NiFi MP3 jukebox, but I built a custom NiFi processor instead. It's a pretty simple process, and there are a ton of great articles on how to do it referenced below.

Custom NiFi Processor Development Process

One thing I found useful was having at least one JUnit to test running your processor, since deploying takes a while, especially if you need to deploy to a cluster of servers. I found a lot of great NiFi custom processor unit and integration tests online (see the reference area). It's really easy to develop custom processors in Java. In your tests you can input files and get out real files; keep a saved copy of what the valid output should be, and compare the actual output against it. Your JUnits can be triggered from Jenkins or other build tools, so NiFi custom processors fit your standard development process (TDD, CD, CI, autodeploy). To install the NAR, you just need to SCP your NAR file to the lib directory of every NiFi node and restart NiFi. If you want to deploy your flow templates, you can do so with this tool.

Test

package com.dataflowdeveloper.processors.process;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.UnsupportedEncodingException;
import java.util.List;
import org.apache.nifi.util.MockFlowFile;
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;
import org.junit.Before;
import org.junit.Test;
public class LinkProcessorTest {
private TestRunner testRunner;
@Before
public void init() {
testRunner = TestRunners.newTestRunner(LinkProcessor.class);
}
@Test
public void testProcessor() {
testRunner.setProperty("url", "http://sparkdeveloper.com");
try {
testRunner.enqueue(new FileInputStream(new File("src/test/resources/test.csv")));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
testRunner.run();
testRunner.assertValid();
List<MockFlowFile> successFiles = testRunner.getFlowFilesForRelationship(LinkProcessor.REL_SUCCESS);
for (MockFlowFile mockFile : successFiles) {
try {
System.out.println("FILE:" + new String(mockFile.toByteArray(), "UTF-8"));
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
}
}
}

Step 1: Ingest / parse a URL via NiFi (or source it from a file).
Step 2: LinkProcessor. The url property can be hardcoded or set via Expression Language from attributes of the previous processor.
Step 3: UpdateAttribute changes the name of the resulting file (add .json, make it unique).

Building the Link Processor

Run the included build.sh (on Linux or OS X), or run mvn install. Building requires JDK 8, Maven, and internet access. A consolidated build/test/deploy session is sketched after the example output below.

Deploy the NAR!

scp nifi-process-nar/target/nifi-linkextractor-nar-1.0-SNAPSHOT.nar PLACE:/place/

or copy it to your NIFI/lib directory locally. You can also get a release of the NAR from GitHub.

Maven Build Script (pom.xml)

<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.dataflowdeveloper</groupId>
<artifactId>linkextractor</artifactId>
<version>1.0-SNAPSHOT</version>
</parent>
<artifactId>nifi-process-processors</artifactId>
<packaging>jar</packaging>
<dependencies>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-processor-utils</artifactId>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-mock</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<!-- jsoup HTML parser library @ http://jsoup.org/ -->
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.10.1</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.0</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
</project>

Upgrades Recommended

When I add this to a production flow, I would upsert the results into Phoenix or convert them to ORC for a Hive table. See my other articles listed below for examples of doing just that.

Example File

hdfs dfs -cat /linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json
[{"link":"","descr":"http://www.dataflowdeveloper.com/#"},{"link":"","descr":"http://twitter.com/paasdev"},{"link":"","descr":"http://www.dataflowdeveloper.com/#"},{"link":"","descr":"http://www.dataflowdeveloper.com/#"},{"link":"DataFlow Developer","descr":"http://www.dataflowdeveloper.com/"},{"link":"Programmable OCR with Tesseract","descr":"http://www.dataflowdeveloper.com/2016/09/21/programmable-ocr-with-tesseract/"},{"link":"Python Text Searchinv","descr":"https://pypi.python.org/pypi/Whoosh/"},{"link":"Simple Bayes Text Classifier","descr":"https://pypi.python.org/pypi/simplebayes/"},{"link":"Python Wrapper for Tesseract","descr":"https://github.com/jflesch/pyocr/"},{"link":"Lector","descr":"https://github.com/zdenop/lector"},{"link":"VietOCR","descr":"http://vietocr.sourceforge.net/"},{"link":"OCRivist","descr":"http://www.ocrivist.com/"},{"link":"TesseractGUI","descr":"http://tesseract-gui.sourceforge.net/"},{"link":"Tesseract4J","descr":"https://github.com/tesseract4java/tesseract4java"},{"link":"Java wrapper for Tesseract","descr":"http://tess4j.sourceforge.net/"},{"link":"Basic Tesseract OCR Engine","descr":"https://github.com/tesseract-ocr/tesseract"},{"link":"Command Line Example Usage","descr":"https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage"},{"link":"Tesseract OCR Wiki","descr":"https://github.com/tesseract-ocr/tesseract/wiki"},{"link":"OCR Engine PDF","descr":"https://github.com/tesseract-ocr/docs/blob/master/das_tutorial2016/7Building%20a%20Multi-Lingual%20OCR%20Engine.pdf"},{"link":"Homebrew","descr":"http://brew.sh/"},{"link":"Running Tesseract from NiFI","descr":"https://issues.apache.org/jira/browse/NIFI-1815"},
Source Code

https://github.com/tspannhw/linkextractorprocessor

Articles for Storing to Phoenix, ORC and More

https://community.hortonworks.com/articles/34287/using-gui-sql-tools-against-hive-on-hdp-from-macos.html
https://community.hortonworks.com/articles/34362/parsing-apache-log-files-with-spark.html
https://community.hortonworks.com/content/kbentry/54947/reading-opendata-json-and-storing-into-phoenix-tab.html
https://community.hortonworks.com/content/kbentry/55839/reading-sensor-data-from-remote-sensors-on-raspber.html
https://community.hortonworks.com/content/kbentry/56642/creating-a-spring-boot-java-8-microservice-to-read.html
https://community.hortonworks.com/articles/59394/csv-to-avro-conversion-with-nifi.html
https://community.hortonworks.com/articles/59349/hdf-20-flow-for-ingesting-real-time-tweets-from-st.html
https://community.hortonworks.com/articles/59975/ingesting-edi-into-hdfs-using-hdf-20.html
https://community.hortonworks.com/articles/60480/using-images-stored-in-hdfs-for-web-pages.html
https://community.hortonworks.com/articles/58265/analyzing-images-in-hdf-20-using-tensorflow.html
https://community.hortonworks.com/articles/61180/streaming-ingest-of-google-sheets-into-a-connected.html
https://community.hortonworks.com/articles/61717/ingesting-jms-messages-to-hdfs-via-hdf-20.html
https://community.hortonworks.com/articles/47854/accessing-facebook-page-data-from-apache-nifi.html
https://community.hortonworks.com/articles/63228/monitoring-your-containers-with-sysdig-from-hdf-20.html
http://hortonworks.com/blog/hdf-2-0-flow-processing-real-time-tweets-strata-hadoop-slack-tensorflow-phoenix-zeppelin/
https://community.hortonworks.com/articles/64069/converting-a-large-json-file-into-csv.html
https://community.hortonworks.com/articles/64122/incrementally-streaming-rdbms-data-to-your-hadoop.html

Reference

http://jsonpath.com/
https://git-wip-us.apache.org/repos/asf?p=nifi.git;a=tree;f=nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard
http://datadidit.com/blog/index.php/2016/12/19/how-to-make-a-custom-nifi-processor/
http://goessner.net/articles/JsonPath/index.html#e2
https://github.com/abajwa-hw/nifi-network-processor
http://www.nifi.rocks/developing-a-custom-apache-nifi-processor-json/
https://github.com/aperepel/nifi-workshop
http://bryanbende.com/development/2015/02/04/custom-processors-for-apache-nifi
https://www.compose.com/articles/what-you-need-to-know-to-extend-nifi/
http://stampedecon.com/blog/2016/06/01/apache-nifi-not-from-scratch/

Example Flow File: link-processor.xml
12-21-2016
03:42 AM
https://community.hortonworks.com/questions/31915/adding-new-node-to-cluster.html
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_administration/content/ref-4303e343-9aee-4e70-b38a-2837ae976e73.1.html
12-20-2016
11:33 PM
Zeppelin is a different story: http://hortonworks.com/blog/introduction-to-data-science-with-apache-spark/

Try this tutorial: http://hortonworks.com/hadoop-tutorial/intro-machine-learning-apache-spark-apache-zeppelin/