12-29-2018
05:49 AM
4 Kudos
Implementing Streaming Machine Learning and Deep Learning In Production Part 1

After we have done our data exploration with Apache Zeppelin, Hortonworks Data Analytics Studio and other data science notebooks and tools, we will start building iterations of ever-improving models that need to be used in live environments. These will need to run at scale and score millions of records in real-time streams. They can be in various frameworks, versions and types, with many options for the data required. There are a number of things we need to think about when doing this.

Model Deployment Options
Apache Spark
Apache Storm (Hortonworks Streaming Analytics Manager - SAM)
Apache Kafka Streams
Apache NiFi
YARN 3.1
YARN Submarine
TensorFlow Serving on YARN
Cloudera Data Science Workbench

Requirements
Classification REST API
Security
Automation
Data Lineage
Schema Versioning, REST API and Management
Data Provenance
Scripting
Integration with Kafka
Containerized Services Support
Docker Containers running on YARN
Support for Dockerized Spark Jobs
Model Registry
Scalability
Data Variety
Data and Storage Format Flexibility
Handling Media Types such as images, sound and video

Required Elements
Apache NiFi 1.8.0
Apache Kafka 2.0
Apache Kafka Streams 2.0
Apache Atlas 1.0.0
Apache Ranger 1.2.0
Apache Knox 1.0
Hortonworks Streams Messaging Manager 1.2.0
Hortonworks Schema Registry 0.5.2
NiFi Registry 0.2.0
Apache Hadoop 3.1
Apache YARN 3.1+
Apache HDFS or Amazon S3
Apache Druid 0.12.1
Apache HBase 2.0

Apache Spark - Apache NiFi

There are a number of options for running machine learning models in production via Apache NiFi. I have used these methods:
Apache NiFi to Apache Spark Integration via Kafka and Spark Streaming
Apache NiFi to Apache Spark Integration via Kafka and Spark Structured Streaming
Apache NiFi to Apache Spark Integration via Apache Livy

https://community.hortonworks.com/articles/174105/hdp-264-hdf-31-apache-spark-structured-streaming-i.html
https://community.hortonworks.com/content/kbentry/171787/hdf-31-executing-apache-spark-via-executesparkinte.html

Hadoop - YARN 3.1 - No Docker - No Spark

We can deploy Deep Learning models and run classification (as well as training) on YARN natively.

https://community.hortonworks.com/content/kbentry/222242/running-apache-mxnet-deep-learning-on-yarn-31-hdp.html
https://community.hortonworks.com/articles/224268/running-tensorflow-on-yarn-31-with-or-without-gpu.html

Apache Kafka Streams

Kafka Streams has full integration with platform services including Schema Registry, Ranger and Ambari.

Apache NiFi Native Java Processors for Classification

We can use a custom processor in Java that runs as a native part of the dataflow.

https://community.hortonworks.com/content/kbentry/116803/building-a-custom-processor-in-apache-nifi-12-for.html
https://github.com/tspannhw/nifi-tensorflow-processor
https://community.hortonworks.com/articles/229215/apache-nifi-processor-for-apache-mxnet-ssd-single.html
https://github.com/tspannhw/nifi-mxnetinference-processor

Apache NiFi Integration with a Model Server Native to a Framework

Apache MXNet has an open source model server with a full REST API that can easily be integrated with Apache NiFi.

https://community.hortonworks.com/articles/155435/using-the-new-mxnet-model-server.html
https://community.hortonworks.com/articles/223916/posting-images-with-apache-nifi-17-and-a-custom-pr.html
https://community.hortonworks.com/articles/177232/apache-deep-learning-101-processing-apache-mxnet-m.html

Running the Apache MXNet model server is easy:

mxnet-model-server --models SSD=resnet50_ssd_model.model --service ssd_service.py --port 9998

TensorFlow also has a model server that supports gRPC and REST: https://www.tensorflow.org/serving/api_rest
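If you want to smoke-test a model server from Python before wiring it into NiFi, something like this sketch works. The predict route, model name and port are assumptions based on the mxnet-model-server command above (MMS 0.x conventions), and the response shape varies by model and service file.

import requests

# POST a local test image at the model server started above.
# "SSD" matches the model name from the command; the /predict route and
# the "data" form field follow MMS 0.x conventions - adjust as needed.
with open("test-image.jpg", "rb") as f:
    response = requests.post("http://localhost:9998/SSD/predict", files={"data": f})

print(response.status_code)
print(response.json())  # labels, probabilities and boxes; exact shape depends on the service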
Hortonworks Streaming Analytics Manager (SAM)

SAM supports running machine learning models exported as PMML as part of a flow.

https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.1/getting-started-with-streaming-analytics/content/export_the_model_into_sam%27s_model_registry.html
https://hortonworks.com/blog/part-4-sams-stream-builder-building-complex-stream-analytics-apps-without-code/

You can score the model in a fully graphical manner: https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.1/getting-started-with-streaming-analytics/content/score_the_model_using_the_pmml_processor_and_alert.html

Deep Work on Model Governance and Integration with Apache Atlas:
Customizing Atlas (Part1): Model governance, traceability and registry
Generalized Framework to Deploy Models and Integrate Apache Atlas for Model Governance
Customizing Atlas (Part2): Deep source metadata & embedded entities
Customizing Atlas (Part3): Lineage beyond Hadoop, including reports & emails

References:
https://conferences.oreilly.com/strata/strata-ny-2018/public/schedule/detail/68140
https://apachecon.dukecon.org/acna/2018/#/scheduledEvent/7058e0d4f5ab28836
https://dataworkssummit.com/berlin-2018/session/iot-with-apache-mxnet-and-apache-nifi-and-minifi/
https://dataworkssummit.com/berlin-2018/session/apache-deep-learning-101/
https://dataworkssummit.com/san-jose-2018/session/open-source-computer-vision-with-tensorflow-apache-minifi-apache-nifi-opencv-apache-tika-and-python-2/
https://www.slideshare.net/bunkertor/apache-deep-learning-201-philly-open-source
https://www.slideshare.net/bunkertor/running-apache-nifi-with-apache-spark-integration-options
12-28-2018
05:15 PM
3 Kudos
IoT Series: Sensors: Utilizing Breakout Garden Hat: Part 2 - Integrating MQTT, TensorFlow and Kafka Streams

See Part 1: https://community.hortonworks.com/content/kbentry/229522/iot-series-sensors-utilizing-breakout-garden-hat-p.html

In this second part, I have incremented the functionality in the Python capture, MiniFi, NiFi and post-NiFi processing, and I have added a Kafka Streams Java application. With this NiFi flow we consume the MQTT and Kafka messages sent by the Kafka Streams application. In one flow, we receive MQTT messages, pull out the entire flowfile as a message and send it to a Slack channel. In another flow we ingest two types of Kafka messages and store the JSON ones that have a schema in an HBase table via the record processor. In this flow we receive from the local NiFi router that was called by MiniFi over S2S/HTTP(s). We build two types of messages and send them to Kafka 2.0 brokers: one is the full JSON message with a schema, the other is just the temperature. We create a Kafka key from the UUID. We also process the images sent from MiniFi with my native Java TensorFlow Inception processor. I decided to try some TensorFlow processing for our infinite sensor loop; it may be too much memory usage, so I may have to pick a different TensorFlow model and switch to TF Lite (https://www.tensorflow.org/lite/devguide). You will note two extra attributes coming from the Python script running on the Raspberry Pi 3B+.

Another thing I wanted to do is try Kafka Streams, since in Kafka 2.0 in HDP and HDF we have a fully supported version available. So, based on example code, I wrote a simple Kafka Streams Java 8 application that reads the Kafka JSON messages sent from NiFi 1.8, checks for some conditions and pushes data out to MQTT and another Kafka topic.

If you don't have an MQTT broker, here is a quick way to install a Mosquitto MQTT broker on CentOS 7:

sudo yum -y install mosquitto
/etc/mosquitto/mosquitto.conf
mkdir -p /var/log/mosquitto
chmod -R 777 /var/log/mosquitto/
touch /var/log/mosquitto/mosquitto.log
sudo systemctl start mosquitto
sudo systemctl enable mosquitto

Now that we have an MQTT broker, our Kafka Streams app can send messages to it and NiFi can read messages from it. In a future version I will use Hortonworks Schema Registry and Avro.

I have updated the Python script to include TensorFlow and to update to Python 3.5. Make sure you run with Python 3.5 and have all the libraries installed on your RPi/Linux device. Some of the updated code for 3.5 (note the message encoding):

Python: https://github.com/tspannhw/minifi-breakoutgarden/blob/master/minifi35.py

def send_tcp(s, message):
    if not message:
        return
    try:
        s.sendall(message.encode('utf-8'))
    except:
        print("Failed to send message")

For testing IoT values, I have a GenerateFlowFile with this JSON: {
"systemtime" : "${now():format('MM/dd/yyyy HH:mm:ss')}",
"BH1745_green" : "${random():mod(100):plus(1)} ",
"ltr559_prox" : "0000",
"end" : "${now():format('yyyyMMddHHmmss')}",
"uuid" : "${now():format('yyyyMMddHHmmss')}_${UUID()}",
"lsm303d_accelerometer" : "+00.06g : -01.01g : +00.04g",
"imgnamep" : "images/bog_image_p_${now():format('yyyyMMddHHmmss')}_${UUID()}.jpg",
"cputemp" : ${random():mod(100):toNumber()},
"BH1745_blue" : "9.0",
"te" : "47.3427119255",
"bme680_tempc" : "28.19",
"imgname" : "images/bog_image_${now():format('yyyyMMddHHmmss')}_${UUID()}.jpg",
"bme680_tempf" : "80.${random():mod(100):toNumber()}",
"ltr559_lux" : "006.87",
"memory" : 34.9,
"VL53L1X_distance_in_mm" : 134,
"bme680_humidity" : "${random():mod(100):toNumber()}",
"host" : "vid5",
"diskusage" : "8732.7",
"ipaddress" : "192.168.1.167",
"bme680_pressure" : "1017.31",
"BH1745_clear" : "10.0",
"BH1745_red" : "0.0",
"lsm303d_magnetometer" : "+00.04 : +00.34 : -00.10",
"starttime" : "${now():format('MM/dd/yyyy HH:mm:ss')}"
}
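Before wiring NiFi to the broker, it is worth a quick smoke test. A minimal sketch with the paho-mqtt package (assumed installed via pip), publishing one test reading to a placeholder topic on the Mosquitto broker installed above:

import json
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.connect("localhost", 1883, 60)       # default Mosquitto port
reading = {"bme680_tempf": "82.74", "host": "vid5"}
client.publish("garden/test", json.dumps(reading))   # topic name is arbitrary here
client.disconnect()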
Kafka Streams Source Code: https://github.com/tspannhw/kstreams

Running the Fat Jar:

java -jar target/kstreams-1.0.jar
******************************************* Started
**********2018/12/28 16:41:19
**********
Memory Usage: 28284968

Updated Source Code: https://github.com/tspannhw/minifi-breakoutgarden

Updated Example Run Output

{
"ltr559_lux" : "033.75",
"uuid" : "20181228162321_cbd0cbd3-17f6-4730-ae43-1e7b46a01135",
"cputemp" : 51,
"host" : "piups",
"lsm303d_magnetometer" : "-00.12 : +00.27 : +00.15",
"bme680_tempc" : "24.96",
"score" : "0.9694475",
"lsm303d_accelerometer" : "+00.12g : -01.00g : +00.08g",
"ltr559_prox" : "0000",
"bme680_humidity" : "28.875",
"diskusage" : "10058.7",
"human_string" : "electric fan, blower",
"bme680_pressure" : "1012.00",
"BH1745_green" : "31.0",
"imgnamep" : "/opt/demo/images/bog_image_p_20181228162321_cbd0cbd3-17f6-4730-ae43-1e7b46a01135.jpg",
"systemtime" : "12/28/2018 11:24:11",
"BH1745_red" : "33.0",
"starttime" : "12/28/2018 11:16:02",
"BH1745_blue" : "19.8",
"end" : "1546014251.2879872",
"bme680_tempf" : "76.93",
"VL53L1X_distance_in_mm" : 455,
"te" : "488.33915853500366",
"memory" : 70.8,
"imgname" : "/opt/demo/images/bog_image_20181228162321_cbd0cbd3-17f6-4730-ae43-1e7b46a01135.jpg",
"ipaddress" : "192.168.1.166",
"BH1745_clear" : "40.0"
}
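The Kafka Streams application itself is Java (source linked above). As a rough Python illustration of the same read-check-forward logic, assuming kafka-python and paho-mqtt, with placeholder topic and broker names:

import json
from kafka import KafkaConsumer
import paho.mqtt.publish as publish

consumer = KafkaConsumer("garden-json",                       # placeholder topic
                         bootstrap_servers="localhost:6667",
                         value_deserializer=lambda m: json.loads(m.decode("utf-8")))

for message in consumer:
    tempf = float(message.value.get("bme680_tempf", 0))
    if tempf > 80.0:                                          # alert threshold
        publish.single("garden/warnings",
                       "Temperature warning %s" % tempf,
                       hostname="localhost")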
From Kafka Streams I am sending a warning on temperature to MQTT, which NiFi sends to Slack:

Temperature warning 82.74

Using HBase 2.0, we are storing our data as it streams from Kafka Streams to NiFi. We use PutHBaseRecord, which utilizes record processing and our schema to stream our JSON into HBase with ease.

Updated Schema with TF Attributes

{
"type": "record",
"name": "garden",
"fields": [
{
"name": "systemtime",
"type": "string"
},
{
"name": "BH1745_green",
"type": "string"
},
{
"name": "human_string",
"type": "string",
"default": "UNK"
},
{
"name": "ltr559_prox",
"type": "string"
},
{
"name": "end",
"type": "string"
},
{
"name": "uuid",
"type": "string"
},
{
"name": "lsm303d_accelerometer",
"type": "string"
},
{
"name": "score",
"type": "string",
"default": "0"
},
{
"name": "imgnamep",
"type": "string"
},
{
"name": "cputemp",
"type": "double",
"doc": "Type inferred from '58.0'"
},
{
"name": "BH1745_blue",
"type": "string",
"doc": "Type inferred from '\"10.8\"'"
},
{
"name": "te",
"type": "string",
"doc": "Type inferred from '\"254.545491934\"'"
},
{
"name": "bme680_tempc",
"type": "string",
"doc": "Type inferred from '\"29.13\"'"
},
{
"name": "imgname",
"type": "string"
},
{
"name": "bme680_tempf",
"type": "string",
"doc": "Type inferred from '\"84.43\"'"
},
{
"name": "ltr559_lux",
"type": "string",
"doc": "Type inferred from '\"077.95\"'"
},
{
"name": "memory",
"type": "double",
"doc": "Type inferred from '37.6'"
},
{
"name": "VL53L1X_distance_in_mm",
"type": "int",
"doc": "Type inferred from '161'"
},
{
"name": "bme680_humidity",
"type": "string",
"doc": "Type inferred from '\"32.359\"'"
},
{
"name": "host",
"type": "string",
"doc": "Type inferred from '\"vid5\"'"
},
{
"name": "diskusage",
"type": "string",
"doc": "Type inferred from '\"8357.6\"'"
},
{
"name": "ipaddress",
"type": "string",
"doc": "Type inferred from '\"192.168.1.167\"'"
},
{
"name": "bme680_pressure",
"type": "string",
"doc": "Type inferred from '\"987.86\"'"
},
{
"name": "BH1745_clear",
"type": "string",
"doc": "Type inferred from '\"90.0\"'"
},
{
"name": "BH1745_red",
"type": "string",
"doc": "Type inferred from '\"33.0\"'"
},
{
"name": "lsm303d_magnetometer",
"type": "string"
},
{
"name": "starttime",
"type": "string"
}
]
}
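Before pointing PutHBaseRecord at it, the schema can be sanity-checked against the example run output. A small sketch with the fastavro package (assumed installed); garden.avsc and example_record.json are hypothetical local copies of the schema and record shown above:

import json
from fastavro import parse_schema
from fastavro.validation import validate

with open("garden.avsc") as f:           # the Avro schema shown above
    schema = parse_schema(json.load(f))

with open("example_record.json") as f:   # the example run output shown above
    record = json.load(f)

print(validate(record, schema))          # True if the record conforms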
HBase table:

create 'breakout', 'sensors'

Example Row

1546014251.2879872 column=sensors:BH1745_blue, timestamp=1546020326955, value=19.8
1546014251.2879872 column=sensors:BH1745_clear, timestamp=1546020326955, value=40.0
1546014251.2879872 column=sensors:BH1745_green, timestamp=1546020326955, value=31.0
1546014251.2879872 column=sensors:BH1745_red, timestamp=1546020326955, value=33.0
1546014251.2879872 column=sensors:VL53L1X_distance_in_mm, timestamp=1546020326955, value=455
1546014251.2879872 column=sensors:bme680_humidity, timestamp=1546020326955, value=28.875
1546014251.2879872 column=sensors:bme680_pressure, timestamp=1546020326955, value=1012.00
1546014251.2879872 column=sensors:bme680_tempc, timestamp=1546020326955, value=24.96
1546014251.2879872 column=sensors:bme680_tempf, timestamp=1546020326955, value=76.93
1546014251.2879872 column=sensors:cputemp, timestamp=1546020326955, value=51.0
1546014251.2879872 column=sensors:diskusage, timestamp=1546020326955, value=10058.7
1546014251.2879872 column=sensors:host, timestamp=1546020326955, value=piups
1546014251.2879872 column=sensors:human_string, timestamp=1546020326955, value=electric fan, blower
1546014251.2879872 column=sensors:imgname, timestamp=1546020326955, value=/opt/demo/images/bog_image_20181228162321_cbd0cbd3-17f6-4730-ae43-1e7b46a01135.jpg
1546014251.2879872 column=sensors:imgnamep, timestamp=1546020326955, value=/opt/demo/images/bog_image_p_20181228162321_cbd0cbd3-17f6-4730-ae43-1e7b46a01135.jpg
1546014251.2879872 column=sensors:ipaddress, timestamp=1546020326955, value=192.168.1.166
1546014251.2879872 column=sensors:lsm303d_accelerometer, timestamp=1546020326955, value=+00.12g : -01.00g : +00.08g
1546014251.2879872 column=sensors:lsm303d_magnetometer, timestamp=1546020326955, value=-00.12 : +00.27 : +00.15
1546014251.2879872 column=sensors:ltr559_lux, timestamp=1546020326955, value=033.75
1546014251.2879872 column=sensors:ltr559_prox, timestamp=1546020326955, value=0000
1546014251.2879872 column=sensors:memory, timestamp=1546020326955, value=70.8
1546014251.2879872 column=sensors:score, timestamp=1546020326955, value=0.9694475
1546014251.2879872 column=sensors:starttime, timestamp=1546020326955, value=12/28/2018 11:16:02
1546014251.2879872 column=sensors:systemtime, timestamp=1546020326955, value=12/28/2018 11:24:11
1546014251.2879872 column=sensors:te, timestamp=1546020326955, value=488.33915853500366
1546014251.2879872 column=sensors:uuid, timestamp=1546020326955, value=20181228162321_cbd0cbd3-17f6-4730-ae43-1e7b46a01135
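NiFi's PutHBaseRecord does this for us. Purely for illustration, the same row could be written from Python with the happybase client (assumed installed, with the HBase Thrift server running; the hostname is a placeholder):

import happybase

connection = happybase.Connection("hbase-host")    # placeholder hostname
table = connection.table("breakout")
table.put("1546014251.2879872", {                  # the 'end' field is the row key
    b"sensors:bme680_tempf": b"76.93",
    b"sensors:human_string": b"electric fan, blower",
})
connection.close()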
12-14-2018
06:45 PM
2 Kudos
IoT Series: Sensors: Utilizing Breakout Garden Hat: Part 1 - Introduction

An easy option for adding, removing and prototyping sensor reads from a standard Raspberry Pi with no special wiring.

Hardware Component List:
Raspberry Pi
USB Power Cable
Pimoroni Breakout Garden Hat
1.12" Mono OLED Breakout 128x128 White/Black Screen
BME680 Air Quality, Temperature, Pressure, Humidity Sensor
LSM303D 6DoF Motion Sensor (X, Y, Z Axes)
BH1745 Luminance and Color Sensor
LTR-559 Light and Proximity Sensor (0.01 lux to 64,000 lux)
VL53L1X Time of Flight (ToF) Sensor (Pew Pew Lasers!)

Software Component List:
Raspbian
Python 2.7
JDK 8 (Java)
Apache NiFi
MiniFi

Source Code: https://github.com/tspannhw/minifi-breakoutgarden
Shell Script: https://github.com/tspannhw/minifi-breakoutgarden/blob/master/runbrk.sh
Python: https://github.com/tspannhw/minifi-breakoutgarden/blob/master/brk.py

Summary

Our Raspberry Pi has a Breakout Garden Hat with 5 sensors and one small display. The display shows the last reading and is constantly updating. For debugging purposes, it shows the IP address so I can connect as needed. We currently run via nohup, but when we go into constant use I will switch to a Linux service that runs on startup.

The Python script initializes the connections to all of the sensors and then goes into an infinite loop of reading those values and building a JSON packet that we send via TCP/IP over port 5005 to a listener (a trimmed sketch of this loop appears after the example record below). A MiniFi 0.5.0 Java agent uses ListenTCP on that port to capture these messages and filter them based on alarm values. If a message is outside of the checked parameters, we send it via S2S/HTTP(s) to an Apache NiFi server. We also have a USB webcam (Sony PlayStation 3 EYE) that captures images; we read those with MiniFi and send them to NiFi as well.

The first thing we need to do is pretty easy: plug in our Pimoroni Breakout Garden Hat and our 6 plugs. You have to do the standard installation of Python, Java 8 and MiniFi, and I recommend OpenCV. Make sure you have everything plugged in securely and in the correct direction before you power on the Raspberry Pi.

Download MiniFi Java here: https://nifi.apache.org/minifi/download.html

Install Python PIP:

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

Install the Breakout Garden library:

wget https://github.com/pimoroni/breakout-garden/archive/master.zip
unzip master.zip
cd breakout-garden-master
sudo ./install.sh

Running In NiFi

First we build our MiniFi flow. We have two objectives: listen for TCP/IP JSON messages from our running Python sensor collector, and gather images captured by the PS3 Eye USB webcam. We then add content type and schema information to the attributes. We also extract a few values from the JSON stream to use for alerting:

$.cputemp
$.VL53L1X_distance_in_mm
$.bme680_humidity
$.bme680_tempf

These attributes are now attached to our flowfile, which is otherwise unchanged, so we can route on them. We route on a few alarm conditions:

${cputemp:gt(100)}
${humidity:gt(60)}
${tempf:gt(80)}
${distance:gt(0)}

We can easily add more conditions or different set values, and we can also populate these set values from an HTTP / file lookup. If these conditions are met, we send the data to our local Apache NiFi router, which can then do further analysis with the fuller NiFi processor set, including TensorFlow, MXNet, record processing and lookups.

Local NiFi Routing

For now we are just splitting up the images and JSON and sending them to two different remote ports on our cloud NiFi cluster. These then arrive in the cloud, where you can see a list of the flow files waiting to be processed (I haven't written that part yet). We are getting a few a second; we could get 100,000 a second if we needed to. Just add nodes: instant scaling. Cloudbreak can do that for you.

In part 2, we will start processing these data streams and images. We will also add Apache MXNet and TensorFlow at various points on the edge, router and cloud using Python and the built-in Deep Learning NiFi processors I have authored. We will also break apart these records and send each sensor to its own Kafka topic to be processed with Kafka Streams, Druid, Hive and HBase.

As part of our loop we write the current values to our little screen.

Example Record

{
"systemtime" : "12/19/2018 22:15:56",
"BH1745_green" : "4.0",
"ltr559_prox" : "0000",
"end" : "1545275756.7",
"uuid" : "20181220031556_e54721d6-6110-40a6-aa5c-72dbd8a8dcb2",
"lsm303d_accelerometer" : "+00.06g : -01.01g : +00.04g",
"imgnamep" : "images/bog_image_p_20181220031556_e54721d6-6110-40a6-aa5c-72dbd8a8dcb2.jpg",
"cputemp" : 51.0,
"BH1745_blue" : "9.0",
"te" : "47.3427119255",
"bme680_tempc" : "28.19",
"imgname" : "images/bog_image_20181220031556_e54721d6-6110-40a6-aa5c-72dbd8a8dcb2.jpg",
"bme680_tempf" : "82.74",
"ltr559_lux" : "006.87",
"memory" : 34.9,
"VL53L1X_distance_in_mm" : 134,
"bme680_humidity" : "23.938",
"host" : "vid5",
"diskusage" : "8732.7",
"ipaddress" : "192.168.1.167",
"bme680_pressure" : "1017.31",
"BH1745_clear" : "10.0",
"BH1745_red" : "0.0",
"lsm303d_magnetometer" : "+00.04 : +00.34 : -00.10",
"starttime" : "12/19/2018 22:15:09"
}
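For a feel of what the capture loop does, here is a trimmed sketch: read one sensor with Pimoroni's bme680 library (installed by the steps above), build a small JSON packet, and ship it over TCP 5005 to MiniFi's ListenTCP. This is a reduced illustration; the full five-sensor script is linked above.

import json
import socket
import time

import bme680

sensor = bme680.BME680()
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", 5005))               # MiniFi ListenTCP port

while True:
    if sensor.get_sensor_data():
        packet = {"bme680_tempc": "%0.2f" % sensor.data.temperature,
                  "bme680_humidity": "%0.3f" % sensor.data.humidity,
                  "bme680_pressure": "%0.2f" % sensor.data.pressure}
        sock.sendall((json.dumps(packet) + "\n").encode("utf-8"))
    time.sleep(5)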
NiFi Templates
nifi-garden-router.xml
minifi-garden.xml
garden-server.xml

Let's Build Those Topics Now

/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic bme680
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic bh17455
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic lsm303d
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic vl53l1x
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic ltr559
Hopefully in your environment you will be able to use a replication factor of 3, 5 or 7 and many partitions. I have one Kafka broker, so this is what we are starting with.

Reference

https://shop.pimoroni.com/collections/breakout-garden
https://github.com/pimoroni/breakout-garden/tree/master/examples
https://shop.pimoroni.com/products/breakout-garden-hat
https://github.com/pimoroni/bme680-python
https://learn.pimoroni.com/bme680
https://shop.pimoroni.com/products/bh1745-luminance-and-colour-sensor-breakout
https://github.com/pimoroni/bh1745-python
https://shop.pimoroni.com/products/vl53l1x-breakout
https://github.com/pimoroni/vl53l1x-python/tree/master/examples
https://shop.pimoroni.com/products/ltr-559-light-proximity-sensor-breakout
https://github.com/pimoroni/ltr559-python
https://shop.pimoroni.com/products/bme680-breakout
https://github.com/pimoroni/bme680
https://shop.pimoroni.com/products/lsm303d-6dof-motion-sensor-breakout
https://github.com/pimoroni/lsm303d-python
https://shop.pimoroni.com/products/1-12-oled-breakout
https://github.com/rm-hull/luma.oled
http://www.diegoacuna.me/how-to-run-a-script-as-a-service-in-raspberry-pi-raspbian-jessie/
12-12-2018
02:31 AM
My Example Mesh Network

Argon: Wi-Fi + Mesh Gateway/Repeater. This connects to the outside network via Wi-Fi; we connect to Particle Cloud.
Espressif ESP32-D0WD 2.4G Wi-Fi
Nordic Semiconductor nRF52840 SoC for Bluetooth and NFC-A tag
ARM TrustZone CryptoCell-310 cryptographic and security module
Two antennas (one for Thread/BLE, another for Wi-Fi)

Xenon: Mesh + BLE
Nordic Semiconductor nRF52840 SoC for Bluetooth and NFC-A tag
ARM TrustZone CryptoCell-310 cryptographic and security module

References
https://github.com/Seeed-Studio/Grove_Starter_Kit_for_Photon_Demos?files=1
https://docs.particle.io/datasheets/accessories/mesh-accessories/
https://community.particle.io/c/mesh
https://www.particle.io/mesh
https://docs.particle.io/datasheets/mesh/xenon-datasheet/
https://docs.particle.io/datasheets/wi-fi/argon-datasheet/
https://blog.particle.io/2018/04/28/how-to-build-a-wireless-mesh-network/
WiFi Mesh: https://www.threadgroup.org/
https://github.com/Seeed-Studio/Grove_Starter_Kit_for_Photon_Demos/blob/master/Example%20-%2005%20Measuring%20Temperature/Example05.ino
http://wiki.seeedstudio.com/Grove-Ultrasonic_Ranger/
https://www.particle.io/mesh/buy/xenon
12-10-2018
07:41 PM
3 Kudos
Deep Speech with Apache NiFi 1.8
Tools: Python 3.6, PyAudio, TensorFlow, Deep Speech, Shell, Apache NiFi
Why: Speech-to-Text
Use Case: Voice control and recognition.
Series: Holiday Use Case: Turn on Holiday Lights and Music on command.
Cool Factor: Ever want to run a query on Live Ingested Voice Commands?
Other Options: https://community.hortonworks.com/articles/155519/voice-controlled-data-flows-with-google-aiy-voice.html
We are using Python 3.6 to write some code around PyAudio, TensorFlow and Deep Speech to capture audio, store it in a WAV file and then process it with Deep Speech to extract text. This example runs on OSX without a GPU, on TensorFlow v1.11.
The Mozilla Github repo for their Deep Speech implementation has nice getting started information that I used to integrate our flow with Apache NiFi.
Installation as per https://github.com/mozilla/DeepSpeech
pip3 install deepspeech
wget -O - https://github.com/mozilla/DeepSpeech/releases/download/v0.3.0/deepspeech-0.3.0-models.tar.gz | tar xvfz -
This pre-trained model is available for English. For other languages you will need to build your own; you can use a beefy HDP 3.1 cluster to train it. Note: THIS IS A 1.8 GB DOWNLOAD. That may be an issue for laptops, devices or small-data people.

Apache NiFi Flow

The flow is simple: we call our shell script, which runs Python that records audio and sends it to Deep Speech for processing. We get back a voice_string in JSON that we turn into a record for querying and filtering in Apache NiFi. I am handling a few voice commands for "Save", "Load" and "Move". As you can imagine, you can handle pretty much anything you want. It's a simple way to use voice to control streaming data flows or just to ingest large streams of text. Even using advanced deep learning, speech-to-text recognition is still not the strongest.

If you are going to load balance connections between nodes, you have options on compression and load balancing strategies. This can come in handy if you have a lot of servers.

Shell Script
python3.6 /Volumes/TSPANN/projects/DeepSpeech/processnifi.py /Volumes/TSPANN/projects/DeepSpeech/models/output_graph.pbmm /Volumes/TSPANN/projects/DeepSpeech/models/alphabet.txt
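What the script drives under the hood, as a hedged sketch against the DeepSpeech 0.3.0 Python API (the constants are the defaults from Mozilla's example client; the paths and WAV file are placeholders):

import wave

import numpy as np
from deepspeech import Model

N_FEATURES, N_CONTEXT, BEAM_WIDTH = 26, 9, 500    # Mozilla client defaults
ds = Model("models/output_graph.pbmm", N_FEATURES, N_CONTEXT,
           "models/alphabet.txt", BEAM_WIDTH)

with wave.open("command.wav", "rb") as w:         # 16 kHz, 16-bit mono expected
    audio = np.frombuffer(w.readframes(w.getnframes()), np.int16)
    print(ds.stt(audio, w.getframerate()))        # the recognized text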
Schema
{
"type" : "record",
"name" : "voice",
"fields" : [ {
"name" : "systemtime",
"type" : "string",
"doc" : "Type inferred from '\"12/10/2018 14:53:47\"'"
}, {
"name" : "voice_string",
"type" : "string",
"doc" : "Type inferred from '\"\"'"
} ]
}
We can add more fields as needed.
Example Run
HW13125:DeepSpeech tspann$ ./runnifi.sh
TensorFlow: v1.11.0-9-g97d851f04e
DeepSpeech: unknown
2018-12-10 14:36:43.714433: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
{"systemtime": "12/10/2018 14:36:43", "voice_string": "one two three or five six seven eight nine"}
We can run this on top of YARN 3.1 as dockerized or non-dockerized workloads. Setting up nodes to run HDF 3.3 (Apache NiFi and friends) is easy in the cloud or on-premises in OpenStack with solid DevOps tools. When running Apache NiFi, it is easy to monitor in Ambari.
References:
https://github.com/mozilla/DeepSpeech
https://community.hortonworks.com/articles/224268/running-tensorflow-on-yarn-31-with-or-without-gpu.html
https://arxiv.org/abs/1412.5567
https://github.com/tspannhw/nifi-deepspeech
12-07-2018
04:59 PM
Thanks for the update!
12-06-2018
09:53 PM
5 Kudos
Apache NiFi Processor for Apache MXNet SSD: Single Shot MultiBox Object Detector (Deep Learning)

The news is out: Apache MXNet has added a Java API. So, as soon as I could, I got my hands on the Maven repo and an example program and got to work writing a new Apache NiFi processor for it. I have run this on standalone Apache NiFi 1.8.0 and on HDF 3.3 - Apache NiFi 1.8.0, and both work. So anyone who wants to be an alpha tester, please download it and give it a try.

Apache MXNet SSD is a good example of a pretrained deep learning model that works pretty well for general images, in use cases especially around people and cars. You can fine-tune this with some more images and runs: https://mxnet.incubator.apache.org/faq/finetune.html

The nice thing is that now we can start including Apache MXNet as part of Java applications such as Kafka Streams, Apache Storm, Apache Spark, Spring Boot and other use cases using Java. I could potentially inject this into a Hive UDF (https://community.hortonworks.com/articles/39980/creating-a-hive-udf-in-java.html#comment-40026) or Pig UDF; the performance may be fast enough. We now have four Java options for deep learning: DL4J, H2O, TensorFlow and Apache MXNet. Unfortunately, both the TensorFlow and MXNet Java APIs are not quite production ready. I may do some further research on running MXNet as a Hive UDF; it would be cool to have in a query.

For those who don't want to set up a development environment with JDK 8+, Maven 3.3+ and git, you can download a pre-built NAR file here: https://github.com/tspannhw/nifi-mxnetinference-processor/releases/tag/v1.0. As part of the recent release of HDF 3.3, I have upgraded my OpenStack Centos 7 cluster.

Important Caveats
Notice, the Java API is in preview and so is this processor. Do not use this in production!
This is in development and I am the only one working on it.
The Java API from Apache MXNet is in flux and will be changing.
See the POM, as it is tied to the OSX/Mac version of the library. You will need to change that.
You will need to download the pre-built MXNet model and place it in a directory accessible to the Apache NiFi server/cluster.
I am still cleaning up the rectangle code for identifying objects in the pictures. As you will notice, my rectangle drawing is a bit off. I need to work on that.

Once you drop your built NAR file and models in the nifi/lib directory and restart Apache NiFi, you can add it to your canvas. We need to feed it some images. You can use my webcam processor, an image URL feed or local files. To grab images from an HTTPS site, you need an SSL Context Service like the StandardSSLContextService below. You will need to point to the cacerts used by the JRE/JDK running your Apache NiFi node. The default password in Java is changeme. Hopefully you have changed it.

To configure my new processor, just put in the full path to the model directory and then "/resnet50_ssd_model", as that is the prefix for the model.

Our example flow has the new processor being fed by traffic cameras, webcams, local files and a local webcam. Some output of our flow: our top 5 probabilities and labels.

Example Data:

{
"ymin_1" : "456.01",
"ymin_5" : "159.29",
"ymin_4" : "235.83",
"ymin_3" : "206.64",
"ymin_2" : "383.84",
"label_5" : "person",
"xmax_5" : "121.14",
"label_4" : "bicycle",
"xmax_4" : "137.89",
"label_3" : "dog",
"xmax_3" : "179.14",
"ymax_1" : "150.66",
"ymax_2" : "418.95",
"ymax_3" : "476.79",
"label_2" : "bicycle",
"label_1" : "car",
"probability_4" : "0.22",
"probability_5" : "0.13",
"probability_2" : "0.90",
"xmin_5" : "88.93",
"probability_3" : "0.82",
"ymax_4" : "413.43",
"probability_1" : "1.00",
"ymax_5" : "190.04",
"xmax_2" : "149.96",
"xmax_1" : "72.03",
"xmin_3" : "83.82",
"xmin_4" : "93.05",
"xmin_1" : "312.21",
"xmin_2" : "155.96"
}

Resources:
https://medium.com/apache-mxnet/introducing-java-apis-for-deep-learning-inference-with-apache-mxnet-8406a698fa5a
https://github.com/apache/incubator-mxnet/tree/java-api/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi
https://mxnet.incubator.apache.org/install/java_setup.html

Source: https://github.com/tspannhw/nifi-mxnetinference-processor
Video walk-through: https://www.youtube.com/watch?v=Q4dSGPvqXSA&t=196s&list=PL-7XqvSmQqfTSihuoIP_ZAnN7mFIHkZ_e&index=17
mxnet-processor.xml

Download the artifacts listed: https://github.com/apache/incubator-mxnet/tree/java-api/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi/infer/objectdetector#step-1

Maven POM (I used Java 8 and Maven 3.3.9):

<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.dataflowdeveloper.mxnet</groupId>
<artifactId>inference</artifactId>
<version>1.0</version>
</parent>
<artifactId>nifi-mxnetinference-processors</artifactId>
<packaging>jar</packaging>
<dependencies>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-utils</artifactId>
<version>1.8.0</version>
</dependency>
<dependency>
<groupId>org.apache.nifi</groupId>
<artifactId>nifi-mock</artifactId>
<version>1.8.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.mxnet</groupId>
<artifactId>mxnet-full_2.11-osx-x86_64-cpu</artifactId>
<version>1.3.1-SNAPSHOT</version>
</dependency>
</dependencies>
</project>
I have moved from Eclipse to IntelliJ for my builds. I am looking at Apache NetBeans as well.
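Since my rectangle drawing is still a bit off, here is a quick way to eyeball the boxes outside NiFi with Pillow. Field names follow the example data above; the coordinates are assumed to be pixels in the source image, and are sorted defensively since some of the example values look swapped:

from PIL import Image, ImageDraw

attrs = {"label_1": "car", "probability_1": "1.00",
         "xmin_1": "312.21", "xmax_1": "72.03",
         "ymin_1": "456.01", "ymax_1": "150.66"}   # subset of the example data

img = Image.open("test-image.jpg")                 # placeholder input image
draw = ImageDraw.Draw(img)
i = 1
while "label_%d" % i in attrs:
    x0, x1 = sorted((float(attrs["xmin_%d" % i]), float(attrs["xmax_%d" % i])))
    y0, y1 = sorted((float(attrs["ymin_%d" % i]), float(attrs["ymax_%d" % i])))
    draw.rectangle([x0, y0, x1, y1], outline="red")
    draw.text((x0, y0), "%s %s" % (attrs["label_%d" % i], attrs["probability_%d" % i]))
    i += 1
img.save("annotated.jpg")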
11-27-2018
03:50 AM
5 Kudos
MiniFi Java Agent 0.5

Copy over the necessary NARs from the Apache NiFi 1.7 lib:

nifi-ssl-context-service-nar-1.7.0.nar
nifi-standard-services-api-nar-1.7.0.nar
nifi-kafka-1-0-nar-1.7.0.nar

This will support PublishKafka_1_0 and ConsumeKafka_1_0. Then create a consume and/or publish flow; you can combine the two based on your needs. In my simple example I consume the Kafka messages in MiniFi and write them to a file. I also write the metadata to a JSON file.

Consume Kafka / Publish Electric Monitoring Data To Kafka: let's monitor the messages going through our topic, smartPlug. We publish messages to Kafka and consume any messages from the smartPlug topic.

Logs

Provenance Event file containing 377 records. In the past 5 minutes, 1512 events have been written to the Provenance Repository, totaling 839.32 KB
2018-11-26 19:42:32,473 INFO [main] o.a.n.c.s.StandardProcessScheduler Starting PutFile[id=25a86505-031a-37d9-0000-000000000000]
2018-11-26 19:42:32,474 INFO [main] o.a.n.c.s.StandardProcessScheduler Starting UpdateAttribute[id=9220d40d-ee1d-3f61-0000-000000000000]
2018-11-26 19:42:32,474 INFO [main] o.apache.nifi.controller.FlowController Started 0 Remote Group Ports transmitting
2018-11-26 19:42:32,478 INFO [main] org.apache.nifi.minifi.MiNiFiServer Flow loaded successfully.
2018-11-26 19:42:32,479 INFO [Monitor Processor Lifecycle Thread-2] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled ConsumeKafka_1_0[id=8556f1ce-a915-3fda-0000-000000000000] to run with 1 threads
2018-11-26 19:42:32,479 INFO [main] org.apache.nifi.BootstrapListener Successfully initiated communication with Bootstrap
2018-11-26 19:42:32,479 INFO [Monitor Processor Lifecycle Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled AttributesToJSON[id=0628b4e5-10d0-3b09-0000-000000000000] to run with 1 threads
2018-11-26 19:42:32,479 INFO [main] org.apache.nifi.minifi.MiNiFi Controller initialization took 2787584123 nanoseconds.
2018-11-26 19:42:32,480 INFO [Monitor Processor Lifecycle Thread-1] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled PutFile[id=25a86505-031a-37d9-0000-000000000000] to run with 1 threads
2018-11-26 19:42:32,481 INFO [Monitor Processor Lifecycle Thread-2] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled UpdateAttribute[id=9220d40d-ee1d-3f61-0000-000000000000] to run with 1 threads
2018-11-26 19:42:32,585 INFO [Timer-Driven Process Thread-2] o.a.k.clients.consumer.ConsumerConfig ConsumerConfig values:
    auto.commit.interval.ms = 5000
    auto.offset.reset = latest
    bootstrap.servers = [princeton1.field.hortonworks.com:6667]
    check.crcs = true
    client.id =
    connections.max.idle.ms = 540000
    enable.auto.commit = false
    exclude.internal.topics = true
    fetch.max.bytes = 52428800
    fetch.max.wait.ms = 500
    fetch.min.bytes = 1
    group.id = minificonsumer1
    heartbeat.interval.ms = 3000
    interceptor.classes = null
    internal.leave.group.on.close = true
    isolation.level = read_uncommitted
    key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
    max.partition.fetch.bytes = 1048576
    max.poll.interval.ms = 300000
    max.poll.records = 10000
    metadata.max.age.ms = 300000
    metric.reporters = []
    metrics.num.samples = 2
    metrics.recording.level = INFO
    metrics.sample.window.ms = 30000
    partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
    receive.buffer.bytes = 65536
    reconnect.backoff.max.ms = 1000
    reconnect.backoff.ms = 50
    request.timeout.ms = 305000
    retry.backoff.ms = 100
    sasl.jaas.config = null
    sasl.kerberos.kinit.cmd = /usr/bin/kinit
    sasl.kerberos.min.time.before.relogin = 60000
    sasl.kerberos.service.name = null
    sasl.kerberos.ticket.renew.jitter = 0.05
    sasl.kerberos.ticket.renew.window.factor = 0.8
    sasl.mechanism = GSSAPI
    security.protocol = PLAINTEXT
    send.buffer.bytes = 131072
    session.timeout.ms = 10000
    ssl.cipher.suites = null
    ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
    ssl.endpoint.identification.algorithm = null
    ssl.key.password = null
    ssl.keymanager.algorithm = SunX509
    ssl.keystore.location = null
    ssl.keystore.password = null
    ssl.keystore.type = JKS
    ssl.protocol = TLS
    ssl.provider = null
    ssl.secure.random.implementation = null
    ssl.trustmanager.algorithm = PKIX
    ssl.truststore.location = null
    ssl.truststore.password = null
    ssl.truststore.type = JKS
    value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
2018-11-26 19:42:32,727 INFO [Timer-Driven Process Thread-2] o.a.kafka.common.utils.AppInfoParser Kafka version : 1.0.0
2018-11-26 19:42:32,727 INFO [Timer-Driven Process Thread-2] o.a.kafka.common.utils.AppInfoParser Kafka commitId : aaa7af6d4a11b29d
2018-11-26 19:42:33,088 INFO [Timer-Driven Process Thread-2] o.a.k.c.c.internals.AbstractCoordinator [Consumer clientId=consumer-1, groupId=minificonsumer1] Discovered coordinator princeton1.field.hortonworks.com:6667 (id: 2147482646 rack: null)
2018-11-26 19:42:33,090 INFO [Timer-Driven Process Thread-2] o.a.k.c.c.internals.ConsumerCoordinator [Consumer clientId=consumer-1, groupId=minificonsumer1] Revoking previously assigned partitions []
2018-11-26 19:42:33,091 INFO [Timer-Driven Process Thread-2] o.a.k.c.c.internals.AbstractCoordinator [Consumer clientId=consumer-1, groupId=minificonsumer1] (Re-)joining group
2018-11-26 19:42:36,391 INFO [Timer-Driven Process Thread-2] o.a.k.c.c.internals.AbstractCoordinator [Consumer clientId=consumer-1, groupId=minificonsumer1] Successfully joined group with generation 3
2018-11-26 19:42:36,394 INFO [Timer-Driven Process Thread-2] o.a.k.c.c.internals.ConsumerCoordinator [Consumer clientId=consumer-1, groupId=minificonsumer1] Setting newly assigned partitions [smartPlug-0]
2018-11-26 19:44:32,325 INFO [pool-34-thread-1] o.a.n.c.r.WriteAheadFlowFileRepository Successfully checkpointed FlowFile Repository with 0 records in 0 milliseconds
2018-11-26 19:44:40,700 INFO [Provenance Maintenance Thread-1] o.a.n.p.PersistentProvenanceRepository Created new Provenance Event Writers for events starting with ID 1437
2018-11-26 19:44:40,765 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.lucene.SimpleIndexManager Index Writer for provenance_repository/index-1543271506000 has been returned to Index Manager and is no longer in use. Closing Index Writer
2018-11-26 19:44:40,767 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully merged 16 journal files (28 records) into single Provenance Log File provenance_repository/1409.prov in 62 milliseconds
2018-11-26 19:44:40,768 INFO [Provenance Repository Rollover Thread-1] o.a.n.p.PersistentProvenanceRepository Successfully Rolled over Provenance Event file containing 151 records. In the past 5 minutes, 28 events have been written to the Provenance Repository, totaling 15.43 KB

JSON Kafka Message and JSON Kafka Metadata Stored As Files

monitor/1448678223641638.attr.json

{"path":"./","filename":"1448678223641638","kafka.partition":"0","kafka.offset":"5543","kafka.topic":"smartPlug","kafka.key":"cb90ad21-b311-494c-96cc-06dd2e8747df","uuid":"041459fc-c63e-4056-ab50-1c375cd7d49f"}

monitor/1448678223641638

{"day30": 0.431, "day31": 1.15, "sw_ver": "1.2.5 Build 171206 Rel.085954", "hw_ver": "1.0", "mac": "50:C7:BF:B1:95:D5", "type": "IOT.SMARTPLUGSWITCH", "hwId": "60FF6B258734EA6880E186F8C96DDC61", "fwId": "00000000000000000000000000000000", "oemId": "FFF22CFF774A0B89F7624BFC6F50D5DE", "dev_name": "Wi-Fi Smart Plug With Energy Monitoring", "model": "HS110(US)", "deviceId": "8006ECB1D454C4428953CB2B34D9292D18A6DB0E", "alias": "Tim", "icon_hash": "", "relay_state": 1, "on_time": 886569, "active_mode": "schedule", "feature": "TIM:ENE", "updating": 0, "rssi": -75, "led_off": 0, "latitude": 40.268216, "longitude": -74.529088, "index": 18, "zone_str": "(UTC-05:00) Eastern Daylight Time (US & Canada)", "tz_str": "EST5EDT,M3.2.0,M11.1.0", "dst_offset": 60, "month10": 1.581, "month11": 30.888, "current": 0.067041, "voltage": 122.151701, "power": 1.277361, "total": 24.289, "time": "11/26/2018 21:54:22", "ledon": true, "systemtime": "11/26/2018 21:54:22"}

Resources:
https://blog.ona.io/general/2017/08/30/streaming-ona-data-with-nifi-kafka-druid-and-superset.html
https://community.hortonworks.com/articles/193945/social-media-monitoring-with-nifi-hivedruid-integr.html
https://community.hortonworks.com/articles/177561/streaming-tweets-with-nifi-kafka-tranquility-druid.html
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.0/kafka-using-kafka-streams/content/kafka-using-kafka-streams.html
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.0/minifi-quick-start/content/overview.html
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.0/minifi-quick-start/content/before_you_begin.html
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.0/minifi-quick-start/content/installing_minifi_on_linux.html
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.3.0/minifi-quick-start/content/using_processors_not_packaged_with_minifi.html?es_p=8055369
https://community.hortonworks.com/articles/227560/real-time-stock-processing-with-apache-nifi-and-ap.html

Files: consumekafka2.xml pushkafka1.xml configyml consume.txt configymlsend.txt
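For comparison, a rough Python analogue of the MiniFi consume flow above, using kafka-python (assumed installed): read from smartPlug and write each message body plus a metadata sidecar, like the monitor/ files shown.

import json
import time

from kafka import KafkaConsumer

consumer = KafkaConsumer("smartPlug",
                         bootstrap_servers="princeton1.field.hortonworks.com:6667",
                         group_id="minificonsumer2")   # separate group from MiniFi's

for msg in consumer:
    name = "monitor/%d" % int(time.time() * 1000000)   # microsecond-style filename
    with open(name, "wb") as body:
        body.write(msg.value)                          # raw JSON message
    meta = {"kafka.topic": msg.topic,
            "kafka.partition": str(msg.partition),
            "kafka.offset": str(msg.offset)}
    with open(name + ".attr.json", "w") as attrs:
        json.dump(meta, attrs)                         # metadata sidecar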
11-15-2018
11:42 PM
6 Kudos
Implementing Streaming Use Case From REST to Hive with Apache NiFi and Apache Kafka Part 1

With Apache Kafka 2.0 and Apache NiFi 1.8 out, along with many new features and abilities, it's time to put them to the test.

To plan out what we are going to do, I have a high-level architecture diagram. We are going to ingest a number of sources including REST feeds, social feeds, messages, images, documents and relational data. We will ingest with NiFi, then filter, process and segment the data into Kafka topics. Kafka data will be in Apache Avro format, with schemas specified in Hortonworks Schema Registry. Kafka Streams, Spark and NiFi will do additional event processing along with machine learning and deep learning. The data will be stored in Druid for real-time analytics and summaries. Hive, HDFS and S3 will provide permanent storage. We will build dashboards with Superset and Spark SQL + Zeppelin. We will integrate machine learning with Spark ML, TensorFlow and Apache MXNet. We will also push cleaned and aggregated data back to subscribers via Kafka and NiFi: to Dockerized applications, message listeners, web clients, Slack channels and email mailing lists.

To be useful in our enterprise, we will have full authorization, authentication, auditing, data encryption and data lineage via Apache Ranger, Apache Atlas and Apache NiFi. NiFi Registry and GitHub will be used for source code control. We will have administration capabilities via Apache Ambari.

An example server layout:

NiFi Flows

Real-time free stock data is available from IEX with no license key. The data streams in very fast; thankfully that's no issue for Apache NiFi and Kafka.

We consume the different records from topics and store them to HDFS in separate directories and tables. Let's split up one big REST file into individual records of interest. Our REST feed has quote, chart and news arrays.

Let's Push Some Messages to Slack

We can easily consume from multiple topics in Apache NiFi. Querying data is easy while it's in motion, since we have schemas. We create schemas for each of our Kafka topics. We can monitor all these messages going through Kafka in Ambari (and also in much better detail in Hortonworks SMM). I read in data and can then push it to Kafka 1.0 and 2.0 brokers. Once data is sent, NiFi lets us know.

Projects Used
Apache Kafka
Apache Kafka Streams
Apache MXNet
NLTK
Stanford CoreNLP
Apache OpenNLP
TextBlob
SpaCy
Apache NiFi
Apache Druid
Apache Hive on Kafka
Apache Hive on Druid
Apache Hive on JDBC
Apache Zeppelin
NLP - Apache OpenNLP and Stanford CoreNLP
Hortonworks Schema Registry
NiFi Registry
Apache Ambari Log Search
Hortonworks SMM
Hortonworks Data Plane Services (DPS)

Sources
REST
Twitter
JDBC
Sensors
MQTT
Documents

Sinks
Apache Hadoop HDFS
Apache Kafka
Apache Hive
Slack
S3
Apache Druid
Apache HBase

Topics
iextradingnews
iextradingquote
iextradingchart
stocks
cyber

HDFS Directories

hdfs dfs -mkdir -p /iextradingnews
hdfs dfs -mkdir -p /iextradingquote
hdfs dfs -mkdir -p /iextradingchart
hdfs dfs -mkdir -p /stocks
hdfs dfs -mkdir -p /cyber
hdfs dfs -chmod -R 777 /

PutHDFS

/${kafka.topic}

/iextradingchart/859496561256574.orc
/iextradingnews/855935960267509.orc
/iextradingquote/859143934804532.orc

Hive Tables

CREATE EXTERNAL TABLE IF NOT EXISTS iextradingchart (`date` STRING, open DOUBLE, high DOUBLE, low DOUBLE, close DOUBLE, volume INT, unadjustedVolume INT, change DOUBLE, changePercent DOUBLE, vwap DOUBLE, label STRING, changeOverTime INT)
STORED AS ORC
LOCATION '/iextradingchart';
CREATE EXTERNAL TABLE IF NOT EXISTS iextradingquote (symbol STRING, companyName STRING, primaryExchange STRING, sector STRING, calculationPrice STRING, open DOUBLE, openTime BIGINT, close DOUBLE, closeTime BIGINT, high DOUBLE, low DOUBLE, latestPrice DOUBLE, latestSource STRING, latestTime STRING, latestUpdate BIGINT, latestVolume INT, iexRealtimePrice DOUBLE, iexRealtimeSize INT, iexLastUpdated BIGINT, delayedPrice DOUBLE, delayedPriceTime BIGINT, extendedPrice DOUBLE, extendedChange DOUBLE, extendedChangePercent DOUBLE, extendedPriceTime BIGINT, previousClose DOUBLE, change DOUBLE, changePercent DOUBLE, iexMarketPercent DOUBLE, iexVolume INT, avgTotalVolume INT, iexBidPrice INT, iexBidSize INT, iexAskPrice INT, iexAskSize INT, marketCap INT, peRatio DOUBLE, week52High DOUBLE, week52Low DOUBLE, ytdChange DOUBLE)
STORED AS ORC
LOCATION '/iextradingquote';
CREATE EXTERNAL TABLE IF NOT EXISTS iextradingnews (`datetime` STRING, headline STRING, source STRING, url STRING, summary STRING, related STRING, image STRING)
STORED AS ORC
LOCATION '/iextradingnews';

Schemas

{ "type": "record", "name": "iextradingchart", "fields": [ { "name": "date", "type": [ "string", "null" ] }, { "name": "open", "type": [ "double", "null" ] }, { "name": "high", "type": [ "double", "null" ] }, { "name": "low", "type": [ "double", "null" ] }, { "name": "close", "type": [ "double", "null" ] }, { "name": "volume", "type": [ "int", "null" ] }, { "name": "unadjustedVolume", "type": [ "int", "null" ] }, { "name": "change", "type": [ "double", "null" ] }, { "name": "changePercent", "type": [ "double", "null" ] }, { "name": "vwap", "type": [ "double", "null" ] }, { "name": "label", "type": [ "string", "null" ] }, { "name": "changeOverTime", "type": [ "int", "null" ] } ] }

{ "type": "record", "name": "iextradingquote", "fields": [ { "name": "symbol", "type": [ "string", "null" ], "doc": "Type inferred from '\"HDP\"'" }, { "name": "companyName", "type": [ "string", "null" ], "doc": "Type inferred from '\"Hortonworks Inc.\"'" }, { "name": "primaryExchange", "type": [ "string", "null" ], "doc": "Type inferred from '\"Nasdaq Global Select\"'" }, { "name": "sector", "type": [ "string", "null" ], "doc": "Type inferred from '\"Technology\"'" }, { "name": "calculationPrice", "type": [ "string", "null" ], "doc": "Type inferred from '\"close\"'" }, { "name": "open", "type": [ "double", "null" ], "doc": "Type inferred from '16.3'" }, { "name": "openTime", "type": [ "long", "null" ], "doc": "Type inferred from '1542033000568'" }, { "name": "close", "type": [ "double", "null" ], "doc": "Type inferred from '15.76'" }, { "name": "closeTime", "type": [ "long", "null" ], "doc": "Type inferred from '1542056400520'" }, { "name": "high", "type": [ "double", "null" ], "doc": "Type inferred from '16.37'" }, { "name": "low", "type": [ "double", "null" ], "doc": "Type inferred from '15.2'" }, { "name": "latestPrice", "type": [ "double", "null" ], "doc": "Type inferred from '15.76'" }, { "name": "latestSource", "type": [ "string", "null" ], "doc": "Type inferred from '\"Close\"'" }, { "name": "latestTime", "type": [ "string", "null" ], "doc": "Type inferred from '\"November 12, 2018\"'" }, { "name": "latestUpdate", "type": [ "long", "null" ], "doc": "Type inferred from '1542056400520'" }, { "name": "latestVolume", "type": [ "int", "null" ], "doc": "Type inferred from '4012339'" }, { "name": "iexRealtimePrice", "type": [ "double", "null" ], "doc": "Type inferred from '15.74'" }, { "name": "iexRealtimeSize", "type": [ "int", "null" ], "doc": "Type inferred from '43'" }, { "name": "iexLastUpdated", "type": [ "long", "null" ], "doc": "Type inferred from '1542056397411'" }, { "name": "delayedPrice", "type": [ "double", "null" ], "doc": "Type inferred from '15.76'" }, { "name": "delayedPriceTime", "type": [ "long", "null" ], "doc": "Type inferred from '1542056400520'" }, { "name": "extendedPrice", "type": [ "double", "null" ], "doc": "Type inferred from '15.85'" }, { "name": "extendedChange", "type": [ "double", "null" ], "doc": "Type inferred from '0.09'" }, { "name": "extendedChangePercent", "type": [ "double", "null" ], "doc": "Type inferred from '0.00571'" }, { "name": "extendedPriceTime", "type": [ "long", "null" ], "doc": "Type inferred from '1542059622726'" }, { "name": "previousClose", "type": [ "double", "null" ], "doc": "Type inferred from '16.24'" }, { "name": "change", "type": [ "double", "null" ], "doc": "Type inferred from '-0.48'" }, { "name": "changePercent", "type": [ "double", "null" ], "doc": "Type inferred from '-0.02956'" }, { "name": "iexMarketPercent", "type": [ "double", "null" ], "doc": "Type inferred from '0.03258'" }, { "name": "iexVolume", "type": [ "int", "null" ], "doc": "Type inferred from '130722'" }, { "name": "avgTotalVolume", "type": [ "int", "null" ], "doc": "Type inferred from '2042809'" }, { "name": "iexBidPrice", "type": [ "int", "null" ], "doc": "Type inferred from '0'" }, { "name": "iexBidSize", "type": [ "int", "null" ], "doc": "Type inferred from '0'" }, { "name": "iexAskPrice", "type": [ "int", "null" ], "doc": "Type inferred from '0'" }, { "name": "iexAskSize", "type": [ "int", "null" ], "doc": "Type inferred from '0'" }, { "name": "marketCap", "type": [ "int", "null" ], "doc": "Type inferred from '1317308142'" }, { "name": "peRatio", "type": [ "double", "null" ], "doc": "Type inferred from '-7.43'" }, { "name": "week52High", "type": [ "double", "null" ], "doc": "Type inferred from '26.22'" }, { "name": "week52Low", "type": [ "double", "null" ], "doc": "Type inferred from '15.2'" }, { "name": "ytdChange", "type": [ "double", "null" ], "doc": "Type inferred from '-0.25696247383444343'" } ] }

Messages to Slack

File: ${'filename'}
Offset: ${'kafka.offset'}
Partition: ${'kafka.partition'}
Topic: ${'kafka.topic'}
UUID: ${'uuid'}
Record Count: ${'record.count'}
File Size: ${fileSize:divide(1024)}K

See jsonpath.com

Splits
$.*.quote
$.*.chart
$.*.news

Array to Single
$.*

GETHTTP

URL: https://api.iextrading.com/1.0/stock/market/batch?symbols=hdp&types=quote,news,chart&range=1y&last=25000

FileName: marketbatch.hdp.${'hdp':append(${now():format('yyyymmddHHMMSS'):append(${md5}):append('.json')})}

Data provided for free by IEX. View IEX's Terms of Use. IEX Real-Time Price: https://iextrading.com/developer/

Queries

SELECT * FROM FLOWFILE WHERE latestPrice > week52Low
SELECT * FROM FLOWFILE WHERE latestPrice <= week52Low

Example Output

File: 855957937589894
Offset: 22460
Partition: 0
Topic: iextradingquote
UUID: b2a8e797-2249-4689-9a78-4339ddb5ecb4
Record Count:
File Size: 3K

Data Visualization in Apache Zeppelin with Hive and Spark SQL

Creating tables on top of Apache ORC files in HDFS is easy.

Push Some Messages to Slack

Resources
https://phoenix.apache.org/hive_storage_handler.html
https://github.com/aol/druid/tree/master/docs/_graphics

Other Data Sources
https://www.kaggle.com/qks1lver/amex-nyse-nasdaq-stock-histories
https://github.com/qks1lver/redtide

Source
https://github.com/tspannhw/stocks-nifi-kafka

stocks-copy.json
stock-to-kafka.xml
11-15-2018
03:58 PM
Tested with HDF 3.2 as well.