05-20-2017
02:46 PM
I am going to try a reboot and see if I get a different message.
2017-05-20 10:37:53,174 ERROR [Timer-Driven Process Thread-3] o.a.n.p.standard.QueryDatabaseTable QueryDatabaseTable[id=65501581-100f-1159-01e0-ac486442f2d0] Unable to execute SQL select query SELECT * FROM TRIALS due to org.apache.nifi.processor.exception.ProcessException: Error during database query or conversion of records to Avro.: {}
org.apache.nifi.processor.exception.ProcessException: Error during database query or conversion of records to Avro.
at org.apache.nifi.processors.standard.QueryDatabaseTable.lambda$onTrigger$0(QueryDatabaseTable.java:289)
at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2377)
at org.apache.nifi.processors.standard.QueryDatabaseTable.onTrigger(QueryDatabaseTable.java:283)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1118)
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:144)
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: ERROR 1101 (XCL01): ResultSet is closed.
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:441)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at org.apache.phoenix.jdbc.PhoenixResultSet.checkOpen(PhoenixResultSet.java:215)
at org.apache.phoenix.jdbc.PhoenixResultSet.next(PhoenixResultSet.java:772)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.nifi.processors.standard.util.JdbcCommon.convertToAvroStream(JdbcCommon.java:103)
at org.apache.nifi.processors.standard.QueryDatabaseTable.lambda$onTrigger$0(QueryDatabaseTable.java:287)
... 13 common frames omitted
Labels: Apache NiFi
05-12-2017
06:30 PM
Those region settings are Amazon S3 specific, so for MinIO you should probably just keep the same region. MinIO runs on one machine and is not part of the AWS infrastructure.
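If it helps, here is a minimal sketch of pointing an S3 client at MinIO rather than AWS; the endpoint URL and credentials below are placeholders for your own MinIO setup:
# Hypothetical sketch: connect boto3 to a local MinIO server instead of AWS S3.
# The endpoint and credentials are placeholders; no AWS region is involved.
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',    # your MinIO server, not an AWS endpoint
    aws_access_key_id='MINIO_ACCESS_KEY',
    aws_secret_access_key='MINIO_SECRET_KEY',
)
print(s3.list_buckets())                     # quick check that the client can talk to MinIO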
05-09-2017
07:19 PM
1 Kudo
Did you shut down and restart NiFi? Also, does the user running NiFi have permissions to all the directories and files, and valid Kerberos credentials? Can you access HBase from the NiFi machine using the HBase CLI or Python?
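If you want to script that check in Python, here is a minimal sketch using happybase; it assumes the HBase Thrift server is running on its default port, and the hostname and table name are placeholders:
# Minimal HBase connectivity check from the NiFi machine, via the Thrift server.
# 'hbase-host' and 'mytable' are placeholders; assumes the Thrift server was
# started (hbase thrift start) and is listening on the default port 9090.
import happybase

connection = happybase.Connection('hbase-host', port=9090)
print(connection.tables())            # list tables to confirm connectivity
table = connection.table('mytable')
for key, data in table.scan(limit=5): # read a few rows to confirm permissions
    print(key, data)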
05-09-2017
01:45 PM
5 Kudos
See Part 1: https://community.hortonworks.com/articles/101679/iot-ingesting-gps-data-from-raspberry-pi-zero-wire.html
Augment Your Streaming Ingested Data
With Hortonworks DataFlow flow processing, I can augment any stream of data in motion. My GPS data stream from the Raspberry Pi Zero Wireless (RPIWZ) returns latitude and longitude. That is valuable data that can be input to many different APIs. One that seemed relevant to me was the current weather conditions where the device is. You could also do some predictive analytics to figure out where you might be headed and check the current conditions or a forecast there, depending on how long it will take you to arrive.
A GPS gives us latitude and longitude that can be used as input to many kinds of APIs. One use case that works for a moving device is checking the current weather conditions where you are. Weather Underground has a REST API call that converts lat/long to city/state (there are other APIs that do this as well; please post them). Once I have city and state, I use them to get the current weather conditions where the device is.
Apache NiFi Flow
Augment Live Data
EvaluateJsonPath: create two attributes (latitude and longitude) by extracting the JSON fields $.latitude and $.longitude from the flow file sent via MQTT.
InvokeHTTP: http://api.wunderground.com/api/TheOneKey/geolookup/q/${latitude},${longitude}.json (see the Python sketch after this flow description for the same two REST calls).
EvaluateJsonPath: create two attributes (city and state) by extracting the JSON fields $.location.city and $.location.state from the flow file returned by the REST call.
InvokeHTTP: http://api.wunderground.com/api/OneKeyToRule/conditions/q/${state}/${city:urlEncode()}.json. The city may contain spaces and other HTTP GET unfriendly characters, hence the urlEncode().
EvaluateJsonPath: extract all the weather fields I am interested in, such as $.current_observation.dewpoint_string.
UpdateAttribute: a few field names are not valid in Avro schemas, such as ${'Last-Modified'}; I convert that one to lastModified.
AttributesToJSON: convert just the fields I want into a new flow file. Set the Destination property to flowfile-content.
temp_c,temp_f,wind_mph,wind_degrees,temperature_string,weather,feelslike_string,state,longitude,windchill_string,wind_string,relative_humidity,pressure_mb,observation_time,city,windchill_f,windchill_c,latitude,precip_today_string,lastModified,dewpoint_string,heat_index_string,visibility_mi
InferAvroSchema: let NiFi decide what this Avro record should look like. I will add a part three using the Schema Registry and Apache NiFi 1.2 that will use a schema lookup.
ConvertJSONtoAvro: and make it SNAPPY!
MergeContent: as Avro.
ConvertAvroToORC: ORC is the optimal choice for Hive LLAP and Spark SQL to query the data.
PutHDFS: point to your Hadoop configuration file on the NiFi-local file system, /etc/hadoop/conf/core-site.xml. I like to set conflict resolution to replace; also set your directory. Make sure you have created the directory and that NiFi has permissions to write there. The quickest way is to log into a machine with an HDFS client:
su hdfs
hdfs dfs -mkdir -p /rpwz/gpsweather
hdfs dfs -chmod -R 777 /rpwz
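For reference, the same two REST calls the flow makes can be sketched outside NiFi in Python; this is only an illustration, with YOUR_WU_KEY standing in for a real Weather Underground API key:
# Sketch of the enrichment chain: lat/long -> city/state -> current conditions.
# YOUR_WU_KEY is a placeholder; the JSON paths match the ones used in the flow.
import requests

latitude, longitude = 40.268141521, -74.529216408
geo = requests.get(
    'http://api.wunderground.com/api/YOUR_WU_KEY/geolookup/q/%s,%s.json'
    % (latitude, longitude)).json()
city = geo['location']['city']
state = geo['location']['state']

conditions = requests.get(
    'http://api.wunderground.com/api/YOUR_WU_KEY/conditions/q/%s/%s.json'
    % (state, requests.utils.quote(city))).json()  # URL-encode the city; it may contain spaces
print(conditions['current_observation']['dewpoint_string'])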
Enhancement Ideas:
Limit the number of calls to Weather Underground; try other weather services like NOAA and Weather Source. You may want to try other nations' weather APIs as well.
Create an External Hive Table and Display the Data (Zeppelin Can Do That)
Zeppelin lets me create the table with the DDL produced by the stream.
Then I can query it easily.
%jdbc(hive)
CREATE EXTERNAL TABLE IF NOT EXISTS gpsweather (observation_time STRING, dewpoint_string STRING, city STRING, windchill_f STRING, windchill_c STRING, latitude STRING, precip_today_string STRING, temp_c STRING, temp_f STRING, windchill_string STRING, wind_mph STRING, wind_degrees STRING, temperature_string STRING, weather STRING, feelslike_string STRING, wind_string STRING, heat_index_string STRING, state STRING, lastModified STRING, relative_humidity STRING, pressure_mb STRING, visibility_mi STRING, longitude STRING)
STORED AS ORC LOCATION '/rpwz/gpsweather'
%jdbc(hive)
select * from gpsweather
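If you would rather hit the table from a script instead of Zeppelin, here is a minimal PyHive sketch; the host and username are placeholders for your own HiveServer2 setup:
# Minimal PyHive sketch to run the same query outside Zeppelin.
# 'hive-host' and 'hdfs' are placeholders; assumes HiveServer2 on the default port 10000.
from pyhive import hive

conn = hive.Connection(host='hive-host', port=10000, username='hdfs')
cursor = conn.cursor()
cursor.execute('SELECT * FROM gpsweather LIMIT 10')
for row in cursor.fetchall():
    print(row)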
Reference:
http://www.wunderground.com/
https://openweathermap.org/forecast5
https://developers.google.com/maps/
Note:
Weather Underground has a free API tier that allows up to a certain number of calls. You will easily blast past that if you are tracking live objects, even if you decide to only enrich when points change or at 15 minute intervals.
For testing, you may want to store Weather Underground results as JSON files in a directory and use GetFile to stand in for those calls. The same goes for data from your devices. Download the files from Data Provenance and you can use them as stand-ins for live data for integration testing, especially when you may be on a plane or somewhere you cannot reach your device.
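As a sketch of that idea, a small script can capture live responses into a directory for GetFile to replay later; YOUR_WU_KEY and the output directory are placeholders:
# Sketch: save live Weather Underground responses as JSON files for GetFile to replay.
# YOUR_WU_KEY and the output directory are placeholders for your own setup.
import json
import os
import time
import requests

outdir = '/tmp/wu_testdata'                 # hypothetical directory for GetFile to watch
if not os.path.exists(outdir):
    os.makedirs(outdir)

points = [(40.268141521, -74.529216408)]    # sample point from the example JSON below
for i, (lat, lon) in enumerate(points):
    resp = requests.get(
        'http://api.wunderground.com/api/YOUR_WU_KEY/geolookup/q/%s,%s.json'
        % (lat, lon))
    with open(os.path.join(outdir, 'geolookup_%d.json' % i), 'w') as f:
        json.dump(resp.json(), f)
    time.sleep(1)                           # stay well under the free-tier rate limit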
05-08-2017
08:10 PM
1 Kudo
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_command-line-installation/content/install_sqoop_rpms.html
https://community.hortonworks.com/questions/79412/execute-sqoop-on-nifi.html
https://community.hortonworks.com/questions/25228/can-i-use-nifi-to-replace-sqoop.html
My example calling Flume from NiFi: https://community.hortonworks.com/articles/48271/using-apache-flume-sources-and-sinks-with-apache-n.html
https://community.hortonworks.com/questions/47957/using-nifi-to-quey-rdbms.html
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_command-line-installation/content/installing_flume.html
05-08-2017
07:23 PM
8 Kudos
Where in the world is Tim Spann, The NiFi Guy? I recommend the BU-353-S4 USB GPS; it works well with Raspberry Pis and is very affordable. Connecting it to a Raspberry Pi Zero Wireless (RPIWZ), I can run this on a small battery and bring it everywhere for location tracking. Put it on your delivery truck, chassis, train, plane, drone, robot, and more. I'll track myself so I can own the data.
What do these GPS Fields Mean?
EPS = Estimated Speed Error in meters/second
EPX = Estimated Longitude Error in meters
EPV = Estimated Vertical Error in meters
EPT = Estimated Timestamp Error
Speed = Speed!!!
Climb = Climb (positive) or sink (negative) rate, in meters per second of upward or downward movement
Track = Course over ground, in degrees from true north
Mode = NMEA mode; values are 0 = NA, 1 = No Fix, 2 = 2D fix, 3 = 3D fix
My question is: if you already have an estimate of the error, do something about it!!!
Python Walk Through
First install the utilities you need for GPS and Python. We also install NTP to get as accurate a time as possible.
sudo apt-get install gpsd gpsd-clients python-gps ntp
To make sure that everything works, try two of these GPS utilities (below). Make sure you have the USB GPS plugged in; you will need an adapter to convert the Pi Zero's micro USB port to full size. I then connect a small USB hub so I can attach the GPS unit as well as, sometimes, a mouse and keyboard. Get one of these, you will need it. You will also need a mini-to-full-size HDMI adapter. You only really need the mouse, keyboard, and monitor while you first set up WiFi. Once that's set up, just SSH into your device and forget it.
cgps
gpsmon
gpxlogger dumps XML data in GPX format.
gpspipe -l -o test.json -p -w -n 10 (without -o, output goes to STDOUT)
These will work from the command line and give you a readout. It will take a few seconds, or maybe a minute the first time, to calibrate. If you don't get any numbers, stick your GPS on a window or put it outside. If you have to run the GPS daemon manually: gpsd -n -D 2 /dev/ttyUSB0
I found some code to read the GPS sensor over USB with Python. From there I modified the code for a slower refresh and no UI, since I just want to send this data to Apache NiFi over MQTT using the Eclipse Paho MQTT client. One enhancement I have considered is an offline mode that saves all the data to a buffer and then, on reconnect, sends the backlog. You could also search for other WiFi signals and try to use open and free ones; you would probably then want to add SQL, encryption, and some other controls. Or you could install and use the Apache MiNiFi Java or C++ agent on the Zero.
#!/usr/bin/python
# Based on code written by Dan Mandle http://dan.mandle.me September 2012
# License: GPL 2.0
from gps import *
import time
import threading
import json
import paho.mqtt.client as paho

gpsd = None

class GpsPoller(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        global gpsd                    # bring it into scope
        gpsd = gps(mode=WATCH_ENABLE)  # start the stream of gpsd info
        self.running = True            # setting the thread running to true

    def run(self):
        global gpsd
        while self.running:
            gpsd.next()  # loop and grab EACH set of gpsd info to clear the buffer

if __name__ == '__main__':
    gpsp = GpsPoller()  # create the thread
    try:
        gpsp.start()    # start it up
        # connect to the MQTT broker once, rather than on every loop iteration
        client = paho.Client()
        client.username_pw_set("jrfcwrim", "UhBGemEoqf0D")
        client.connect("m13.cloudmqtt.com", 14162, 60)
        client.loop_start()  # background thread handles network traffic and keepalives
        while True:
            if gpsd.fix.latitude > 0:  # wait until we have a fix before publishing
                row = [{'latitude': str(gpsd.fix.latitude),
                        'longitude': str(gpsd.fix.longitude),
                        'utc': str(gpsd.utc),
                        'time': str(gpsd.fix.time),
                        'altitude': str(gpsd.fix.altitude),
                        'eps': str(gpsd.fix.eps),
                        'epx': str(gpsd.fix.epx),
                        'epv': str(gpsd.fix.epv),
                        'ept': str(gpsd.fix.ept),
                        'speed': str(gpsd.fix.speed),
                        'climb': str(gpsd.fix.climb),
                        'track': str(gpsd.fix.track),
                        'mode': str(gpsd.fix.mode)}]
                json_string = json.dumps(row)
                client.publish("rpiwzgps", payload=json_string, qos=0, retain=True)
            time.sleep(60)
    except (KeyboardInterrupt, SystemExit):  # when you press ctrl+c
        gpsp.running = False
        gpsp.join()  # wait for the thread to finish what it's doing
Example JSON Data
{"track": "0.0", "speed": "0.0", "utc": "2017-05-01T23:49:46.000Z", "epx": "8.938", "epv": "29.794", "altitude": "40.742", "eps": "23.66", "longitude": "-74.529216408", "mode": "3", "time": "2017-05-01T23:49:46.000Z", "latitude": "40.268141521", "climb": "0.0", "ept": "0.005"}
Ingest Any Data Any Where Any Time Any Dimension in Time and Space
1. ConsumeMQTT
2. InferAvroSchema
3. ConvertJSONtoAvro
4. MergeContent
5. ConvertAvroToORC
6. PutHDFS
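Before wiring up ConsumeMQTT, it can be useful to confirm the GPS JSON is actually arriving at the broker. A minimal Paho subscriber sketch, reusing the broker and topic from the publisher code above:
# Minimal Paho subscriber to confirm the GPS JSON is reaching the broker
# before pointing NiFi's ConsumeMQTT at the same topic.
import paho.mqtt.client as paho

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload)

client = paho.Client()
client.on_message = on_message
client.username_pw_set("jrfcwrim", "UhBGemEoqf0D")
client.connect("m13.cloudmqtt.com", 14162, 60)
client.subscribe("rpiwzgps", qos=0)
client.loop_forever()  # block and print each message as it arrives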
Visualize Whirled Peas
We turn this raw data into Hive tables and then visualize it as pretty tables and charts with Apache Zeppelin. You could also report with any ODBC or JDBC reporting tool, like Tableau or Power BI.
Store It Where?
su hdfs
hdfs dfs -mkdir -p /rpwz/gps
hdfs dfs -chmod -R 777 /rpwz/gps
If you store it you can query it!
Why Do I Love Apache NiFi? Let me count the ways... Instead of hand-rolling some Hive DDL, NiFi will automagically generate all the DDL I need based on an inferred Avro schema (soon using a Schema Registry lookup!!!). So you can easily drop in a file, convert it to ORC, save it to HDFS, generate an external Hive table, and query it in seconds. All with no coding. It is very easy to wire this up to send a message to a front end via WebSockets, JMS, AJAX, and so on. So we can drop a file in S3 or HDFS, convert it to ORC for mega-fast LLAP queries, and tell a front end what the table is so it can query it.
References:
http://www.danmandle.com/blog/getting-gpsd-to-work-with-python/
http://www.d.umn.edu/~and02868/projects.html
http://kingtidesailing.blogspot.com/2016/02/connect-globalsat-bu-353-usb-gps-to.html
http://blog.petrilopia.net/linux/raspberry-pi-gps-working/
https://community.hortonworks.com/articles/72420/ingesting-remote-sensor-feeds-into-apache-phoenix.html
https://www.systutorials.com/docs/linux/man/1-gpspipe/
https://www.systutorials.com/docs/linux/man/1-gps/
http://usglobalsat.com/p-688-bu-353-s4.aspx
https://www.raspberrypi.org/products/pi-zero-w/
Source Code:
https://community.hortonworks.com/repos/101680/rpi-zero-wireless-nifi-mqtt-gps.html?shortDescriptionMaxLength=140
Quick Tip:
Use Apache NiFi's scheduler to limit the number of calls you make to third-party services; a single NiFi instance can easily overwhelm the free tier of most services. I made 122 calls to Weather Underground in a few seconds. So set those timers! For weather, once every 15 or 30 minutes, or even once an hour, is good. Note: GPS information can also be read from drones, cars, phones, and lots of custom sensors in IIoT devices.
05-08-2017
05:40 PM
The complete code is in the Zeppelin Spark Python notebook referenced here: https://github.com/zaratsian/PySpark/blob/master/text_analytics_datadriven_topics.json
05-08-2017
05:39 PM
Thanks for the write-up. It is probably a permissions issue with writing to temp space and the Hive data warehouse HDFS structure. It is always good to check permissions. If you have Kerberos, required logins, or run as a different user, you may face issues as well.
05-05-2017
05:00 PM
Traceback (most recent call last):
File "testhive.py", line 1, in <module>
from pyhive import hive
File "/usr/local/lib/python2.7/site-packages/pyhive/hive.py", line 10, in <module>
from TCLIService import TCLIService
File "/usr/local/lib/python2.7/site-packages/TCLIService/TCLIService.py", line 9, in <module>
from thrift.Thrift import TType, TMessageType, TFrozenDict, TException, TApplicationException
ImportError: No module named thrift.Thrift
Any ideas?