So with Hortonworks Data Flow : Flow Processing I can augment any stream of data in motion. My GPS data stream from my RPIWZ is returning latitude and longitude. That is valuable data that can be input to a lot of different APIs. One that seemed relevant to me was the current weather conditions where the device was. You could also do some predictive analytics to figure where you might be and check the current weather conditions or a weather forecast depending on how long it will take you to get there.
A GPS is giving us latitude and longitude that can be used as input to many kinds of APIs. Once use case that works for a moving device is checking the current weather conditions where you are. Weather Underground has a REST API call that will convert Lat/Long to City/State (there are other APIs that will do this as well, please post them). Once I have City and State I use that to get current weather conditions where this device is.
Apache NiFi Flow
Augment Live Data
: create two variables (latitude and longitude) by extracting JSON fields from the flow file sent via MQTT. $.latitude and $.longitude
: Let Nifi decide what this AVRO record should look like. I will add a part three using the Schema Registry and Apache NiFi 1.2 that will use a schema lookup.
: And make it SNAPPY!
MergeContent: As AVRO.
MergeContent: As AVRO.
: ORC is the optimal choice for Hive LLAP and Spark SQL to query the data.
: Point to your configuration Nifi relative file system file /etc/hadoop/conf/core-site.xml. I like to set for replace conflict resolution and set your directory. Make sure you created your file directory and Nifi has permissions to
The quickest is to log into a machine with HDFS client.
Weather Underground has a free API that can use up to a certain # of calls. You will easily blast past that if you are tracking live objects, even if you decide to only enrich when points change or at 15 minute intervals.
For testing, you may want to store weather underground results in JSON files in a directory and use GetFile to stand in for those calls. The same with data from your devices. Download the files from the Data Provenance and you
can use them as stand-ins for live data for integration testing, especially when you may be on a plane or somewhere where you cannot reach your device.