1973 Posts
1225 Kudos Received
124 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2465 | 04-03-2024 06:39 AM |
|  | 3812 | 01-12-2024 08:19 AM |
|  | 2055 | 12-07-2023 01:49 PM |
|  | 3042 | 08-02-2023 07:30 AM |
|  | 4178 | 03-29-2023 01:22 PM |
04-28-2017
09:39 PM
3 Kudos
People were asking me when I would throw some more acronyms into an article, so I did. iBeacons are tiny devices that broadcast over Bluetooth Low Energy (BLE); they are very good for advertising and for signaling proximity. I have been thinking about the retail space, so I purchased a beacon, got a beacon simulator on my phone and found a really cool gateway device. It reads the BLE transmissions from these devices, packages them as JSON MQTT messages and sends them over WiFi to an MQTT broker of your choice. I am using a cloud-hosted MQTT broker to hold these, and Apache NiFi subscribes to the messages. This can happen amazingly fast: MQTT is very lightweight, NiFi is fast and BLE is fast. These tiny devices are constantly advertising their identity and other information. With multiple beacons in a building, store, room or facility, you can see who is close to where. As you might expect, there are multiple standards for this: iBeacon is Apple's and Eddystone is Google's. Fortunately this gateway works with both, and so can you. Walk into the Physical Web.

Use Cases

Happy Bubbles Technology

In Apache NiFi, it doesn't matter what the data is or where it comes from; it's easy as pie to ingest beacon data via MQTT and store it in ORC files for Hive table queries. It's just as easy to visualize and analyze the data in Apache Zeppelin against these Hive tables. What I like about this gateway from Happy Bubbles, besides the awesome hippo logo, is that it is dead simple to set up: plug it in, wait a bit, connect to its WiFi, then set your WiFi and MQTT settings. Then you are running! They also provide an open source server you can use to see the data. If BLE messages are in iBeacon format, they get one JSON format; if they are in Google Eddystone format, a second; anything else gets a generic format. I have three listeners in NiFi to grab them all (see the sketch after the example messages below).
Example Messages from happy-bubbles-ble:
{"rssi":-63, "heap_free":27912}
{"hostname": "happy-bubbles-ble",
"mac": "fa909c522836",
"rssi": -51,
"is_scan_response": "1",
"type": "0",
"data": "020a000816f0ff640000000011094d696e69426561636f6e5f3331303033"}
{"hostname": "happy-bubbles-ble",
"beacon_type": "ibeacon",
"mac": "fa909c522836",
"rssi": -51,
"uuid": "e2c56db5dffb48d2b060d0f5a71096e0",
"major": "0000",
"minor": "0000",
"tx_power": "c5"}
Phones

There's a great free application for iPhone called Locate Beacon that lets you simulate beacons. This is great for testing. The inexpensive beacon I bought from Happy Bubbles is a Minew MS49_nrf51822, and there's an app for that called BeaconSET, which lets you set properties and see the beacon over your phone's Bluetooth.

DDL

CREATE EXTERNAL TABLE IF NOT EXISTS beaconstatus (rssi INT, heap_free INT)
STORED AS ORC LOCATION '/beacons';
CREATE EXTERNAL TABLE IF NOT EXISTS ibeacon (hostname STRING, beacon_type STRING, mac STRING, rssi INT, uuid STRING, major STRING, minor STRING, tx_power STRING)
STORED AS ORC LOCATION '/beacons/ibeacon';
CREATE EXTERNAL TABLE IF NOT EXISTS beacongateway (hostname STRING, mac STRING, rssi INT, is_scan_response STRING, type STRING, data STRING)
STORED AS ORC LOCATION '/beacons/gateway';
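As a taste of that Zeppelin-style analysis, here is a hedged Python sketch querying the ibeacon table defined above over HiveServer2; PyHive and the localhost connection details are assumptions, and any Hive client would do:

from pyhive import hive

conn = hive.Connection(host="localhost", port=10000)  # assumed HiveServer2
cur = conn.cursor()
# Average signal strength per beacon, a rough proximity indicator
cur.execute("SELECT mac, uuid, AVG(rssi) AS avg_rssi "
            "FROM ibeacon GROUP BY mac, uuid ORDER BY avg_rssi DESC")
for row in cur.fetchall():
    print(row)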
Reference
https://en.wikipedia.org/wiki/IBeacon
https://en.wikipedia.org/wiki/Bluetooth_low_energy
http://developer.estimote.com/ibeacon/tutorial/part-3-ranging-beacons/
https://en.wikipedia.org/wiki/MQTT
https://developers.google.com/beacons/proximity/guides
https://www.beaconzone.co.uk/BeaconTriggerDataAndServers
https://developer.apple.com/ibeacon/
https://developers.google.com/beacons/
https://www.happybubbles.tech/presence/detector
https://www.happybubbles.tech/presence/docs/setup/
https://github.com/happy-bubbles/
https://itunes.apple.com/us/app/locate-beacon/id738709014?mt=8
04-28-2017
02:58 AM
I have not tried it, but there's no reason you couldn't. You could also send it to a remote cluster via Site-to-Site.
04-02-2017
09:16 PM
2 Kudos
The QueryDatabaseTable processor can easily ingest data from a table based on an incrementing key. A sequence ID or primary key that is autogenerated, as PostgreSQL and MariaDB do, is ideal. An incrementing date or an Oracle sequence ID also works; as long as the value increments with each new row, you can use it as the maximum-value column. If your tables don't have this, you could write a trigger or procedure in your database that copies rows into a transaction table with such an autogenerated ID, and NiFi will grab that. Clearly, real CDC involves reading write-ahead logs or transaction logs at a deep level and grabbing all changes. That is coming, and can already be done with tools like Attunity plus NiFi. For the use cases I have, I just need to grab new rows when they are added to a table, and I control the ID. I convert from Avro to JSON so I can extract attributes, since I want to do some routing based on column values: based on one field in the table, I determine where I land the data. It can be sent to HBase (and Phoenix), HDFS or Hive. I split my records for easy processing. There is one thing I highly recommend you do for SQL safety and to prevent errors: set your SQL attributes explicitly, as described below.

Example SQL for CDC:

upsert into trials (trialid, trialdescription, fileName) values (1,'FENTANYL','5ab2d068-dd53-4674-bcf8-17f7d80d0553')
CREATE EXTERNAL TABLE IF NOT EXISTS trials2 (trialid INT, trialdescription STRING, trialtype STRING) STORED AS ORC
LOCATION '/hiveorc';
CREATE TABLE trials (trialid integer not null primary key, trialdescription varchar, filename varchar);
Set your SQL attributes for SQL safety. The types are the numeric values of the JDBC types: 12 is VARCHAR (String) and -5 is BIGINT. Then your SQL is standard JDBC syntax with ?'s as placeholders. Here is some cool data: I used the Google Location API, called via a NiFi REST call, to enrich some data and get latitude and longitude from a vague location. This kind of thing happens with Twitter data all the time.
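For illustration, here is a small Python sketch of that attribute convention; the statement and values mirror the trials example above, and the dict stands in for the flow file attributes that feed PutSQL:

JDBC_VARCHAR = 12  # java.sql.Types.VARCHAR
JDBC_BIGINT = -5   # java.sql.Types.BIGINT

# Standard JDBC syntax with ?'s as placeholders
sql = "upsert into trials (trialid, trialdescription, filename) values (?, ?, ?)"

# One sql.args.N.type / sql.args.N.value pair per placeholder
attributes = {
    "sql.args.1.type": str(JDBC_BIGINT),  "sql.args.1.value": "1",
    "sql.args.2.type": str(JDBC_VARCHAR), "sql.args.2.value": "FENTANYL",
    "sql.args.3.type": str(JDBC_VARCHAR), "sql.args.3.value": "5ab2d068-dd53-4674-bcf8-17f7d80d0553",
}

for key in sorted(attributes):
    print(key, "=", attributes[key])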
Reference:
https://www.mockaroo.com/
https://community.hortonworks.com/articles/51902/incremental-fetch-in-nifi-with-querydatabasetable.html
04-02-2017
08:58 PM
6 Kudos
Monitoring Apache NiFi

It's really important to pick some reporting tasks to let you know what's happening on your Apache NiFi servers. The Ambari reporting task will send metrics to your HDF Ambari, which will show the results in nice Grafana graphs, charts and tables. You can also monitor disk usage and memory, and send metrics to DataDog, Ganglia and other servers. It's also easy to write your own reporting task if you need a different one. One of the ways to monitor your Apache NiFi data flows is to use the MonitorActivity processor, which will create messages that can be sent to your operations dashboard, console or elsewhere. For people doing ChatOps, you can easily push these messages to Slack; there's a processor for that: PutSlack. You could also send a REST call to HipChat or other chat tools. It's pretty easy to wrap that up in a custom processor as well.

Other Things to Monitor

REST Endpoints

server:port/nifi-api/system-diagnostics

See: https://nifi.apache.org/docs/nifi-docs/rest-api/

Logs

...nifi/logs/nifi-app.log and ...nifi/logs/nifi-user.log

These can be ingested with Apache NiFi for detailed log processing. You can filter and send some messages to SumoLogic or elsewhere via Apache NiFi.

See: https://community.hortonworks.com/content/kbentry/67309/routing-logs-through-apache-nifi-to-phoenix-hdfs-a.html
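For the system-diagnostics endpoint above, here's a quick Python poll; the unsecured localhost:8080 address is an assumption:

import requests

resp = requests.get("http://localhost:8080/nifi-api/system-diagnostics", timeout=10)
resp.raise_for_status()
snapshot = resp.json()["systemDiagnostics"]["aggregateSnapshot"]

# A few of the health numbers the endpoint reports
print("Heap used:", snapshot["usedHeap"])
print("Heap utilization:", snapshot["heapUtilization"])
print("Available cores:", snapshot["availableProcessors"])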
03-31-2017
09:39 PM
6 Kudos
FlowFile Continuation
Sometimes you need to back up your currently running flow, let that flow run at a later date, or make a backup of what is in process now. You want this in permanent storage so you can reconstitute it later, like orange juice, and add it back into the flow or restart it. This could be due to failures, for integration testing, for testing new versions of components, as a checkpoint, or for many other purposes. You don't always want to reprocess the original source or files (they may be gone).

Option 1: Save the raw data that came in originally to local files or HDFS, then read it out of there later.

Option 2 (preferred): MergeContent to FlowFileV3, then reload with Get* to IdentifyMimeType to UnpackContent. Use MergeContent with the FlowFileV3 option. After that step you can PutFile, PutS3Object, PutHDFS or use other file-saving options, or perhaps send it to an FTP or SFTP server for storage elsewhere. Now you have a pkg file:

cat /opt/demo/flow/904381478117605.pkg
NiFiFF3+tempf73.02sql.args.2.value29.7sql.args.11.type3roll353.9306742667328
mqtt.brokertcp://m13.cloudmqtt.com:14162sql.args.4.type3uuid$9f2f8b6f-2870-40a3-a460-49427cddf9a8
mqtt.topicsensorsql.args.7.type3sql.args.7.value353.9306742667328path./sql.args.4.value33.9sql.args.9.value-0.0sql.args.1.type1humidity29.7pitch14.015266431562901
nf.file.path.mqtt.qos0sql.args.8.type3temp33.9sql.args.1.value34sql.args.2.type3sql.args.10.type3sql.args.8.value128.4983979122009sql.args.5.type3sql.args.6.value14.015266431562901sql.args.3.value1011.1sql.args.10.value-0.0mqtt.isDuplicatefalspressure1011.1mqtt.isRetainedfalseyaw128.4983979122009cputemp3filename904381478117605sql.args.11.value1.0sql.args.9.type3x-0.0y-0.0z1.0sql.args.6.type3
nf.file.name904381478117605sql.args.5.value73.02sql.args.3.type3�[{"tempf": 73.02, "pressure": 1011.1, "pitch": 14.015266431562901, "temp": 33.9, "yaw": 128.4983979122009, "humidity": 29.7, "cputemp": "34", "y": -0.0, "x": -0.0, "z": 1.0, "roll": 353.9306742667328}]%
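Note the dump begins with the NiFiFF3 magic bytes. A tiny Python sketch (using the example file above) can confirm a saved package is a FlowFileV3, which is essentially the signature IdentifyMimeType keys on:

MAGIC = b"NiFiFF3"

with open("/opt/demo/flow/904381478117605.pkg", "rb") as f:
    header = f.read(len(MAGIC))

# The FlowFile v3 package format announces itself in the first bytes
print("FlowFile v3 package" if header == MAGIC else "unknown format")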
You can now reload that FlowFileV3 at any time, send it to IdentifyMimeType (so NiFi knows it's a FlowFileV3) and then use UnpackContent to reconstitute the original flow file. You can use it as if it never stopped and was never sent to disk. Now you have an unlimited queue to store pre- or partially-processed files. Saving time! You could run really expensive processes once, save the preprocessed items, files or models, and reuse them everywhere. In MergeContent, choose: FlowFile Stream, v3. Thanks to Joe Witt for the explanation of the process.

Reference:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.UnpackContent/
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/
03-24-2017
05:57 PM
Have you connected to the data from SQL?
03-19-2017
04:37 PM
6 Kudos
Phone Tracking with OwnTracks and Apache NiFi 1.x

OwnTracks is an open source project that provides an iOS app and an Android app with which your smartphone records its current location. I installed the OwnTracks application for iOS; it lets you specify your own REST JSON server to receive calls, so I pointed it at Apache NiFi. It can also send via MQTT directly to an on-premises Mosquitto feeding NiFi, or to CloudMQTT. You just need to enter your Apache NiFi address and port.
Tell NiFi to listen for HTTP on port 9179 for the phone push, and allow it to use GET, POST and PUT. No coding required. Respond to the phone with HTTP status code 200 and use the context map to connect the HTTP request and response. We pull the attributes out of the JSON flow file and store our phone data in Apache Phoenix on HBase:

upsert into phone (uuid,battery,longitude ,accelerator ,velocity,vac ,latitude , tvalue ,connection , tst , altitude , messagetype , tid, httpremotehost, useragent, filename, datetime)
values ('${'uuid'}','${'battery'}','${'longitude'}','${'accelerator'}','${'velocity'}',
'${'vac'}','${'latitude'}',
'${'tvalue'}','${'connection'}','${'tst'}','${'altitude'}','${'messagetype'}','${'tid'}',
'${'http.remote.host'}','${'http.headers.User-Agent'}','${'filename'}','${now()}')
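If you want to watch the phone push outside NiFi, here is a minimal standard-library Python sketch playing the role of the HTTP listener: it accepts the OwnTracks JSON POST on port 9179 and answers with a 200. The lat/lon/tst keys come from the OwnTracks JSON documentation:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class OwnTracksHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        msg = json.loads(self.rfile.read(length) or b"{}")
        # OwnTracks location messages carry lat/lon/tst among other fields
        print(msg.get("_type"), msg.get("lat"), msg.get("lon"), msg.get("tst"))
        self.send_response(200)  # the phone expects a 200 acknowledgment
        self.end_headers()

HTTPServer(("", 9179), OwnTracksHandler).serve_forever()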
Results in Zeppelin

Reference
http://owntracks.org/
http://owntracks.org/booklet/
http://owntracks.org/booklet/tech/json/
http://osmand.net/build_it
https://diaspod.de/posts/156379
http://owntracks.org/booklet/tech/http/
https://itunes.apple.com/us/app/mqttitude/id692424691?mt=8
03-17-2017
09:47 PM
4 Kudos
IoT

Working with IoT data is a many-layered process, not unlike a parfait. Scratch that: an onion. In fact, an Onion Omega2, a great device that I just got yesterday, makes IoT really easy. It is so much easier to set up than an RPi or other platforms, and it has a ton of pluggable modules that stack on top of this small chip. It's pretty low-powered, but it's under $10, and the device is extremely well documented at their site. Since I needed to run real tools, I added a USB stick and used it for storage and for extra swap space.

opkg update
opkg install kmod-usb-storage-extras e2fsprogs kmod-fs-ext4
umount /tmp/mounts/USB-A1
mkfs.ext4 /dev/sda1
mkdir /mnt/sda1
mount /dev/sda1 /mnt/sda1
mount /dev/sda1 /mnt/ ; tar -C /overlay -cvf - . | tar -C /mnt/ -xf - ; umount /mnt/
opkg update
opkg install block-mount
opkg update
opkg install swap-utils block-mount
dd if=/dev/zero of=/tmp/mounts/USB-A1/swap.page bs=1M count=256
mkswap /tmp/mounts/USB-A1/swap.page
swapon /tmp/mounts/USB-A1/swap.page
free
block detect > /etc/config/fstab
Adding GPS

ls /dev/ttyACM*
cat /dev/ttyACM0
opkg update
opkg install ogps
ubus list
/etc/init.d/rpcd restart
ubus call gps info
Using the GPS Expansion, this is the JSON data returned from the utility:

{"age":0,"latitude":"40.2807","longitude":"-74.6418","elevation":"38.4","course":"","speed":"N"}

I then added Python and the Paho MQTT Python client for sending messages to my cloud MQTT broker.

opkg install python
See: https://docs.onion.io/omega2-docs/installing-and-using-python.html#onion-python-modules
opkg install python-pip
pip install --upgrade setuptools
pip install paho-mqtt
crontab -e
/etc/init.d/cron restart
*/1 * * * * /opt/demo/run.sh
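For reference, here is a rough Python sketch of what a cron-driven script like /opt/demo/run.sh could invoke: read the GPS fix via ubus and publish it with paho-mqtt. The broker host, port, credentials and topic are placeholders:

import json
import subprocess
import paho.mqtt.client as mqtt

# 'ubus call gps info' returns the JSON shown above (latitude/longitude/...)
fix = subprocess.check_output(["ubus", "call", "gps", "info"])
payload = json.dumps(json.loads(fix))

client = mqtt.Client()
client.username_pw_set("USER", "PASSWORD")      # placeholder credentials
client.connect("broker.example.com", 1883, 60)  # placeholder broker
client.publish("sensor/gps", payload, qos=0)    # placeholder topic
client.disconnect()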
Once the data was sent to an MQTT broker, it was easy to ingest with Apache NiFi: I ingest MQTT messages from the broker, extract fields from the JSON file, format some parameters for SQL, build my SQL string, and then upsert into Phoenix/HBase. This is the beautiful web console that comes prerunning on the tiny Onion Omega2 device. Report from the table in Apache Zeppelin:
Reference
https://docs.onion.io/omega2-docs/first-time-setup.html
https://docs.onion.io/omega2-docs/expansion-dock.html
https://docs.onion.io/omega2-docs/connecting-to-the-omega-terminal.html#connecting-to-the-omega-terminal-ssh
https://docs.onion.io/omega2-docs/gps-expansion.html
https://docs.onion.io/omega2-docs/using-gps-expansion.html#using-gps-expansion
https://github.com/OnionIoT/onion-gpio-sysfs/tree/master/python/examples
https://github.com/OnionIoT/Onion-Sensor-Server
https://lede-project.org/docs/guide-quick-start/start
https://docs.onion.io/omega2-docs/boot-from-external-storage.html
https://wiki.onion.io/Tutorials/Extending-RAM-with-a-swap-file
https://docs.onion.io/omega2-docs/extending-omega-memory.html
03-14-2017
02:12 PM
Build your MicroSD card with www.pibakery.org. It lets you preconfigure boot, WiFi, ...
03-12-2017
04:53 PM
2 Kudos
There are two great additions you can make to your current Hive. The first is HPL/SQL, which brings stored procedure programming to the Hadoop world. The second is HiveMall, which brings advanced functions and machine learning to your Hive queries.

HPL/SQL

HPL/SQL is included in Hive 2.0 and will be included with Hive 2.1 on HDP 2.6.
You can manually download and install it now.
It is Hybrid Procedural SQL on Hadoop. For developers coming from Oracle and SQL Server, these procedures will feel very familiar and will let you port a lot of your existing PL/SQL and T-SQL code over to Hive. This gives you another interface to Hive and Hadoop; it will be included in future Hadoop releases and be tied into the very fast Hive LLAP in 2.1.

HPL/SQL
https://community.hortonworks.com/content/idea/43847/hplsql-make-sql-on-hadoop-more-dynamic.html
http://www.hplsql.org/connections
http://www.hplsql.org/cli
http://www.hplsql.org/download
http://www.hplsql.org/start

To Run a Stored Procedure

cd hplsql-0.3.17
./hplsql -f proc.pl

HPL/SQL Stored Procedure Example

create procedure fn_test1 (VarOne char(25))
BEGIN
SET plhql
execute immediate 'set hive.exec.dynamic.partition.mode=nonstrict';
execute immediate 'set hive.exec.dynamic.partition=true';
execute immediate 'SET hive.execution.engine=tez';
print VarOne;
set VarOne = Upper(VarOne);
if (VarOne not in ('STUFF', 'STUFF2'))
BEGIN
print 'Bad Data';
RETURN -1; END
print 'Good Data';
END;
call fn_test1('STUFF');
./hplsql -f proc.pl
Call
17/03/09 20:04:03 INFO jdbc.Utils: Supplied authorities: localhost:10000
17/03/09 20:04:03 INFO jdbc.Utils: Resolved authority:
localhost:10000
Open connection: jdbc:hive2://localhost:10000 (266 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (1 ms)
*STUFF*
Good Data

Apache HiveMall

HiveMall was developed by developers from Treasure Data, NTT and Hortonworks.

https://community.hortonworks.com/articles/67983/apache-hive-with-apache-hivemall.html
https://www.slideshare.net/HadoopSummit/hivemall-scalable-machine-learning-library-for-apache-hivesparkpig
http://hivemall.incubator.apache.org/userguide/getting_started/permanent-functions.html
http://hivemall.incubator.apache.org/userguide/getting_started/installation.html
http://github.com/myui/hivemall
http://hivemall.incubator.apache.org

set hivevar:hivemall_jar=hdfs:///apps/hivemall/hivemall-with-dependencies.jar;
source /opt/demo/define-all-as-permanent.hive;

HiveMall is a scalable machine learning library built as a collection of Hive UDFs that you can run through Hive, Spark and Pig. It brings very cool processing to your Hive queries and your Zeppelin, Pig and Spark code. You will be able to combine HiveMall machine learning with stored procedures on in-memory, fast LLAP Hive. This is revolutionary. You can also run this against near-real-time Apache NiFi streams.
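As a hedged smoke test, a short Python sketch can confirm the HiveMall UDFs are registered and callable; PyHive and the localhost HiveServer2 address are assumptions:

from pyhive import hive

conn = hive.Connection(host="localhost", port=10000)  # assumed HiveServer2
cur = conn.cursor()
cur.execute("SELECT hivemall_version()")  # a HiveMall utility UDF
print(cur.fetchone())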