From Edge to AI: This tutorial walks you through the process of creating a workflow that reads data from edge sensors (a Raspberry Pi 3 in this case) and ingests it into Hive through a NiFi workflow. A MiNiFi flow running on the device pushes data to a remote NiFi instance, which then uses the Hive streaming feature to ingest the data into Hive tables. Once ingested, the data can be analyzed with the Hive SQL interface, Spark, Zeppelin, and numerous other tools designed for the Hadoop platform.
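To give a concrete picture of the Hive end of this flow, the sketch below shows the kind of target table that Hive streaming ingest (as used by NiFi's Hive streaming processor) expects: ORC-backed, bucketed, and transactional. The table name, columns, and partitioning here are hypothetical placeholders; your own flow will use whatever schema matches the sensor data.

```sql
-- Hypothetical target table for sensor readings. Hive streaming ingest
-- requires an ORC, bucketed, ACID (transactional) table.
CREATE TABLE IF NOT EXISTS sensor_readings (
  sensor_id   STRING,
  reading_ts  TIMESTAMP,
  temperature DOUBLE,
  humidity    DOUBLE
)
PARTITIONED BY (reading_date STRING)
CLUSTERED BY (sensor_id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');

-- Once data lands, it can be queried from the Hive SQL interface, e.g.:
SELECT sensor_id, AVG(temperature) AS avg_temp
FROM sensor_readings
WHERE reading_date = '2019-01-01'
GROUP BY sensor_id;
```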
Along the way we will go through the installation and configuration of the following components.
NiFi - Apache NiFi, originally known as NiagaraFiles, is an integrated data logistics platform for automating the movement of data between disparate systems. It provides real-time control that makes it easy to manage the movement of data between any source and any destination. It is data source agnostic, supporting disparate and distributed sources of differing formats, schemas, protocols, speeds, and sizes, such as machines, geolocation devices, clickstreams, files, social feeds, log files, videos, and more. More details here: Apache doc, Hortonworks doc
MiNiFi - MiNiFi is a subproject of NiFi designed to solve the difficulties of managing and transmitting data feeds to and from the source of origin, often the first/last mile of the digital signal, enabling edge intelligence to adjust flow behavior and support bi-directional communication. Because the first mile of data collection (the far edge) is highly distributed and typically involves a very large number of end devices (i.e., IoT), MiNiFi carries over all the main capabilities of NiFi, with the exception of interactive command and control, creating a design-and-deploy paradigm that makes uniform management of a vast number of devices practical. It also means that MiNiFi has a much smaller footprint than NiFi, less than 40 MB, depending on which option is selected: MiNiFi with the Java agent, or the native C++ agent. More details here: Apache doc, Hortonworks doc
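To make the MiNiFi side more concrete, below is a trimmed, hypothetical config.yml sketch for the Java agent that tails a sensor log on the Pi and forwards each line to a remote NiFi input port over Site-to-Site. The field names follow config version 3; the IDs, file path, hostname, and port name are placeholders, and in practice this file is usually generated from a NiFi template with the MiNiFi Toolkit (config.sh transform) rather than written by hand.

```yaml
# Hypothetical MiNiFi config.yml sketch (config version 3): tail a local
# sensor log and send its contents to a remote NiFi input port via Site-to-Site.
MiNiFi Config Version: 3
Flow Controller:
  name: pi-sensor-flow

Processors:
- id: 11111111-1111-1111-1111-111111111111
  name: TailSensorLog
  class: org.apache.nifi.processors.standard.TailFile
  scheduling strategy: TIMER_DRIVEN
  scheduling period: 1 sec
  auto-terminated relationships list: []
  Properties:
    File to Tail: /home/pi/sensor-readings.log   # placeholder path

Connections:
- id: 22222222-2222-2222-2222-222222222222
  name: TailSensorLog/success/From MiNiFi
  source id: 11111111-1111-1111-1111-111111111111
  source relationship names:
  - success
  destination id: 33333333-3333-3333-3333-333333333333   # remote input port id

Remote Process Groups:
- id: 44444444-4444-4444-4444-444444444444
  name: Remote NiFi
  url: http://nifi-host.example.com:9090/nifi             # placeholder NiFi URL
  timeout: 30 secs
  yield period: 10 sec
  transport protocol: HTTP
  Input Ports:
  - id: 33333333-3333-3333-3333-333333333333              # id of the NiFi input port
    name: From MiNiFi
    max concurrent tasks: 1
    use compression: false
```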