Created on 10-14-201607:16 PM - edited 08-17-201908:50 AM
Sometimes you have Java messages that you would like to easily ingest into HDFS or perhaps HDFS as raw files, Phoenix, Hive and other destinations. You can do that pretty easy with Apache NiFi 1.0.0 as part of HDF 2.0. For this simple example, I also added a REST gateway for bulk loading, testing and to provide another way to easily send JMS messages. ListenHTTP accepts HTTP POSTS on port 8099, which I made the listener port for that processor. It takes what you send and publishes that to a JMS queue. I am using ActiveMQ.
I have a little Python 2.7 script that I found on github that makes fake log records and modified it to send 1,000 JSON messages via REST to our REST to JMS gateway in NIFI for testing. You can easily do this with shell script and CURL, Apache JMeter, Java code, Go script and many other open source REST testers and clients.
I installed an ActiveMQ JMS broker as my example JMS server, which is very simple on Centos 7. All you need to do is download the gziped tar and untar it. It's ready to run with a chmod. That download also includes the client jar that we will need on the HDF 2.0 server for accessing the message queue server. You must also have the port open. On ActiveMQ that defaults to 61616. ActiveMQ also includes a nice web console that you may want to unblock that port for viewing the status of queues and messages. In my simple example, I am running JMS via: bin/activemq start > /tmp/smlog 2>&1 &;
I recommend changing your HTTP Listening Port, so you can run a bunch of these processors as needed.
Processors used: ConsumeJMS, MergeContent and PutHDFS.
You need to set Destination Name which is the name of the QUEUE in this case, but could also be the name of the Topic. I picked Destination Type of QUEUE since I am using a QUEUE in Apache ActiveMQ.
It's very easy to add more output processors for sinking data into Apache Phoenix, HBase, Hive, Email, Slack and other NoSQL stores. It's also easy to convert messages into AVRO, ORC and other optimized big data file formats.
As you see we get a number of jms_ attributes including priority, message ID and other attributes associated with the JMS message.