Contributor
Posts: 27
Registered: ‎04-09-2016
Accepted Solution

Configure Cloudera CDH 5 for Real time Web logging dashboard analytics in HUE

I want to do a proof of concept (POC) based on the blog post "How-to: Do Real-Time Log Analytics with Apache Kafka and Cloudera Search". I have around 200 files of real-time log data on 20 different servers. My questions are below:

 

1) What approach should I use to pull the data from these 20 servers?


2) How can I configure the Cloudera CDH 5 virtual machine to integrate Kafka with these log servers so that I can build a MapReduce task?

Expert Contributor
Posts: 101
Registered: ‎01-24-2014

Re: Configure Cloudera CDH 5 for Real time Web logging dashboard analytics in HUE

From some quick searching, this blog post [1] seems to do what I think you intend: take the logs, store them in Kafka, and distribute them to various consumers, one of which is Cloudera Search (Solr). You could make it simpler and store directly to Solr [2] if you aren't planning on having multiple consumers read the same data. Instead of Logstash, you could also use Flume [3] [4].

 

[1] https://www.elastic.co/blog/logstash-kafka-intro

[2] https://github.com/lucidworks/solrlogmanager

[3] http://www.cloudera.com/documentation/archive/search/1-3-0/Cloudera-Search-User-Guide/csug_flume_sol...

[4] http://blog.cloudera.com/blog/2014/11/flafka-apache-flume-meets-apache-kafka-for-event-processing/
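
If you would rather prototype the "logs into Kafka" step by hand before setting up Logstash or Flume, here is a rough Python sketch of what a shipper on each log server could do. It is only an illustration under several assumptions: the logs are taken to be in Apache common log format, the topic name `weblogs`, the `server` field, and the helper names `parse_line` and `ship` are all made up for this example, and the actual publish call is passed in as `send` so the sketch does not depend on any particular Kafka client.

```python
import json
import re
from typing import Callable, Iterable, Optional

# Assumption: Apache common log format. Adjust the pattern to your real logs.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line: str) -> Optional[dict]:
    """Turn one log line into a flat dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    doc = match.groupdict()
    doc["status"] = int(doc["status"])
    # The common log format writes "-" when no bytes were sent.
    doc["size"] = 0 if doc["size"] == "-" else int(doc["size"])
    return doc


def ship(
    lines: Iterable[str],
    server: str,
    send: Callable[[str, bytes], object],
    topic: str = "weblogs",  # hypothetical topic name
) -> int:
    """Parse each line, tag it with its source server, and hand it to `send`.

    Returns the number of messages sent; lines that do not parse are skipped.
    """
    sent = 0
    for line in lines:
        doc = parse_line(line.rstrip("\n"))
        if doc is None:
            continue
        doc["server"] = server
        send(topic, json.dumps(doc, sort_keys=True).encode("utf-8"))
        sent += 1
    return sent
```

With the kafka-python client, for example, `send` could be the `send` method of a `KafkaProducer` created with your broker list in `bootstrap_servers`. A consumer on the other side would then read the JSON messages and index them into Solr. For anything beyond a quick test, Flume or Logstash already handle tailing, retries, and batching, so they are likely the better choice.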