Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Configure Cloudera CDH 5 for Real time Web logging dashboard analytics in HUE

Contributor

I want to do a PoC based on "How-to: Do Real-Time Log Analytics with Apache Kafka and Cloudera Search". I have around 200 real-time log files spread across 20 different servers. My questions are:

1) What approach should I use to pull the data from these 20 servers?

2) How can I configure the Cloudera CDH 5 virtual machine to integrate Kafka with these log servers and build a MapReduce job?

1 ACCEPTED SOLUTION

Master Collaborator

From some quick searching, this blog post [1] seems to do what I think you intend: take logs, store them in Kafka, and distribute them to various consumers, one of which is Cloudera Search (Solr). You could make it simpler and store directly to Solr [2] if you aren't planning to consume the same data in multiple places. Instead of Logstash, you could also use Flume [3] [4].

[1] https://www.elastic.co/blog/logstash-kafka-intro

[2] https://github.com/lucidworks/solrlogmanager

[3] http://www.cloudera.com/documentation/archive/search/1-3-0/Cloudera-Search-User-Guide/csug_flume_sol...

[4] http://blog.cloudera.com/blog/2014/11/flafka-apache-flume-meets-apache-kafka-for-event-processing/
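
To make the "take logs, store in Kafka" step a bit more concrete, here is a rough Python sketch of what a shipper on each log server does conceptually. It is only an illustration, not how Logstash or Flume are implemented: the function names `read_new_lines` and `ship`, the topic name `weblogs`, and the JSON event layout are all made up for this example, and `publish` stands in for whatever client actually sends to Kafka.

```python
import json
import socket
from typing import Callable, Iterator, Optional, TextIO


def read_new_lines(handle: TextIO) -> Iterator[str]:
    """Yield each complete line appended to the file since the last call."""
    while True:
        position = handle.tell()
        line = handle.readline()
        if not line.endswith("\n"):
            # Nothing new, or a half-written line: rewind and try again later.
            handle.seek(position)
            return
        yield line.rstrip("\n")


def ship(
    path: str,
    handle: TextIO,
    publish: Callable[[str, bytes], object],
    topic: str = "weblogs",  # hypothetical topic name
    host: Optional[str] = None,
) -> int:
    """Wrap each new log line in a small JSON event and hand it to `publish`.

    `publish(topic, payload)` is an assumed callback; the event fields
    (host, path, message) are an illustrative layout, not a standard.
    Returns the number of lines shipped.
    """
    host = host or socket.gethostname()
    count = 0
    for line in read_new_lines(handle):
        event = {"host": host, "path": path, "message": line}
        publish(topic, json.dumps(event).encode("utf-8"))
        count += 1
    return count
```

Calling `ship` in a loop (with a short sleep between passes) for each of the log files would approximate a tailing agent. If the kafka-python package happens to be available, `KafkaProducer.send` could probably be passed as `publish`, but in practice Logstash or a Flume agent already handles file rotation, retries, and checkpointing, so a hand-rolled shipper like this is mainly useful for understanding the data flow.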

