I want to do a poc for How-to: Do Real-Time Log Analytics with Apache Kafka and Cloudera Search,i have a around 200 files of real time log data in 20 different servers,,Below are my questions:-
1)What should be the approach, i mean to pull data from these 20 servers?
2)How can i CONFIGURE Cloudera CDH 5 Virtual Machine to integrate kafka with these log servers to build a map reduce task?