I am new to the world of Hadoop and Hortonworks, and I am trying to build a solution that rivals Splunk and Graylog (we don't have the budget to buy or implement those solutions, and they don't meet our analytics needs).
I have to ingest log/NetFlow data from network devices and from Linux and Windows hosts, and then perform analytics on top of it. The types of analysis I need to run on this log data include malware analysis, user behavior modeling, detecting errors in OS logs, etc.
So far, I have created a cluster of RHEL VMs and am installing HDP on them using Ambari.
What other components will I need to complete this solution? Documents and forums point to Apache NiFi, Apache Metron, etc., but I am not sure how all these components would fit into my solution.
Any advice is appreciated. Thank you for reading.
I would recommend starting with a Metron development environment to get familiar with the concepts. The docs are here:
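To make the pipeline a bit more concrete: whichever ingestion layer you end up with (NiFi flows, Metron parser topologies), the first real step is normalizing raw log lines into structured records that analytics can consume. Here is a minimal illustrative sketch in Python of that normalization step; the regex and the field names are my own assumptions for the example, not Metron's actual parser schema.

```python
import json
import re

# Illustrative only: a minimal syslog-style parser of the kind a Metron
# parser topology (or a NiFi flow) would apply to each ingested line.
# The regex and field names below are assumptions, not Metron's schema.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>[\w\-/]+)(?:\[(?P<pid>\d+)\])?:\s"
    r"(?P<message>.*)$"
)

def parse_syslog_line(line: str) -> dict:
    """Normalize one syslog-style line into a flat, JSON-ready dict."""
    m = LINE_RE.match(line)
    if not m:
        # Keep unparsed lines instead of dropping them, so downstream
        # analytics can still count and inspect parse failures.
        return {"raw": line, "parse_ok": False}
    fields = {k: v for k, v in m.groupdict().items() if v is not None}
    fields["parse_ok"] = True
    return fields

if __name__ == "__main__":
    sample = "Feb  3 13:45:01 web01 sshd[4321]: Failed password for root from 10.0.0.5"
    print(json.dumps(parse_syslog_line(sample), indent=2))
```

Once every source emits records in one flat schema like this, the behavior-modeling and error-detection jobs on the HDP side can be written against fields (`host`, `process`, `message`) instead of per-device log formats.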