
Q1. Which is the better option for analyzing web server logs: (1) HDF NiFi with Apache Zeppelin, or (2) a new Zeppelin notebook using Spark? Q2. Which is more cost-effective?


burman@Bob

1 ACCEPTED SOLUTION


Re: Q1. Which is the best option in IT industry to analyze Web Server Log : 1. HDF NiFi with Apache Zeppelin OR 2. new Zeppelin notebook using Spark? Q2. Which is cost effective?

Guru

NiFi makes it easy to capture logs. Why not use each technology where it is strongest: NiFi to gather log data in real time -> Kafka queue -> Spark Streaming analytics -> Zeppelin for Spark and visualization. You could also fork the NiFi flow to MergeContent and write to HDFS to keep the data for historical analysis.

All technologies come out-of-the-box with HDF and HDP.
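
To make the analytics stage a bit more concrete, here is a minimal sketch in plain Python (not NiFi, Kafka, or Spark code) of the kind of parsing and counting that stage would perform on each batch of log lines. It assumes the web server writes Apache Common Log Format, which this thread does not actually specify; the regular expression, the function names `parse_line` and `summarize`, and the sample lines are all illustrative assumptions.

```python
import re
from collections import Counter

# Assumption: records follow the Apache Common Log Format, for example:
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no response body was sent; treat it as 0 bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def summarize(lines):
    """Count requests per status code and per path, skipping unparseable lines."""
    status_counts = Counter()
    path_counts = Counter()
    skipped = 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            skipped += 1
            continue
        status_counts[record["status"]] += 1
        path_counts[record["path"]] += 1
    return status_counts, path_counts, skipped


if __name__ == "__main__":
    # Hypothetical sample input; real input would arrive from the log pipeline.
    sample = [
        '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
        '10.0.0.5 - bob [10/Oct/2000:13:56:01 -0700] "GET /missing HTTP/1.1" 404 -',
        'not a log line',
    ]
    status_counts, path_counts, skipped = summarize(sample)
    print("requests per status:", dict(status_counts))
    print("requests per path:", dict(path_counts))
    print("skipped lines:", skipped)
```

In the pipeline described above, the same per-line parsing would presumably run inside the Spark job, with the counts displayed in Zeppelin; the sketch only shows the logic, not the distributed execution.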


5 REPLIES

Re: Q1. Which is the best option in IT industry to analyze Web Server Log : 1. HDF NiFi with Apache Zeppelin OR 2. new Zeppelin notebook using Spark? Q2. Which is cost effective?

Hi Greg,

Thanks for your valuable feedback. I am quite new to this field and am currently trying to implement this at my company. I would also like your opinion on a tutorial I found:

"http://hortonworks.com/hadoop-tutorial/how-to-refine-and-visualize-server-log-data/"

I prefer to use only Hortonworks tutorials.

Best Regards,

Bob


Re: Q1. Which is the best option in IT industry to analyze Web Server Log : 1. HDF NiFi with Apache Zeppelin OR 2. new Zeppelin notebook using Spark? Q2. Which is cost effective?

Guru

@Bibhas Burman That is an excellent tutorial for pushing log data to HDFS for historical analysis. If you want to do real-time streaming analysis, here are two links that should be useful:

http://hortonworks.com/hadoop-tutorial/realtime-event-processing-nifi-kafka-storm/ (ignore the Storm part)

https://community.hortonworks.com/articles/44550/horses-for-courses-apache-spark-streaming-and-apac.... (integrate with the Kafka part from the first link)

Since you are getting your feet wet with the technology, definitely put in some time to play around with it and build small projects before working toward your end product. And of course, anytime you have a question along the way, ask the HCC for guidance.


Re: Q1. Which is the best option in IT industry to analyze Web Server Log : 1. HDF NiFi with Apache Zeppelin OR 2. new Zeppelin notebook using Spark? Q2. Which is cost effective?

It is very informative and helped me a lot. Thanks.


Re: Q1. Which is the best option in IT industry to analyze Web Server Log : 1. HDF NiFi with Apache Zeppelin OR 2. new Zeppelin notebook using Spark? Q2. Which is cost effective?

Hi Greg,

A simple question: if someone asks me why we should use Apache NiFi and Big Data technology to analyze log files when we already have Microsoft Log Parser, what should I say? What is the advantage of using Big Data technology such as HDF NiFi?
