Repo Description

Here is a new Zeppelin notebook, part of the Hortonworks Gallery on GitHub, which can be used as a template for analysing web server log files using Spark and Zeppelin. This notebook was ported from an original Jupyter notebook that was part of an edX online course, "Introduction to Apache Spark", sponsored by Databricks. It is written using "pyspark", Zeppelin's Python interpreter for Spark.

You can import this notebook into your own instance of Zeppelin using the "Import Note" button on the home page: copy the URL below and paste it into the "Add from URL" box.

Here is the URL of the actual Zeppelin notebook (note.json) on hortonworks-gallery:

https://github.com/hortonworks-gallery/zeppelin-notebooks/blob/master/2BXSE1MV8/note.json
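The URL above is a GitHub "blob" page, which serves an HTML view of the file. If your Zeppelin instance needs the raw JSON instead (an assumption, since this may vary by version), the raw address can be derived from the blob address. The helper below is a hypothetical sketch, not part of the notebook; it assumes the branch name contains no slashes.

```python
from urllib.parse import urlparse


def github_blob_to_raw(url):
    """Convert a github.com 'blob' URL into its raw.githubusercontent.com form.

    Hypothetical helper for illustration; assumes the URL path has the shape
    <account>/<repo>/blob/<branch>/<path...> and a branch name without slashes.
    """
    parts = urlparse(url)
    segments = parts.path.strip("/").split("/")
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a GitHub blob URL: " + url)
    account, repo, _, branch = segments[:4]
    file_path = "/".join(segments[4:])
    return f"https://raw.githubusercontent.com/{account}/{repo}/{branch}/{file_path}"


raw_url = github_blob_to_raw(
    "https://github.com/hortonworks-gallery/zeppelin-notebooks/blob/master/2BXSE1MV8/note.json"
)
print(raw_url)
```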

Here is the link to view the notebook on Zeppelin Hub:

ZeppelinHub Notebook

The source data is an actual HTTP Web Server log taken from the NASA Apollo website.
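To give a flavour of the kind of analysis involved, here is a minimal sketch of parsing log lines and counting response codes. It is not taken from the notebook: it assumes the log is in Apache Common Log Format, the sample lines are invented for illustration, and the names `LOG_PATTERN` and `parse_log_line` are hypothetical. It uses plain Python so it runs without a cluster.

```python
import re
from collections import Counter

# Assumed layout (Apache Common Log Format):
# host identity user [timestamp] "method endpoint protocol" status bytes
LOG_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+)\s*(\S*)" (\d{3}) (\d+|-)$'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    size = match.group(9)
    return {
        "host": match.group(1),
        "timestamp": match.group(4),
        "method": match.group(5),
        "endpoint": match.group(6),
        "protocol": match.group(7),
        "status": int(match.group(8)),
        # A "-" in the size field means no content was returned.
        "bytes": 0 if size == "-" else int(size),
    }


# Invented sample lines, not real data from the log.
sample_lines = [
    'host1.example.com - - [01/Aug/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1839',
    'host2.example.com - - [01/Aug/1995:00:00:07 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'host1.example.com - - [01/Aug/1995:00:00:09 -0400] "GET /images/logo.gif HTTP/1.0" 200 786',
    'this line is not a valid log entry',
]

# Drop lines that fail to parse, then count responses by status code.
records = [r for r in map(parse_log_line, sample_lines) if r is not None]
status_counts = Counter(r["status"] for r in records)
print(status_counts)
```

In a "pyspark" paragraph the same function could be applied to an RDD, for example with `sc.textFile(path).map(parse_log_line)` followed by a filter that drops `None` values, where `path` stands for wherever you have stored the log file.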

Repo Info
GitHub repo URL: https://github.com/hortonworks-gallery/zeppelin-notebooks
GitHub account name: hortonworks-gallery
Repo name: zeppelin-notebooks
Version history
Revision #: 1 of 1
Last update: 09-15-2016 08:35 PM