This tutorial is part two of a two-part series. In this tutorial, we'll verify Spark 2.1 functionality using Zeppelin on an HDP 2.6 cluster deployed using Cloudbreak. The first tutorial covers using Cloudbreak to deploy the cluster. You can find the first tutorial here: HCC Article

Prerequisites

  • You should have completed part one of this tutorial series and have an HDP 2.6 cluster with Spark 2.1, deployed using Cloudbreak, up and running.

Scope

This tutorial was tested in the following environment:

  • Cloudbreak 1.14.4
  • AWS EC2
  • HDP 2.6
  • Spark 2.1
  • Zeppelin 0.7

Steps

Log in to Ambari

As mentioned in the prerequisites, you should already have a cluster built using Cloudbreak. Click on the cluster summary box in the Cloudbreak UI to display the cluster details. Now click on the link to your Ambari cluster. You may see something similar to this:

15724-security-warning-1.png

Your screen may vary depending on your browser of choice. I'm using Chrome. This warning appears because the cluster uses self-signed certificates, which browsers do not trust by default. Click on the ADVANCED link. You should see something similar to this:

15725-security-warning-2.png

Click on the Proceed link to open the Ambari login screen. You should be able to log in to Ambari using admin as both the username and the password.
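If you prefer to check Ambari from a script rather than the browser, the same self-signed certificate issue applies. The following is a minimal sketch, not part of the original tutorial: it assumes Ambari's standard REST endpoint /api/v1/clusters, the default admin/admin credentials mentioned above, and a base URL that you supply yourself (the address in the usage comment is a placeholder). The helper names are hypothetical.

```python
import base64
import ssl
import urllib.request


def build_ambari_request(base_url, user="admin", password="admin"):
    """Build (but do not send) a GET request for Ambari's cluster list."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(base_url.rstrip("/") + "/api/v1/clusters")
    req.add_header("Authorization", "Basic " + token)
    return req


def unverified_context():
    """TLS context that skips certificate checks.

    Only appropriate for a test cluster with self-signed certificates.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode is relaxed
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def ambari_is_up(base_url):
    """Return True if Ambari answers the cluster-list request with HTTP 200."""
    req = build_ambari_request(base_url)
    with urllib.request.urlopen(req, context=unverified_context(), timeout=10) as resp:
        return resp.status == 200


# Usage (placeholder address, replace with the Ambari link from the Cloudbreak UI):
# ambari_is_up("https://<your-ambari-address>")
```

Skipping certificate verification is the scripted equivalent of clicking Proceed in the browser, so it should be limited to throwaway clusters like this one.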

Log in to Zeppelin

Now click on the Zeppelin component in the component status summary. You should see something similar to this:

15726-zeppelin-summary.png

Click on the Quicklinks link. You should see something similar to this:

15727-zeppelin-quicklink.png

Click on the Zeppelin UI link. This will load Zeppelin in a new browser tab. You should see something similar to this:

15728-zeppelin-login-1.png

You should notice the blue Login button in the upper right corner of the Zeppelin UI. Click on this button. You should see something similar to this:

15729-zeppelin-login-2.png

You should be able to log in to Zeppelin using admin as both the username and the password. Once you log in, you should see something similar to this:

15730-zeppelin-login-3.png

Load Getting Started Notebook

Now let's load the Apache Spark in 5 Minutes notebook by clicking on the Getting Started link. You should see something similar to this:

15731-zeppelin-notebook-1.png

Click on the Apache Spark in 5 Minutes notebook. You should see something similar to this:

15732-zeppelin-notebook-2.png

This shows the Zeppelin interpreters associated with this notebook. As you can see, the spark2 and livy2 interpreters are enabled. Click the blue Save button. You should see something similar to this:

15733-zeppelin-notebook-3.png

This notebook defaults to using the Spark 2.x interpreter. You should be able to run the paragraphs without any changes. Scroll down to the notebook paragraph called Verify Spark Version. Click the play button on this paragraph. You should see something similar to this:

15734-zeppelin-notebook-4.png

You should notice the Spark version is 2.1.0.2.6.0.3-8. This confirms we are using Spark 2.1. It also confirms that Zeppelin is able to properly interact with Spark 2 on our HDP 2.6 cluster built with Cloudbreak. Try running the next two paragraphs. These paragraphs download a JSON file from GitHub and then move it to HDFS on our cluster. Now run the Load data into a Spark DataFrame paragraph. You should see something similar to this:

15735-zeppelin-notebook-5.png

As you can see, the DataFrame should be properly loaded from the JSON file.
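The version string reported by the Verify Spark Version paragraph packs several pieces of information into one value. As a rough sketch, and assuming the usual HDP convention of a three-part component version followed by a four-part HDP release and a build number, it can be split like this. The function name split_hdp_version is hypothetical and used only for illustration.

```python
import re


def split_hdp_version(version):
    """Split '<component version>.<HDP release>-<build>' into its parts.

    Assumes three numeric fields for the component version and four for
    the HDP release, which matches the string shown in this tutorial.
    """
    match = re.fullmatch(r"(\d+\.\d+\.\d+)\.(\d+\.\d+\.\d+\.\d+)-(\d+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    spark, hdp, build = match.groups()
    return {"spark": spark, "hdp": hdp, "build": int(build)}


info = split_hdp_version("2.1.0.2.6.0.3-8")
print(info)  # {'spark': '2.1.0', 'hdp': '2.6.0.3', 'build': 8}
```

Read this way, the string indicates Spark 2.1.0 running on an HDP 2.6 release, which matches the environment listed in the Scope section.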

Next Steps

Try running the remaining paragraphs to ensure everything is working correctly. For an extra challenge, try running some of the other Spark 2 notebooks that are included. You can also attempt to modify the Spark 1.6 notebooks to work with Spark 2.1.

15736-zeppelin-notebook-6.png

Review

If you have successfully followed along with this tutorial, you should have been able to confirm Spark 2.1 works on our HDP 2.6 cluster deployed with Cloudbreak.

Comments

When I try to launch the Zeppelin UI, I get this error:

error.jpg