Configure Livy in Ambari

Until the related CSRF issue is fixed, set

livy.server.csrf_protection.enabled ==> false

in Ambari under Spark Config - Advanced livy-conf

Install Sparkmagic

Details see

Install Jupyter, if you don't already have it:

$ sudo -H pip install jupyter notebook ipython

Install Sparkmagic:

$ sudo -H pip install sparkmagic

Install Kernels:

$ pip show sparkmagic # check the Location path, e.g. /usr/local/lib/python2.7/site-packages
$ cd /usr/local/lib/python2.7/site-packages 
$ jupyter-kernelspec install --user sparkmagic/kernels/sparkkernel 
$ jupyter-kernelspec install --user sparkmagic/kernels/pysparkkernel
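
The path lookup done by hand in the first two commands can also be scripted. The following is a minimal sketch, assuming Python 3 and that sparkmagic is installed in the environment that runs it; the helper names `kernel_dirs` and `installed_kernel_dirs` are made up for illustration and are not part of Sparkmagic.

```python
import importlib.util
import os

# The two kernels installed in the commands above.
KERNELS = ("sparkkernel", "pysparkkernel")


def kernel_dirs(site_packages, kernels=KERNELS):
    """Return the kernel spec directories below a given site-packages path."""
    return [
        os.path.join(site_packages, "sparkmagic", "kernels", name)
        for name in kernels
    ]


def installed_kernel_dirs():
    """Locate sparkmagic through the import system; return [] if it is missing."""
    spec = importlib.util.find_spec("sparkmagic")
    if spec is None or not spec.submodule_search_locations:
        return []
    package_dir = list(spec.submodule_search_locations)[0]
    # The parent of the package directory is the site-packages directory.
    return kernel_dirs(os.path.dirname(package_dir))


if __name__ == "__main__":
    # Print the install commands instead of running them.
    for path in installed_kernel_dirs():
        print("jupyter-kernelspec install --user " + path)
```

The script only prints the commands, so the output can be checked before anything is installed.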

Install Sparkmagic widgets

$ sudo -H jupyter nbextension enable --py --sys-prefix widgetsnbextension

Create local Configuration

The configuration file is a JSON file stored under ~/.sparkmagic/config.json.

To avoid timeouts when connecting to HDP 2.5, it is important to add

"livy_server_heartbeat_timeout_seconds": 0

To ensure the Spark job runs on the cluster (the Livy default is local), spark.master needs to be set to yarn-cluster. Therefore a conf object needs to be provided (here you can also add extra jars for the session):

"session_configs": {
    "driverMemory": "2G",
    "executorCores": 4,
    "executorMemory": "8G",
    "proxyUser": "bernhard",
    "conf": {
        "spark.master": "yarn-cluster",
        "spark.jars.packages": "com.databricks:spark-csv_2.10:1.5.0"
    }
}

The proxyUser is the user the Livy session will run under.

Here is an example config.json. Adapt it and copy it to ~/.sparkmagic.
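
The settings discussed above can also be assembled and written from a short script. This is only a sketch: it covers just the keys mentioned in this section (a complete config.json may contain further entries), and the function names `build_config` and `write_config` are hypothetical helpers, not part of Sparkmagic.

```python
import json
import os


def build_config(proxy_user, packages=None):
    """Assemble the config entries described in this section."""
    conf = {"spark.master": "yarn-cluster"}
    if packages:
        # Several package coordinates are passed as one comma-separated string.
        conf["spark.jars.packages"] = ",".join(packages)
    return {
        "livy_server_heartbeat_timeout_seconds": 0,
        "session_configs": {
            "driverMemory": "2G",
            "executorCores": 4,
            "executorMemory": "8G",
            "proxyUser": proxy_user,
            "conf": conf,
        },
    }


def write_config(config, directory=None):
    """Write config.json into the given directory (default: ~/.sparkmagic)."""
    directory = directory or os.path.expanduser("~/.sparkmagic")
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "config.json")
    with open(path, "w") as handle:
        json.dump(config, handle, indent=4)
    return path
```

For example, `write_config(build_config("bernhard", ["com.databricks:spark-csv_2.10:1.5.0"]))` would reproduce the fragment shown above. Note that it overwrites an existing ~/.sparkmagic/config.json, so back that file up first.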

Start Jupyter Notebooks

1) Start Jupyter:

$ cd <project-dir> 
$ jupyter notebook

In Notebook Home select New -> Spark or New -> PySpark or New -> Python

2) Load Sparkmagic:

Add the following to your notebook after the kernel has started:

In[ ]:  %load_ext sparkmagic.magics

3) Create Endpoint

In[ ]:  %manage_spark

This will open a connection widget


Username and password can be ignored on non-secured clusters.

4) Create a session:

Once the endpoint has been created successfully, create a session:


Note that it uses the endpoint created above and, under properties, the configuration from config.json.

When the new session appears in the widget, the Spark session has started successfully. Note the following:


  • Livy on HDP 2.5 currently does not return the YARN Application ID.
  • The Jupyter session name provided under Create Session is internal to the notebook and is not used by the Livy server on the cluster. The Livy server creates sessions on YARN called livy-session-###, e.g. livy-session-10. The session in Jupyter will have session id ###, e.g. 10.
  • For multiline Scala code in the notebook, you have to put the dot at the end of each line, as in
val df = sqlContext.read.
                format("com.databricks.spark.csv").
                option("header", "true").
                option("inferSchema", "true").
                load("<path-to-csv>")
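
The session naming rule from the notes above can be sketched in code. The sketch assumes that Livy's REST endpoint `GET /sessions` returns a JSON object with a `sessions` list whose entries carry `id` and `state`; the helper names and the server URL are hypothetical.

```python
import json
from urllib.request import urlopen


def yarn_app_name(session_id):
    """Map a Livy/Jupyter session id to the YARN application name."""
    return "livy-session-{}".format(session_id)


def summarize_sessions(payload):
    """Turn a /sessions response into (id, YARN name, state) tuples."""
    return [
        (entry["id"], yarn_app_name(entry["id"]), entry.get("state", "unknown"))
        for entry in payload.get("sessions", [])
    ]


def fetch_sessions(livy_url):
    """Fetch the session list from a Livy server (needs network access)."""
    with urlopen(livy_url.rstrip("/") + "/sessions") as response:
        return json.loads(response.read().decode("utf-8"))
```

With a reachable server, `summarize_sessions(fetch_sessions("http://<livy-host>:8998"))` would list the sessions under the names to look for in the YARN ResourceManager; the host is a placeholder, and 8998 is Livy's default port.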


Thanks to Alex (@azeltov) for the discussions and debugging session


This is great. And can be even better if you fix the broken links to image. Thanks


Useful for getting SparkMagic to run w/ Jupyter. And the images do not seem to load for me either, still good how-to tech article for Jupyter.