
Zeppelin already includes some basic charts. Visualizations are not limited to SparkSQL queries; output from any language backend can be recognized and visualized. This is sufficient for a data analyst, but if you are a data scientist, the built-in tools in Zeppelin just don't cut it.

If you are a Python programmer and have been working with data in IPython, you are almost certainly familiar with the matplotlib library.

You can install matplotlib from your Python shell, as long as Zeppelin uses that same Python installation. Once you do, the PySpark interpreter will be able to import matplotlib, and you will be able to create graphs and charts in the Zeppelin interface itself.
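One way to confirm that Zeppelin and your shell share a Python installation is to run the same small check in both places and compare the results. The sketch below is not part of the original article: `describe_python` is a hypothetical helper name, and it relies only on the standard library.

```python
import importlib
import sys


def describe_python(module_name="matplotlib"):
    """Report which interpreter is running and whether module_name imports."""
    try:
        module = importlib.import_module(module_name)
        # Not every module exposes __version__, so fall back to a placeholder.
        module_version = getattr(module, "__version__", "unknown")
    except ImportError:
        # The module is not installed for this interpreter.
        module_version = None
    return {
        "executable": sys.executable,
        "python_version": sys.version.split()[0],
        "module_version": module_version,
    }


print(describe_python())
```

If the `executable` path printed in a Zeppelin paragraph differs from the one printed in your shell, matplotlib was probably installed into a different Python than the one Zeppelin runs. A `module_version` of `None` means the import failed in that interpreter.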

Add the following code at the end of your script and modify it as needed. It is written for Python 2 and assumes the script has already run `import matplotlib.pyplot as plt`:

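import StringIO

def show(graph):
  # Render the figure as SVG into an in-memory buffer
  img = StringIO.StringIO()
  graph.savefig(img, format='svg')
  # The %html prefix makes Zeppelin render the output as HTML instead of plain text
  print "%html <div style='width:1000px'>" + img.getvalue() + "</div>"

# Pass the pyplot module, or any object that has a savefig method
show(plt)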
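If your Zeppelin interpreter runs Python 3, the snippet above will not work as written, because the `StringIO` module and the `print` statement no longer exist there. The following is a minimal sketch of an equivalent helper, not the original author's code. The name `figure_to_zeppelin_html` and the `width_px` parameter are illustrative assumptions; the only thing it expects from `figure` is a matplotlib-style `savefig` method.

```python
import io


def figure_to_zeppelin_html(figure, width_px=1000):
    """Return a Zeppelin %html string that embeds the figure as inline SVG.

    `figure` can be the pyplot module or a Figure object: anything with a
    matplotlib-style savefig(file_like, format=...) method.
    """
    buffer = io.StringIO()
    # SVG is text, so it can be written straight into a string buffer.
    figure.savefig(buffer, format="svg")
    # The %html prefix asks Zeppelin to render the output as HTML.
    return "%html <div style='width:{}px'>{}</div>".format(
        width_px, buffer.getvalue()
    )
```

In a notebook paragraph you would finish the script with `print(figure_to_zeppelin_html(plt))`. Returning the string instead of printing it inside the helper keeps the function easy to check outside Zeppelin.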