Brandon Wilson has a great article that shows how to use the "CACHE TABLE" command in Tableau. More recent drivers have since come out, and you can now connect directly to the thriftserver using a Spark SQL driver. This walkthrough uses HDP 2.5 and SimbaSparkOdbc.

First, pull up a Tableau connection and select the thriftServer. I also had to open port 10015 in VirtualBox.

6918-thrift-server-connect.png
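Before retrying the Tableau connection, it can help to confirm that the port is actually reachable from the machine running Tableau. The following is a minimal sketch, not part of the original setup: the function name `port_is_open` is hypothetical, and the host you pass in depends on how your VirtualBox networking is configured. Only port 10015 comes from the steps above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only checks that something accepts TCP connections on the port.
    It does not verify that the listener is the Spark thriftserver.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `port_is_open("127.0.0.1", 10015)` would check a sandbox whose port is forwarded to localhost; that address is an assumption, so substitute whatever host your VM exposes.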

Next, if you don't have the driver, Tableau will take you to a page where you can download a Spark SQL driver. Inside that package, choose this driver.

6919-driver.png

Once you establish a valid connection, you will see Tableau flag the connections based on the driver. Below you can see the Hive connection from Brandon's article and the new Spark connection.

6920-hive-vs-spark.png

Next, using the CACHE command, enter the statement below into Tableau's initial SQL box.

6932-cache.png
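For readers who cannot see the screenshot, the statement is a Spark SQL `CACHE TABLE` command. Its exact text in the screenshot is not reproduced here, so the sketch below is an assumption: it builds the simplest form of the statement for a given table name, and the helper `cache_table_sql` is hypothetical. The table name `crimes` is taken from the storage check in the next step.

```python
import re

# Accept a plain identifier or a database-qualified one (db.table).
# This is a deliberately strict check for illustration, not Spark's full grammar.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def cache_table_sql(table: str) -> str:
    """Build a Spark SQL statement that caches the named table in memory."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return f"CACHE TABLE {table}"


# Text to paste into Tableau's initial SQL box (assumed table name).
print(cache_table_sql("crimes"))
```

Pasting the printed statement into the initial SQL box asks Spark to cache the table when Tableau opens the connection, which is what the storage check in the next step confirms.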

Finally, check Spark's storage to confirm that the warehouse/crimes table is in memory. The same check works for any table of your choosing.

6931-spark-storage.png

Some visuals from Tableau.

6933-crime-per-location.png

6934-crime-district.png

6935-crime-weapon.png

6936-crime-trend.png
