It's been a while, but I'm finally finishing this series on the Beast Mode Quotient. This article is fairly straightforward, as we look at how to build the model-training infrastructure using Zeppelin and Spark.


To run this install, you will need to have deployed a BMQ ephemeral cluster as detailed in this article and this repository.

Moreover, you will need to have gathered data on your fitness, as explained in part 1 of this series. Based on this data, I created 3 indexes:

  • Intensity Index: how intense the workout I did that day was (based on distance, pace, and elevation)
  • Fatigue Index: how much rest I had that day (based on hours of sleep and resting heart rate)
  • BMQ Index: how my BMQ fares compared to the max BMQ (5)

You can then aggregate these 3 indexes, correlating each day's BMQ and Fatigue indexes with the previous day's Intensity Index. All data will be shared in the last article of the series.
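As a sketch of that alignment, here is the same pairing logic in plain Python (the dates and index values below are made up for illustration; the actual work happens in the MySQL queries of Section 2):

```python
from datetime import date, timedelta

# Hypothetical per-day index values keyed by date (illustrative only)
bmq_index = {date(2019, 8, 2): 0.8, date(2019, 8, 3): 0.6}
fatigue_index = {date(2019, 8, 2): 0.3, date(2019, 8, 3): 0.5}
intensity_index = {date(2019, 8, 1): 0.7, date(2019, 8, 2): 0.9}

# Pair each day's BMQ and fatigue with the PREVIOUS day's workout intensity
training_set = [
    (d, bmq_index[d], fatigue_index[d], intensity_index[d - timedelta(days=1)])
    for d in sorted(bmq_index)
    if d in fatigue_index and d - timedelta(days=1) in intensity_index
]
```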

You will also notice that the model uses parameterized sleep and rest HR values. The whole flow will be revealed in part 4.


This tutorial is divided into the following sections:

  • Section 1: Create a mysql interpreter for JDBC in Zeppelin
  • Section 2: Create training set for BMQ prediction
  • Section 3: Create a prediction model
  • Section 4: Save the results in a table

Section 1: Create a mysql interpreter for JDBC in Zeppelin


Log in to Zeppelin, then go to the top-right corner > Admin > Interpreter and edit the jdbc interpreter.

Add the following parameters:

  • mysql.driver: com.mysql.jdbc.Driver
  • mysql.url: jdbc:mysql://localhost:3306/beast_mode_db
  • mysql.user: bmq_user
  • mysql.password: Be@stM0de
  • Add artifact: mysql:mysql-connector-java:5.1.38

Restart the interpreter and you should be good to go.

Section 2: Create training set for BMQ prediction

Create a new note called BMQ Predictions and add the following code, using the jdbc interpreter you just built.

Delete existing training tables if any


drop table if exists training_set

Create training set based on fatigue and intensity indexes


-- note: the column names after the dots were lost in formatting; a `date` column in each index table is assumed
create table training_set as (
select @rowid:=@rowid+1 as rowid,, bmq_index, fatigue_index, intensity_index
from bmq_index, fatigue_index, intensity_index, (select @rowid:=0) as init
where =
and date_sub(, INTERVAL 1 DAY) =
order by asc)

View Data


select * from training_set

Delete existing prediction tables if any


drop table if exists prediction

Create a table we want to apply the algo against


create table prediction as (
select date(training_date) as date, estimated_intensity_index,
round((
(1-((select (sleep_hours*60) from PREDICTION_PARAMETERS)/(select max(TOTAL_MINUTES_ASLEEP) from SLEEP_HISTORY)))*0.6 +
(1-((select min(REST_HR) from HEALTH_HISTORY)/(select rest_hr from PREDICTION_PARAMETERS)))*0.4
)*100,2) as estimated_fatigue_index,
0.0 as predicted_bmq
from training_plan)
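To make the estimated fatigue formula concrete, here is the same arithmetic in plain Python. The sample values are hypothetical; in the query above they come from the PREDICTION_PARAMETERS, SLEEP_HISTORY, and HEALTH_HISTORY tables:

```python
# Illustrative inputs (made-up values; the SQL reads these from MySQL tables)
sleep_hours = 7           # parameterized sleep, from PREDICTION_PARAMETERS
max_minutes_asleep = 480  # max(TOTAL_MINUTES_ASLEEP) from SLEEP_HISTORY
min_rest_hr = 50          # min(REST_HR) from HEALTH_HISTORY
rest_hr = 60              # parameterized resting heart rate, from PREDICTION_PARAMETERS

# Weighted blend: 60% sleep deficit, 40% resting-heart-rate elevation, scaled to 0-100
estimated_fatigue_index = round(
    ((1 - (sleep_hours * 60) / max_minutes_asleep) * 0.6
     + (1 - min_rest_hr / rest_hr) * 0.4) * 100, 2)
```

The more you sleep relative to your best night, and the closer your resting heart rate is to your minimum, the lower the estimated fatigue.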

View Data


select * from prediction

Section 3: Create a prediction model

DISCLAIMER: This model still needs work; the purpose of this article is to establish the overall architecture, not to deliver a fully tuned model, as I plan to keep improving it.

This part uses the Spark interpreter of Zeppelin to vectorize, normalize, and train a model.

Create dataframe from MySQL tables:

import org.apache.spark.sql.SQLContext

val sqlcontext = new org.apache.spark.sql.SQLContext(sc)

val df = sqlcontext.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/beast_mode_db").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "training_set").option("user", "bmq_user").option("password", "Be@stM0de").load()

val target_df = sqlcontext.read.format("jdbc").option("url", "jdbc:mysql://localhost:3306/beast_mode_db").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "prediction").option("user", "bmq_user").option("password", "Be@stM0de").load()

Vectorize Dataframes

import org.apache.spark.ml.feature.VectorAssembler

val assembler1 = new VectorAssembler().
  setInputCols(Array("fatigue_index", "intensity_index")).
  setOutputCol("features")

val assembler2 = new VectorAssembler().
  setInputCols(Array("estimated_fatigue_index", "estimated_intensity_index")).
  setOutputCol("features")

Normalize Dataframes

import org.apache.spark.ml.feature.Normalizer

val normalizer = new Normalizer().setInputCol("features").setOutputCol("normFeatures")

val targetNormalizer = new Normalizer().setInputCol("features").setOutputCol("normFeatures")
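By default, Spark's Normalizer rescales each feature vector to unit L2 norm. A minimal Python sketch of what that does to a single vector:

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit L2 norm, as Spark's Normalizer does with p=2."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

features = [3.0, 4.0]  # e.g. (fatigue_index, intensity_index)
norm_features = l2_normalize(features)
```

Normalizing keeps the direction of each sample's feature vector while putting all samples on a common scale.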

Train and evaluate Model


val Array(trainingData, testData) = normalizer.transform(assembler1.transform(df)).randomSplit(Array(0.7, 0.3))

import org.apache.spark.ml.regression.LinearRegression

val lr = new LinearRegression().setLabelCol("bmq_index").setFeaturesCol("normFeatures")
val lrModel = lr.fit(trainingData)
lrModel.transform(testData).select("features", "normFeatures", "bmq_index", "prediction").show()

println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

val trainingSummary = lrModel.summary
println(s"numIterations: ${trainingSummary.totalIterations}")
println(s"objectiveHistory: [${trainingSummary.objectiveHistory.mkString(",")}]")
println(s"RMSE: ${trainingSummary.rootMeanSquaredError}")
println(s"r2: ${trainingSummary.r2}")

val targetTable = lrModel.transform(targetNormalizer.transform(assembler2.transform(target_df))).select("date", "estimated_intensity_index", "estimated_fatigue_index", "prediction")
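For context, the RMSE and r² printed above are easy to compute by hand. A small Python sketch with made-up label/prediction pairs (not real model output):

```python
def rmse(actual, predicted):
    """Root mean squared error: typical magnitude of the residuals."""
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)) ** 0.5

def r2(actual, predicted):
    """Coefficient of determination: share of variance explained by the model."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical bmq_index labels vs. model predictions
actual = [0.8, 0.6, 0.9, 0.7]
predicted = [0.7, 0.6, 0.8, 0.8]
```

An r² near 1 means the regression explains most of the variance in the BMQ index; an RMSE near 0 means the predictions are close to the labels.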

Section 4: Save the results in a table

Finally, we will take the results of the prediction and put them in a table called BMQ_PREDICTIONS for later use:

Rename the dataframe columns to match the target table's column names

val newNames = Seq("date", "estimated_intensity_index", "estimated_fatigue_index", "predicted_bmq")
val targetTableRenamed = targetTable.toDF(newNames: _*)

Delete target table if exists


drop table if exists BMQ_PREDICTIONS

Write data


val prop = new java.util.Properties
prop.setProperty("driver", "com.mysql.jdbc.Driver")
prop.setProperty("user", "bmq_user")
prop.setProperty("password", "Be@stM0de") 

targetTableRenamed.write.mode("append").jdbc("jdbc:mysql://localhost:3306/beast_mode_db", "BMQ_PREDICTIONS", prop)

View data


select * from BMQ_PREDICTIONS


Last update: 08-17-2019 04:52 AM