Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Root Cause Analysis using Machine Learning in Spark ML

1 ACCEPTED SOLUTION

Cloudera Employee

Do you have a target variable that you can predict? Or do you have logic that will allow you to convert a "low" CPU value into a target variable?
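As a rough illustration of that second option, here is a minimal plain-Python sketch that turns raw CPU readings into a binary target. The threshold of 20.0 and the sample readings are made up for the example; the right cutoff depends on your own data. In Spark, the same rule could be written as a column expression such as `when(col("cpu") < threshold, 1).otherwise(0)` from `pyspark.sql.functions`, where the column name `cpu` is an assumption.

```python
# Hypothetical cutoff: readings below this percentage count as "low" CPU.
LOW_CPU_THRESHOLD = 20.0


def to_label(cpu_percent, threshold=LOW_CPU_THRESHOLD):
    """Return 1 if the reading is 'low' (the event to predict), else 0."""
    return 1 if cpu_percent < threshold else 0


# Made-up sample readings, for illustration only.
readings = [85.0, 12.5, 40.0, 3.0]
labels = [to_label(r) for r in readings]
print(labels)
```

Once every row has a 0/1 label like this, the problem becomes ordinary binary classification.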

Spark has a wide variety of models available for classification: https://spark.apache.org/docs/latest/mllib-classification-regression.html

If you are interested in seeing which factor is contributing to a specific instance, I would recommend starting with a logistic regression model. Its coefficients are directly interpretable, so it gives more insight into which factor is contributing to a particular CPU failure.
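To make the idea concrete, here is a small plain-Python sketch (not Spark code) of how a logistic regression can attribute one prediction to individual factors: each feature's contribution to the log-odds is its coefficient times its value for that instance. The feature names `memory_pressure` and `io_wait`, the data, and the training settings are all invented for illustration, and the features are assumed to be on a comparable 0-1 scale so the contributions can be compared. In Spark ML, the corresponding numbers would come from the fitted model's `coefficients` and `intercept`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def contributions(weights, x, names):
    """Per-feature contribution (coefficient * value) to the log-odds."""
    return {name: w * xi for name, w, xi in zip(names, weights, x)}


# Hypothetical features, already scaled to the 0-1 range.
names = ["memory_pressure", "io_wait"]
rows = [[0.9, 0.2], [0.8, 0.7], [0.7, 0.4],
        [0.2, 0.8], [0.1, 0.3], [0.3, 0.6]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = CPU failure (made-up data)

weights, bias = fit_logistic(rows, labels)

# Explain a single instance: which factor pushes it toward "failure"?
instance = [0.9, 0.2]
contrib = contributions(weights, instance, names)
top_factor = max(contrib, key=contrib.get)
print(contrib, top_factor)
```

The factor with the largest positive contribution is the one pushing that particular instance hardest toward the failure class. This reading is only meaningful when the features are scaled consistently, which is why the sketch assumes 0-1 inputs.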
