Which predictive model can we use on terabytes of data stored in Hive?

Hi

I have a data set of 100 million records with the following columns:

timestamp,hostname,country,cpu/memory,metric value

2017-12-01 06:35:57.0wkliunhjjlcpu

metric value -1

I need to predict which hostname is using the most CPU.

Which prediction model, tool, or technique can I use? Can anyone suggest one? Thanks.

 
Re: Which predictive model can we use on terabytes of data stored in Hive?

The model you choose will depend on the number of labels (hostnames) you are trying to predict. Assuming your feature set is the timestamp, the cpu/memory metric, the country, and so on, you can start with something simple like KNN, trained using Spark.
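
To make the KNN idea concrete, here is a minimal sketch in plain Python rather than Spark, so it only illustrates the technique on a handful of rows. The feature choice (hour of day, CPU metric), the host names, and the `knn_predict` helper are all assumptions made up for illustration, not part of the original data set. As far as I know, Spark's built-in MLlib does not include a KNN classifier, so at Hive scale this would likely need a third-party package or an approximate nearest-neighbor approach.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows nearest to query."""
    # Rank training rows by Euclidean distance to the query feature vector.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Majority vote over the labels of the k nearest rows.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy rows: ((hour_of_day, cpu_metric), hostname)
train = [
    ((6.0, 90.0), "host-a"),
    ((7.0, 85.0), "host-a"),
    ((6.5, 88.0), "host-a"),
    ((6.0, 20.0), "host-b"),
    ((7.0, 25.0), "host-b"),
    ((6.5, 22.0), "host-b"),
]

prediction = knn_predict(train, (6.2, 87.0), k=3)
print(prediction)
```

In practice the features should be scaled to comparable ranges first, since KNN distances are otherwise dominated by whichever column has the largest values.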