An article on the challenges and solutions to predicting machine failures in the field.
The full details can be found here: https://github.com/kirkhas/zeppelin-notebooks/tree/master/Preventive_maintenance
Step #1 Feature Selection
Step #2 Geolocation
Step #3 - Time-series preparation with Scythe, a time-series library authored by Kirk Haslbeck for these purposes
- The data needed to be resampled into trips or route segments (Scythe Resample).
- Miles since last service needed to be step-interpolated into discrete levels such as 4K and 5K, so the feature behaves less like a continuous regression input.
- Indexing and one-hot encoding to the rescue: they exposed a relationship in which one particular "Make" was more problematic than most.
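The three preparation steps above can be sketched in plain Python. This is only an illustration of the ideas, not Scythe's API: the function names, the row layout, the 30-minute gap used to split trips, and the 1,000-mile step size are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def split_into_trips(readings, max_gap=timedelta(minutes=30)):
    """Group readings into trips (hypothetical rule: a new trip starts
    whenever consecutive readings are more than max_gap apart)."""
    trips, current = [], []
    for row in sorted(readings, key=lambda r: r["ts"]):
        if current and row["ts"] - current[-1]["ts"] > max_gap:
            trips.append(current)
            current = []
        current.append(row)
    if current:
        trips.append(current)
    return trips


def step_bucket(miles_since_service, step=1000):
    """Floor a continuous mileage value to a discrete step (4K, 5K, ...).
    The step size is an assumption, not a value from the article."""
    return (int(miles_since_service) // step) * step


def one_hot(values):
    """Index the distinct categories, then encode each value as a 0/1 vector."""
    index = {v: i for i, v in enumerate(sorted(set(values)))}
    vectors = []
    for v in values:
        vec = [0] * len(index)
        vec[index[v]] = 1
        vectors.append(vec)
    return index, vectors


# Toy rows, invented purely for illustration.
start = datetime(2020, 1, 1, 8, 0)
rows = [
    {"ts": start, "miles_since_service": 4350, "make": "A"},
    {"ts": start + timedelta(minutes=5), "miles_since_service": 4352, "make": "A"},
    {"ts": start + timedelta(hours=2), "miles_since_service": 5999, "make": "B"},
]

trips = split_into_trips(rows)
buckets = [step_bucket(r["miles_since_service"]) for r in rows]
make_index, make_vectors = one_hot([r["make"] for r in rows])
```

With the categorical "Make" column encoded as separate 0/1 columns, a model can assign a weight to each make individually, which is one way a single problematic make could stand out.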
ROC Curve - A near-perfect model
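For readers unfamiliar with the metric, here is a minimal sketch of how ROC points and the area under the curve (AUC) can be computed from labels and model scores. The labels and scores are toy values invented for the example, not results from the article, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per
    distinct score threshold, from strictest to loosest."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 1 = failure, 0 = no failure.
labels = [0, 0, 1, 1]
separable_scores = [0.1, 0.2, 0.8, 0.9]    # every failure outranks every non-failure
overlapping_scores = [0.1, 0.4, 0.35, 0.8]  # one failure ranks below a non-failure

perfect_auc = roc_auc(labels, separable_scores)
imperfect_auc = roc_auc(labels, overlapping_scores)
perfect_curve = roc_points(labels, separable_scores)
```

An AUC of 1.0 means the scores separate failures from non-failures completely, and the curve runs up the left edge and across the top; a near-perfect model sits just below that.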