An article on the challenges and solutions to predicting machine failures in the field.

The full details can be found here: https://github.com/kirkhas/zeppelin-notebooks/tree/master/Preventive_maintenance

Step #1 Feature Selection

64940-corrimg.png
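
The image name suggests a correlation plot, so one plausible reading of this step is ranking candidate features by how strongly they correlate with the failure label. The following is a minimal sketch under that assumption, not the notebook's actual code. The column names (`miles_since_service`, `engine_temp`, `constant_flag`), the toy values, and the helper names `pearson` and `rank_features` are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, label):
    """Sort feature names by absolute correlation with the label, strongest first."""
    scores = {name: pearson(values, label) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical toy data: one row per machine, label 1 = failed.
columns = {
    "miles_since_service": [1000, 2000, 3000, 4000, 5000, 6000],
    "engine_temp": [90, 91, 89, 90, 92, 90],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
failed = [0, 0, 0, 1, 1, 1]

ranking = rank_features(columns, failed)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Correlation only captures linear relationships, so a low score here is a hint to look closer rather than a reason to drop a feature outright.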

Step #2 Geolocation

64941-map.png

Step #3 Time Series with Scythe

Scythe is a time-series library authored by Kirk Haslbeck for these purposes.

- Needed to resample the data into trips or route segments (Scythe Resample)

- Needed to step-interpolate the miles since last service into discrete steps (4K, 5K) rather than a continuous regression value

64942-time-series.png
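
The two preparation needs listed in this step can be illustrated in plain Python. This is a rough sketch of the idea only and does not use or reproduce Scythe's API. It assumes that a trip boundary is a gap between readings longer than a threshold, and that "step" means flooring the mileage to the nearest lower 1,000. The function names, the 30-minute gap, and the sample readings are hypothetical.

```python
def split_into_trips(readings, max_gap_seconds=1800):
    """Group time-ordered (timestamp, miles) readings into trips.

    A new trip starts whenever the gap to the previous reading
    exceeds max_gap_seconds (an assumed definition of a trip).
    """
    trips = []
    for ts, miles in readings:
        if trips and ts - trips[-1][-1][0] <= max_gap_seconds:
            trips[-1].append((ts, miles))
        else:
            trips.append([(ts, miles)])
    return trips


def step_miles(miles, step=1000):
    """Floor a mileage value to the nearest lower multiple of step."""
    return int(miles // step) * step


def summarize_trip(trip, step=1000):
    """Collapse one trip into a single row with a stepped mileage value."""
    return {
        "start": trip[0][0],
        "end": trip[-1][0],
        "miles_step": step_miles(trip[-1][1], step),
    }


# Hypothetical readings: (seconds since start, miles since last service).
readings = [
    (0, 4100.0), (600, 4120.5), (1200, 4150.0),  # first trip
    (10000, 4990.0), (10600, 5010.0),            # second trip, after a long gap
]

trips = split_into_trips(readings)
summaries = [summarize_trip(t) for t in trips]
for row in summaries:
    print(row)
```

Stepping the mileage turns a smoothly increasing value into a small number of levels, which lets a model treat "around 4K" and "around 5K" as distinct conditions.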

Step #4 Indexing and OneHotEncoding

- Indexing and OneHotEncoding to the rescue: found that one particular "Make" was more problematic than most.

64943-categorical.png
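
To make the indexing and one-hot encoding idea concrete, here is a small sketch in plain Python. It is an illustration under assumptions, not the notebook's pipeline: the make names, labels, and helper names are hypothetical, and categories are indexed by descending frequency with ties broken alphabetically.

```python
from collections import Counter


def fit_index(values):
    """Map each category to an integer, most frequent first (ties alphabetical)."""
    counts = Counter(values)
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {v: i for i, v in enumerate(ordered)}


def one_hot(value, index):
    """Encode a category as a 0/1 vector with a single 1 at its index."""
    vec = [0] * len(index)
    vec[index[value]] = 1
    return vec


def failure_rate_by_category(values, labels):
    """Fraction of failures (label 1) observed for each category."""
    totals, failures = Counter(), Counter()
    for v, y in zip(values, labels):
        totals[v] += 1
        failures[v] += y
    return {v: failures[v] / totals[v] for v in totals}


# Hypothetical data: vehicle make per row and whether that row failed.
makes = ["MakeA", "MakeB", "MakeA", "MakeC", "MakeA", "MakeB"]
failed = [0, 1, 0, 0, 0, 1]

index = fit_index(makes)
encoded = [one_hot(m, index) for m in makes]
rates = failure_rate_by_category(makes, failed)

print(index)
print(encoded[1])
print(rates)
```

Once each make has its own column, a model can assign it its own weight, which is how a single problematic make can stand out.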

ROC Curve - A near-perfect model

64944-roc.png
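
For readers who want to see how an ROC curve and its area are obtained from model scores, the sketch below computes both by hand. The labels and scores are made up for illustration and are not the article's results. The helpers `roc_points` and `auc` are hypothetical names, and the code assumes at least one positive and one negative label.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points of the ROC curve.

    Rows are visited from highest to lowest score. Rows with equal
    scores are processed together so ties yield a single point.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: 1 = failure, score = predicted failure probability.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(round(area, 4))
```

An area of 1.0 means every failure is scored above every non-failure, so a near-perfect curve hugs the top-left corner of the plot.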

Comments
Cloudera Employee

Nice work!

Cloudera Employee

Awesome work Kirk!

Super Guru

This is awesome.

Last update: 09-16-2022 01:42 AM