Estimate most important dimensions in each cluster after performing k-means

How can I estimate the most important features in each cluster after applying the k-means clustering algorithm? I need to cluster the customers of retail shops based on the products they purchased. As a result, I need both the customers belonging to each cluster and, for each cluster, the products that most strongly characterize it (e.g. in cluster A, among all products, the customers mostly purchase meat, bread, milk, etc.). I'm going to use Apache Spark MLlib.
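
To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive method: rank each cluster's products by how far the cluster centroid lies above the overall mean for that product. Everything in it is an assumption for illustration: the toy purchase counts, the product names, and the helper names (`kmeans`, `top_products`, and so on). It is plain Python so that it runs without a cluster; with Spark MLlib the centers would presumably be read from the fitted `KMeansModel` (its `clusterCenters`) rather than from the toy `kmeans` helper.

```python
from statistics import mean

# Hypothetical toy data: one row per customer, one column per product,
# values are purchase counts. Real rows would be built from the shop's baskets.
PRODUCTS = ["meat", "bread", "milk", "wine", "cheese"]
CUSTOMERS = [
    [3.0, 2.0, 1.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 0.0, 0.0],
    [2.0, 2.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 3.0, 2.0],
    [0.0, 1.0, 0.0, 4.0, 2.0],
    [0.0, 0.0, 0.0, 2.0, 2.0],
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def column_means(rows):
    """Mean of every column (product) over the given rows (customers)."""
    return [mean(col) for col in zip(*rows)]


def kmeans(rows, initial_centers, iterations=10):
    """Plain Lloyd iterations from fixed starting centers (deterministic)."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each customer goes to the nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(row, centers[k]))
            for row in rows
        ]
        # Update step: each center becomes the mean of its members.
        for k in range(len(centers)):
            members = [row for row, label in zip(rows, labels) if label == k]
            if members:
                centers[k] = column_means(members)
    return centers, labels


def top_products(centers, overall_mean, names, top_n=2):
    """For each center, list the products furthest above the overall mean."""
    result = []
    for center in centers:
        lift = [c - m for c, m in zip(center, overall_mean)]
        ranked = sorted(range(len(names)), key=lambda j: lift[j], reverse=True)
        result.append([names[j] for j in ranked[:top_n]])
    return result


centers, labels = kmeans(CUSTOMERS, initial_centers=[CUSTOMERS[0], CUSTOMERS[3]])
overall = column_means(CUSTOMERS)
ranking = top_products(centers, overall, PRODUCTS, top_n=2)

for k, names in enumerate(ranking):
    members = [i for i, label in enumerate(labels) if label == k]
    print(f"cluster {k}: customers {members}, top products {names}")
```

Comparing against the overall mean, rather than taking the largest centroid coordinates directly, keeps products that everybody buys from dominating every cluster. On real data with very different purchase volumes per product, dividing the difference by each product's standard deviation may be a better choice.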