Created 08-29-2016 12:03 PM
Hi experts,
I have the following dataset (just an example):
Customer_ID | Product_Desc |
1 | Jeans |
1 | T-Shirt |
1 | Food |
2 | Jeans |
2 | Food |
2 | Nightdress |
2 | T-Shirt |
2 | Hat |
3 | Jeans |
3 | Food |
4 | Food |
4 | Water |
5 | Water |
5 | Food |
5 | Beer |
Is there an algorithm that lets me predict consumer behavior like this: "When a customer buys Jeans, they also buy Food"?
The algorithms I've found only calculate the most common products, not the association between them :( Does anyone know a good tutorial that shows how to predict the kind of association I described above?
The first step is to derive these relationships (one basket per customer):
Jeans-T-Shirt-Food |
Jeans-Food-Nightdress-T-Shirt-Hat |
Jeans-Food |
Food-Water |
Water-Food-Beer |
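The grouping step above could be sketched as follows. This is only a minimal illustration in plain Python, assuming the rows are available as (customer_id, product) pairs; the names `rows` and `build_baskets` are made up for this example and do not come from any library.

```python
# Example rows copied from the dataset above: (Customer_ID, Product_Desc).
rows = [
    (1, "Jeans"), (1, "T-Shirt"), (1, "Food"),
    (2, "Jeans"), (2, "Food"), (2, "Nightdress"), (2, "T-Shirt"), (2, "Hat"),
    (3, "Jeans"), (3, "Food"),
    (4, "Food"), (4, "Water"),
    (5, "Water"), (5, "Food"), (5, "Beer"),
]


def build_baskets(rows):
    """Group products by customer, keeping the order in which they appear."""
    baskets = {}
    for customer_id, product in rows:
        baskets.setdefault(customer_id, []).append(product)
    return baskets


baskets = build_baskets(rows)
for customer_id, products in baskets.items():
    # One line per customer, products joined with "-" as in the list above.
    print("-".join(products))
```

On a real dataset the same grouping would more likely be done with a group-by in whatever engine holds the data; the dictionary here just keeps the example self-contained.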
Does anyone have an idea?
Many thanks!!!
Created 08-29-2016 01:18 PM
It sounds like you're looking for collaborative filtering, which does exist in spark.mllib: http://spark.apache.org/docs/1.6.2/mllib-collaborative-filtering.html
Amazon.com published a paper in 2003: "Amazon.com Recommendations: Item-to-Item Collaborative Filtering" which describes the algorithm in more detail. Quoting from the paper:
"the algorithm finds items similar to each of the user’s purchases and ratings, aggregates those items, and then recommends the most popular or correlated items".
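To make the quoted idea concrete, here is a rough sketch in plain Python. It is not the Spark MLlib API (which implements ALS-based collaborative filtering) and not Amazon's actual implementation. It assumes each customer's basket is treated as a binary "bought / not bought" vector and scores item pairs by cosine similarity; `item_similarities` and `recommend` are hypothetical helper names.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Baskets taken from the question: customer -> products bought.
baskets = {
    1: ["Jeans", "T-Shirt", "Food"],
    2: ["Jeans", "Food", "Nightdress", "T-Shirt", "Hat"],
    3: ["Jeans", "Food"],
    4: ["Food", "Water"],
    5: ["Water", "Food", "Beer"],
}


def item_similarities(baskets):
    """Cosine similarity between items, using binary purchase vectors."""
    item_count = defaultdict(int)   # number of baskets containing each item
    pair_count = defaultdict(int)   # number of baskets containing both items
    for products in baskets.values():
        unique = sorted(set(products))
        for item in unique:
            item_count[item] += 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), both in pair_count.items():
        # For 0/1 vectors, cosine = co-occurrences / sqrt(count_a * count_b).
        score = both / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(purchased, sims, top_n=3):
    """Aggregate similarity scores of items related to what was purchased."""
    scores = defaultdict(float)
    for item in purchased:
        for other, score in sims.get(item, {}).items():
            if other not in purchased:
                scores[other] += score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


sims = item_similarities(baskets)
print(recommend({"Jeans"}, sims))
```

Dividing by the item counts is what separates this from simply counting the most common products: Food appears in every basket, so its raw co-occurrence with Jeans is high, but the normalization keeps very popular items from dominating every recommendation.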
Created 08-29-2016 01:28 PM
Alex Woolford, many thanks for your help :) In that case, how can I plan this as a machine learning project? The algorithms I've seen so far only count occurrences. Thanks!