"Diapers and Beer" project using Spark in Sandbox

Hi, is there a tutorial or white paper that explains how to implement the predictive model behind the "Diapers and Beer" story from the retail industry using Spark MLlib? I need to predict some values in the retail industry using Spark, and I don't have any reference or tutorial for this. The "Diapers and Beer" project is very similar to mine, so I would like to see how it was implemented (the code and the data model). Many thanks :)

1 ACCEPTED SOLUTION

Re: "Diapers and Beer" project using Spark in Sandbox

Hi @Johnny Fugers,

It sounds like you are trying to build a recommender system. I assume you want to learn that users who do X and Y also tend to do Z. In your example, customers who buy diapers and beer also tend to buy milk. Within HDP, you would use Apache Spark and the machine learning capabilities it offers. There is a clear example of using Collaborative Filtering for this type of problem on the Spark site. You should be able to run it directly on your HDP sandbox or cluster.
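To make the "users that do X and Y also tend to do Z" idea concrete, here is a minimal plain-Python sketch that measures how often a third item shows up in baskets that already contain two given items. It is only an illustration of the idea, not the Spark collaborative filtering example mentioned above; the basket data and the function names `support` and `rule_confidence` are made up for this sketch.

```python
# Hypothetical toy baskets; a real project would load transaction data instead.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer", "milk"},
    {"diapers", "beer", "chips"},
    {"diapers", "milk"},
    {"beer", "chips"},
]


def support(baskets, items):
    """Fraction of all baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def rule_confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`.

    Returns 0.0 when no basket contains the antecedent.
    """
    antecedent = set(antecedent)
    with_antecedent = [basket for basket in baskets if antecedent <= basket]
    if not with_antecedent:
        return 0.0
    with_both = [basket for basket in with_antecedent if consequent in basket]
    return len(with_both) / len(with_antecedent)


if __name__ == "__main__":
    # How often do customers who buy diapers and beer also buy milk?
    print(rule_confidence(baskets, {"diapers", "beer"}, "milk"))
    print(support(baskets, {"diapers", "beer"}))
```

On real data you would keep only rules whose support and confidence are both high enough to be meaningful, since a rule seen in two or three baskets says very little.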

Re: "Diapers and Beer" project using Spark in Sandbox

+1 on the recommender system. For a more concrete walkthrough, see the tutorial "Building a Movie Recommendation Service with Apache Spark", linked below.
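If it helps to see what a ratings-based recommender computes before working through a Spark tutorial, here is a minimal plain-Python sketch of item-based collaborative filtering with cosine similarity. This is a simpler neighbourhood technique than the ALS matrix factorization that Spark's collaborative filtering uses, and the ratings, user names, and function names are all hypothetical.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. Names and numbers are made up.
ratings = {
    "ann": {"A": 5.0, "B": 4.0},
    "bob": {"A": 4.0, "B": 5.0, "C": 2.0},
    "cat": {"A": 1.0, "C": 5.0},
}


def item_vector(ratings, item):
    """Collect the ratings given to `item`, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted average of the user's own ratings.

    Returns None when no rated item has any similarity to `item`.
    """
    target = item_vector(ratings, item)
    weighted_sum = 0.0
    weight_total = 0.0
    for other_item, rating in ratings[user].items():
        if other_item == item:
            continue
        sim = cosine(target, item_vector(ratings, other_item))
        weighted_sum += sim * rating
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else None


if __name__ == "__main__":
    # Estimate how "ann" might rate item "C", which she has not rated yet.
    print(predict(ratings, "ann", "C"))
```

For retail data without explicit ratings, purchase counts or a simple bought / not bought flag could play the role of the rating, but that is a modelling choice to validate on your own data.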

https://www.codementor.io/spark/tutorial/building-a-recommender-with-apache-spark-python-example-app...
