Effectively Implement a Cross Product Join in Hive

I have a views table with around 100,000 (one lakh) records and a purchases table whose size varies between 10,000 and 15,000 records.

My requirement is a consolidated table in which every entry of the views table is attached to each entry in the purchases table.

In other words, the new table should contain the primary key of the purchases table plus all entries of the views table.
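
To make the size of this requirement concrete, here is a rough sketch in Python (not Hive) of what the consolidated table contains and how many rows it would hold. The tiny sample rows and the names `views`, `purchases`, and `cross_join_rows` are hypothetical stand-ins; only the row counts come from the figures above.

```python
from itertools import product

# Hypothetical miniature stand-ins for the two tables (names and columns are assumptions).
views = [("v1", "page_a"), ("v2", "page_b"), ("v3", "page_c")]
purchases = [("p1",), ("p2",)]

# Cross product: the purchases key is paired with every row of views.
consolidated = [(p[0],) + v for p, v in product(purchases, views)]


def cross_join_rows(n_views, n_purchases):
    """Row count of a full cross product of the two tables."""
    return n_views * n_purchases


# Row counts stated in the question: ~100,000 views, 10,000 to 15,000 purchases.
low = cross_join_rows(100_000, 10_000)
high = cross_join_rows(100_000, 15_000)

print(len(consolidated))  # 2 purchases x 3 views
print(low, high)
```

At the stated sizes the result is between 1.0 and 1.5 billion rows, so the output volume, rather than either input table, is likely what dominates the run time.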

A cross product is the worst thing to have in Hive. But since it is a priority for me, are there any suggestions or tunings to speed up the process that I can use as a checklist?

Note: When I executed it using a map join, it was stuck forever at the last 2 mappers.