
Implementing Join with Hadoop Map-Reduce






We have multiple delimited files from different source systems that we need to merge into one Impala table based on attributes in each file.


We are constantly running into memory errors while trying to do the merge via HiveQL by creating dataframes of each file.


Each file contains millions of rows.


Is MapReduce a viable solution for this situation?


Are there any examples of how to handle these kinds of situations?
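To make the question concrete, here is a rough sketch of the kind of join being asked about, usually called a reduce-side join. It is plain Python that only simulates the map, shuffle, and reduce phases in memory; it is not Hadoop code. The file contents, the delimiters, the source tags "A" and "B", and the assumption that the join key is the first column of every file are all made up for illustration.

```python
from collections import defaultdict
from itertools import product


def map_records(source_tag, lines, delimiter):
    """Map phase: emit (join_key, (source_tag, fields)) for each delimited line.

    Assumption: the join key is the first column of every file.
    """
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        yield fields[0], (source_tag, fields)


def shuffle(pairs):
    """Shuffle phase: group mapper output by join key (done by the framework in a real job)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, values):
    """Reduce phase: inner join of the records from source A and source B that share one key."""
    left = [fields for tag, fields in values if tag == "A"]
    right = [fields for tag, fields in values if tag == "B"]
    for a, b in product(left, right):
        # Keep the key once, then the remaining columns from each side.
        yield [key] + a[1:] + b[1:]


def reduce_side_join(file_a, file_b):
    """Run the three simulated phases over two hypothetical input files."""
    pairs = list(map_records("A", file_a, ",")) + list(map_records("B", file_b, "|"))
    rows = []
    for key, values in sorted(shuffle(pairs).items()):
        rows.extend(reduce_join(key, values))
    return rows


# Hypothetical sample data: one comma-delimited file and one pipe-delimited file.
customers = ["1,Alice", "2,Bob", "3,Carol"]
orders = ["1|book", "1|pen", "3|lamp"]

joined = reduce_side_join(customers, orders)
for row in joined:
    print(row)
```

In this sketch everything sits in one Python process, so it would not help with the memory errors by itself. The point of the pattern is that in an actual MapReduce job the grouping by key is done by the framework across the cluster, and each reduce call works on the records for a single key rather than on whole files.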


