Contributor
Posts: 25
Registered: ‎02-11-2019

Implementing Join with Hadoop Map-Reduce

Hi,

 

We have multiple delimited files from different source systems that we need to merge into one Impala table based on attributes in each file.

 

We are constantly running into memory errors while trying to do the merge via HiveQL by creating a DataFrame for each file.

 

Each file contains millions of rows.

 

Is MapReduce a viable solution for this situation?
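
To make the question concrete, what I have in mind is a reduce-side join, roughly along the lines of the sketch below. It is plain Python that only simulates the map, shuffle, and reduce phases in memory, not actual Hadoop code. The `|` delimiter, the join key in the first column, the source tags "A" and "B", and all function names are assumptions for illustration, not our real layout.

```python
from collections import defaultdict
from itertools import product


def map_record(source, line, delimiter="|", key_index=0):
    """Map phase: emit (join_key, (source_tag, remaining_fields)).

    The delimiter and key position are assumed for illustration.
    """
    fields = line.rstrip("\n").split(delimiter)
    key = fields[key_index]
    rest = fields[:key_index] + fields[key_index + 1:]
    return key, (source, rest)


def shuffle(pairs):
    """Shuffle phase: group all mapped values by join key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, values, left="A", right="B"):
    """Reduce phase: inner-join the rows of both sources for one key."""
    left_rows = [fields for source, fields in values if source == left]
    right_rows = [fields for source, fields in values if source == right]
    return [[key] + l + r for l, r in product(left_rows, right_rows)]


def join_files(lines_a, lines_b):
    """Run the three phases over two in-memory lists of lines."""
    pairs = [map_record("A", line) for line in lines_a]
    pairs += [map_record("B", line) for line in lines_b]
    joined = []
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_join(key, values))
    return joined
```

In a real job the grouping would be done by the framework's shuffle, so each reducer would only need to hold the rows for one key at a time rather than whole files.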

 

Are there any examples of how to handle this kind of situation?

 

Thanks
