Writing MapReduce code with a large file and a small file

Contributor

I have a large 5 GB file with detailed employee information and a small 2 MB file that contains only employee names. I want to extract the employee names from the smaller file and use them to analyze the larger file. How can I do this in MapReduce?

1 ACCEPTED SOLUTION

Expert Contributor

---> The large file can be the input to your MapReduce program.

---> The small file can be passed through the distributed cache and loaded into a List.

---> Inside your mapper function, you can compare each input record against the List, or perform any other operation you want.
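The steps above can be sketched in plain Java without a cluster. This is only an illustration under stated assumptions, not a complete MR job: the class and method names are hypothetical, the sample data is made up, and the employee name is assumed to be the first comma-separated field of each record. A HashSet is used here instead of a List so that each lookup is constant time. In a real job, the small file would typically be registered in the driver with `Job.addCacheFile(...)`, the logic of `loadNames` would go in the mapper's `setup()` method, and the logic of `matches` would go in `map()`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name; plain-Java stand-in for the mapper logic.
class MapSideJoinSketch {

    // Plays the role of Mapper.setup(): read the small file once into memory.
    static Set<String> loadNames(BufferedReader reader) throws IOException {
        Set<String> names = new HashSet<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String name = line.trim();
            if (!name.isEmpty()) {
                names.add(name); // skip blank lines, keep one entry per name
            }
        }
        return names;
    }

    // Plays the role of Mapper.map(): called once per record of the large file.
    // Assumption: the employee name is the first comma-separated field.
    static boolean matches(String record, Set<String> names) {
        String[] fields = record.split(",", -1);
        return names.contains(fields[0].trim());
    }

    // Keeps only the records whose employee name is in the small file.
    static List<String> filter(List<String> records, Set<String> names) {
        List<String> kept = new ArrayList<>();
        for (String record : records) {
            if (matches(record, names)) {
                kept.add(record);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data standing in for the 2 MB and 5 GB files.
        String smallFile = "alice\nbob\n";
        List<String> largeFile = List.of(
                "alice,HR,5000",
                "carol,IT,6000",
                "bob,Ops,5500");

        Set<String> names = loadNames(new BufferedReader(new StringReader(smallFile)));
        for (String record : filter(largeFile, names)) {
            System.out.println(record);
        }
    }
}
```

Because the 2 MB file fits comfortably in memory on every mapper, this kind of map-side lookup avoids a reduce-side join and the shuffle that comes with it.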

Let me know if you need help with the MR job.

Please post sample data for both files and describe the operation you want to perform.


2 REPLIES

Expert Contributor

@Rakesh AN If the above information helped you, could you please accept the answer?