Did You Know ... About Secondary Sorts?

Cloudera Employee

Secondary sorts are a way to group data together in a reduce and control the order in which the values arrive.  If you find yourself buffering data in your reducer, like in this example, you should be using a secondary sort instead.  Buffering data when you're dealing with Big Data is a recipe for an OutOfMemoryError.  Here's a full example showing a secondary sort on playing cards.
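
To make the idea concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It is not the linked example. It only simulates the mechanics, under the assumption that the suit is the grouping key and the rank is the secondary sort field. The names `SecondarySortSketch`, `CardKey`, `SORT`, `GROUP`, and `reduceSorted` are made up for illustration. In a real job, the two comparators would roughly correspond to what you register through `Job.setSortComparatorClass` and `Job.setGroupingComparatorClass`, usually together with a partitioner on the natural key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SecondarySortSketch {

    /** Hypothetical composite key: suit is the natural key, rank is the secondary field. */
    static final class CardKey {
        final String suit;
        final int rank;

        CardKey(String suit, int rank) {
            this.suit = suit;
            this.rank = rank;
        }
    }

    // Sort comparator: order by suit first, then by rank within a suit.
    static final Comparator<CardKey> SORT =
            Comparator.comparing((CardKey k) -> k.suit).thenComparingInt(k -> k.rank);

    // Grouping comparator: only the suit decides which keys share one reduce call.
    static final Comparator<CardKey> GROUP = Comparator.comparing((CardKey k) -> k.suit);

    /**
     * Simulates the sort plus a streaming reduce. Because ranks arrive in
     * ascending order within each suit, the first rank seen is the lowest and
     * the last is the highest, so the loop keeps two ints per group instead of
     * buffering the group's cards.
     */
    static List<String> reduceSorted(List<CardKey> mapOutput) {
        List<CardKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT); // stands in for the framework's shuffle sort

        List<String> out = new ArrayList<>();
        CardKey prev = null;
        int low = 0;
        int high = 0;
        for (CardKey key : sorted) {
            boolean newGroup = prev == null || GROUP.compare(prev, key) != 0;
            if (newGroup) {
                if (prev != null) {
                    out.add(prev.suit + ": low=" + low + " high=" + high);
                }
                low = key.rank; // first value of the group is the smallest
            }
            high = key.rank; // most recent value is the largest so far
            prev = key;
        }
        if (prev != null) {
            out.add(prev.suit + ": low=" + low + " high=" + high);
        }
        return out;
    }

    public static void main(String[] args) {
        List<CardKey> cards = Arrays.asList(
                new CardKey("HEARTS", 10),
                new CardKey("SPADES", 3),
                new CardKey("HEARTS", 2),
                new CardKey("SPADES", 12),
                new CardKey("HEARTS", 13));
        for (String line : reduceSorted(cards)) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is the loop in `reduceSorted`: since the sort has already ordered the ranks inside each suit, the "reducer" never needs a list of values in memory. Without the secondary sort, you would have to collect every card for a suit before you could say anything about their order.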