Support Questions



I use the same command and have no issues.

According to the logs:

Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1523546159827_0013_r_000000_0/map_0.out

So I would guess that your CSV is too big, and when the reducer tries to load it there is not enough space in the local dirs of the YARN NodeManager.
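If you want to check that guess before changing the job, something like the sketch below could be run on each NodeManager host. It only reports the free space of the directories you pass in; the function name `free_space_report` is made up for illustration, and it assumes you copy the comma-separated value of `yarn.nodemanager.local-dirs` from your own yarn-site.xml (the `/tmp` in the example call is just a placeholder, not a path from this thread).

```shell
#!/bin/sh
# Hypothetical helper: print available space for each directory in a
# comma-separated list, e.g. the value of yarn.nodemanager.local-dirs.
free_space_report() {
    # Split the argument on commas into one directory per line.
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r dir; do
        # Skip empty entries (e.g. from a trailing comma).
        [ -n "$dir" ] || continue
        if [ -d "$dir" ]; then
            # df -P prints one POSIX-format line per filesystem;
            # column 4 is the available space in 1024-byte blocks.
            avail=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
            printf '%s\t%s KiB available\n' "$dir" "$avail"
        else
            printf '%s\tmissing\n' "$dir"
        fi
    done
}

# Example call with a placeholder path; replace it with your own local dirs.
free_space_report "/tmp"
```

If the available space is small compared to the size of the CSV, that would fit the DiskErrorException above.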

 

Can you try setting more reducers by using:

--reducers 4

or more (based on your partitions and the CSV size). You can also set more mappers, but based on the log it is the reducer that is struggling.

More details:

https://www.cloudera.com/documentation/enterprise/5-13-x/topics/search_mapreduceindexertool.html#con...
