Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant


I use the same command and have no issues.

According to logs:

Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1523546159827_0013_r_000000_0/map_0.out

So I would guess that your CSV is too big, and when the reducer tries to load it there is not enough space in the local directories of the YARN NodeManager.
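
To check that guess, it may help to look at the free space in the NodeManager local directories on the affected host. The following is only a rough sketch: the function name `check_local_dirs` and the `LOCAL_DIRS` variable are made up for illustration, and the directory list is assumed to be copied by hand from the `yarn.nodemanager.local-dirs` property of that node.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): report the free space of every
# directory in a comma-separated list, such as the value of
# yarn.nodemanager.local-dirs copied from the NodeManager configuration.
check_local_dirs() {
  # Split the comma-separated list into one directory per line.
  echo "$1" | tr ',' '\n' | while IFS= read -r dir; do
    if [ -d "$dir" ]; then
      # df -Pk prints POSIX-format output in 1 KiB blocks;
      # on the second line, column 4 is the available space.
      avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
      echo "$dir: ${avail_kb} KiB available"
    else
      echo "$dir: missing"
    fi
  done
}

# LOCAL_DIRS is an assumption: set it to the real directory list of the node.
# /tmp is used here only so the sketch runs without any configuration.
check_local_dirs "${LOCAL_DIRS:-/tmp}"
```

If the available space is small compared to the size of the CSV, that would be consistent with the DiskErrorException above.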

 

Can you try setting more reducers with:

--reducers 4

or more (based on your partitions and the CSV size). You can also set more mappers, but based on the log, it is the reducer that is struggling.

More details:

https://www.cloudera.com/documentation/enterprise/5-13-x/topics/search_mapreduceindexertool.html#con...
