Re: MapReduceIndexerTool problem

I use the same command and have no issues.

According to logs:

Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1523546159827_0013_r_000000_0/map_0.out

So I would guess that your CSV is too big, and when the reducer tries to load it, there is not enough space in the local dirs of the YARN NodeManager.

Can you try setting more reducers by using:

--reducers 4

or more (based on your partitions and the CSV size). You can also set more mappers, but based on the log, it is the reducer that is struggling.
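For illustration, here is a rough sketch of where the flag fits in a full invocation. The jar path, morphline file, HDFS paths, ZooKeeper host, and collection name are all placeholders I made up, not values from your setup, so replace them with your own. The script only prints the command so you can review it before submitting.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; substitute your own.
JAR="/opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-job.jar"  # assumed jar location
MORPHLINE="morphline.conf"                      # assumed morphline file
OUTPUT_DIR="hdfs://namenode:8020/tmp/outdir"    # assumed HDFS output dir
ZK_HOST="zk01.example.com:2181/solr"            # assumed ZooKeeper ensemble
COLLECTION="collection1"                        # assumed Solr collection
INPUT="hdfs://namenode:8020/tmp/input.csv"      # assumed input file
REDUCERS=4                                      # raise this if the reducer still runs out of local space

# Build the command as a string so it can be inspected before running.
CMD="hadoop jar $JAR org.apache.solr.hadoop.MapReduceIndexerTool \
--morphline-file $MORPHLINE \
--output-dir $OUTPUT_DIR \
--zk-host $ZK_HOST \
--collection $COLLECTION \
--go-live \
--reducers $REDUCERS \
$INPUT"

# Print only; run the printed command on a gateway host once it looks right.
printf '%s\n' "$CMD"
```

Keep the rest of your existing command as it is and only add or change the --reducers value.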

More details:

https://www.cloudera.com/documentation/enterprise/5-13-x/topics/search_mapreduceindexertool.html#con...
