
MapReduceIndexerTool fails with java.lang.OutOfMemoryError in QuickstartVM CDH 5.5.0


Explorer

Hi,

Using the Quickstart VM with CDH 5.5.1, we tried to index a couple of PDF and DOCX files. Creating a collection went smoothly, so we decided to use MapReduceIndexerTool to parse and index the documents.

 

The mapper syslog gets as far as "Starting flush of map output" and then stops:

...
2016-03-01 04:07:28,821 WARN [main] org.apache.solr.core.SolrResourceLoader: Solr loaded a deprecated plugin/analysis class [solr.ThaiWordFilterFactory]. Please consult documentation how to replace it accordingly.
2016-03-01 04:07:28,934 INFO [main] org.apache.solr.schema.IndexSchema: unique key field: id
2016-03-01 04:07:29,157 INFO [main] org.apache.solr.schema.FileExchangeRateProvider: Reloading exchange rates from file currency.xml
2016-03-01 04:07:29,194 INFO [main] org.apache.solr.schema.FileExchangeRateProvider: Reloading exchange rates from file currency.xml
2016-03-01 04:07:30,176 INFO [main] org.kitesdk.morphline.api.MorphlineContext: Importing commands
2016-03-01 04:07:46,150 INFO [main] org.kitesdk.morphline.api.MorphlineContext: Done importing commands
2016-03-01 04:07:47,576 INFO [main] org.apache.solr.hadoop.morphline.MorphlineMapRunner: Processing file hdfs://quickstart.cloudera/user/cloudera/tamap.data/filename.docx
2016-03-01 04:07:52,287 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output

stderr contains:

Halting due to Out Of Memory Error...

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"

 

Tried many different memory settings in both YARN and Solr, with no success. Any help would be greatly appreciated.

 

The output of the "hadoop jar" command shows:

16/03/01 04:29:11 INFO mapreduce.Job: Task Id : attempt_1456833935429_0003_m_000000_2, Status : FAILED
Exception from container-launch.
Container id: container_1456833935429_0003_01_000004
Exit code: 1
Stack trace: ExitCodeException exitCode=1: 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
	at org.apache.hadoop.util.Shell.run(Shell.java:460)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1

 

Thanks,

Slavo


Re: MapReduceIndexerTool fails with java.lang.OutOfMemoryError in QuickstartVM CDH 5.5.0

Expert Contributor

Try using something like this CLI option: 

 

-D mapreduce.map.java.opts="-Xmx2000m" 
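Note that the heap set via `-Xmx` also has to fit inside the YARN container size (`mapreduce.map.memory.mb`), otherwise the NodeManager can kill the container even with a larger heap. A minimal sketch of keeping the two in step (the 3072 MB container size and the 80% headroom ratio are illustrative assumptions, not recommended values):

```shell
# Hypothetical helper: derive the map-task -Xmx as ~80% of the YARN
# container size, so the JVM heap plus overhead fits inside
# mapreduce.map.memory.mb.
container_mb=3072                       # example value for mapreduce.map.memory.mb
heap_mb=$(( container_mb * 80 / 100 ))  # integer arithmetic: 3072 -> 2457
echo "-D mapreduce.map.memory.mb=${container_mb} -D mapreduce.map.java.opts=-Xmx${heap_mb}m"
```

Both `-D` flags would then be passed together on the MapReduceIndexerTool command line.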

Re: MapReduceIndexerTool fails with java.lang.OutOfMemoryError in QuickstartVM CDH 5.5.0

Explorer

Thanks, but I already tried those options with different settings. Here is the command:

hadoop jar /usr/lib/solr/contrib/mr/search-mr-1.0.0-cdh5.5.0-job.jar \
 org.apache.solr.hadoop.MapReduceIndexerTool \
 -D 'mapreduce.map.java.opts=-Xmx2048m' \
 -D 'mapreduce.reduce.java.opts=-Xmx2048m' \
 --mappers=1 \
 --reducers=1 \
 --morphline-file morphline.conf \
 --output-dir hdfs://quickstart.cloudera/user/cloudera/my.output/ \
 --zk-host quickstart.cloudera:2181/solr  \
 --collection mycollection \
 --go-live \
 --verbose \
 hdfs://quickstart.cloudera/user/cloudera/my.data/

Still the same error. My laptop has 16 GB of RAM, with 12 GB allocated to the VM.
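One way to narrow this down is to pull the aggregated container logs for the failed application and look at the full OutOfMemoryError stack trace, which usually names the exhausted memory area (heap, PermGen, or native threads). A sketch, with the application id derived from the attempt id in the output above:

```shell
# Sketch: fetch the full container logs for the failed job; the application
# id is the attempt id above minus the "attempt_" prefix and task suffix.
yarn logs -applicationId application_1456833935429_0003 \
  | grep -B 2 -A 10 OutOfMemoryError
```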