Indexing data to Solr with a MapReduce and Morphline. OutofMemory.

New Contributor

Hello,

I'm trying to index some data in Solr with a morphline. The job runs one map task and two reduce tasks, and the reducers fail. When I run the same command with less data, it works. It looks like a memory problem, but I increased the memory of Solr (2 nodes) to 64 GB each and it still doesn't work. I'm using CDH 5.4.3. Should I increase Solr's memory further, or is the problem something else? Any clues?

hadoop jar /opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/jars/search-mr-1.0.0-cdh5.4.2-job.jar \
    org.apache.solr.hadoop.MapReduceIndexerTool \
    -D 'mapred.child.java.opts=-Xmx4096m' \
    --morphline-file /home/user/metadata/morphline_test.conf \
    --output-dir metadata/output_test7 \
    --zk-host xxxx:2181/solr \
    --collection metadata \
    --go-live \
    /user/hive/warehouse/metadata.db/full_metadata

Error: java.io.IOException: Batch Write Failure
    at org.apache.solr.hadoop.BatchWriter.throwIf(BatchWriter.java:239)
    at org.apache.solr.hadoop.BatchWriter.queueBatch(BatchWriter.java:181)
    at org.apache.solr.hadoop.SolrRecordWriter.close(SolrRecordWriter.java:290)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.solr.common.SolrException: Exception writing document id 5c88eaf5-7352-40bf-9e9a-8358dfb11280 to the index; possible analysis error.
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:168)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:926)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1081)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:692)
    at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
    at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
    at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
    at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:99)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
    at org.apache.solr.hadoop.BatchWriter.runUpdate(BatchWriter.java:135)
    at org.apache.solr.hadoop.BatchWriter$Batch.run(BatchWriter.java:90)
    at org.apache.solr.hadoop.BatchWriter.queueBatch(BatchWriter.java:180)
    ... 9 more
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:698)
    at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:712)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
    ... 28 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.store.RAMFile.newBuffer(RAMFile.java:77)
    at org.apache.lucene.store.RAMFile.addBuffer(RAMFile.java:50)
    at org.apache.lucene.store.RAMOutputStream.switchCurrentBuffer(RAMOutputStream.java:154)
    at org.apache.lucene.store.RAMOutputStream.writeBytes(RAMOutputStream.java:140)
    at org.apache.lucene.index.PrefixCodedTerms$Builder.add(PrefixCodedTerms.java:120)
    at org.apache.lucene.index.FrozenBufferedUpdates.<init>(FrozenBufferedUpdates.java:74)
    at org.apache.lucene.index.DocumentsWriterDeleteQueue.freezeGlobalBuffer(DocumentsWriterDeleteQueue.java:233)
    at org.apache.lucene.index.DocumentsWriterPerThread.prepareFlush(DocumentsWriterPerThread.java:390)
    at org.apache.lucene.index.DocumentsWriterFlushQueue.addFlushTicket(DocumentsWriterFlushQueue.java:70)
    at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:507)
    at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:624)
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2949)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3104)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3071)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:582)
    at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

1 ACCEPTED SOLUTION

Re: Indexing data to Solr with a MapReduce and Morphline. OutofMemory.

Expert Contributor
On YARN, the parameters are called mapreduce.map.java.opts and mapreduce.reduce.java.opts (not mapred.child.java.opts, as used in your command).

Wolfgang.
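
For readers who land here with the same error, this is a rough sketch of how the original command might look with the YARN parameter names from the answer above. The -Xmx4096m value is simply carried over from the question, and the HEAP_OPTS, MAP_OPTS and REDUCE_OPTS variables and the run helper are illustrative names, not part of MapReduceIndexerTool. By default the script only prints the command; set RUN=1 to actually submit the job.

```shell
#!/bin/sh
# Assumption: same heap size as in the original question; tune for your data.
HEAP_OPTS="-Xmx4096m"

# YARN (MR2) property names from the accepted answer.
MAP_OPTS="mapreduce.map.java.opts=${HEAP_OPTS}"
REDUCE_OPTS="mapreduce.reduce.java.opts=${HEAP_OPTS}"

# Hypothetical helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

run hadoop jar /opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/jars/search-mr-1.0.0-cdh5.4.2-job.jar \
    org.apache.solr.hadoop.MapReduceIndexerTool \
    -D "$MAP_OPTS" \
    -D "$REDUCE_OPTS" \
    --morphline-file /home/user/metadata/morphline_test.conf \
    --output-dir metadata/output_test7 \
    --zk-host xxxx:2181/solr \
    --collection metadata \
    --go-live \
    /user/hive/warehouse/metadata.db/full_metadata
```

Not from the thread, but worth checking: the YARN container size (mapreduce.map.memory.mb and mapreduce.reduce.memory.mb) generally needs to be larger than the JVM heap set here, otherwise the container may be killed for exceeding its memory limit.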
