08-05-2014
11:30 PM
Thank you, Wolfgang. The cause of the problem was that the amount of Java heap memory dedicated to a MapReduce container was too low, even though there was enough physical memory on each node. After some tuning, I found that the ratio of uncompressed data to be indexed to the JVM heap available to a reducer task is about 2:1, so if you index a larger file in a single reducer you will likely get an OutOfMemoryError. Thanks for the help.
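For reference, this is roughly the kind of heap/container tuning involved. It is only a sketch: the property names are the YARN/MR2 equivalents of mapred.child.java.opts, and the values are illustrative, not the exact ones I used.
--------------
# Raise the reducer container size and its JVM heap when invoking the tool.
# Per the ~2:1 rule of thumb above, a reducer indexing ~6 GB of uncompressed
# data would want roughly 3 GB of heap, plus headroom in the container size.
hadoop --config /etc/hadoop/conf.cloudera.yarn \
  jar /usr/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
  -D 'mapreduce.reduce.memory.mb=4096' \
  -D 'mapreduce.reduce.java.opts=-Xmx3g' \
  ... # remaining options unchanged from the original command below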
07-15-2014
11:54 PM
Hello, colleagues. I have a problem with MapReduceIndexerTool: it raises an OutOfMemoryError during the mapper phase on data that is not large. We have a cluster of 8 servers, each with 15 GB of RAM and 0.5 TB of disk. Before running the MapReduceIndexerTool, Cloudera Manager shows that each server has ~12 GB of free RAM and >200 GB of free disk. The Hadoop version is Cloudera Express 5.0.0 (#215 built by jenkins on 20140331-1424 git: 50c701f3e920b1fcf524bf5fa061d65902cde804).

I have a list of 32 Avro files, each up to 320 MB in size; I sqooped them from an Oracle table with --compression-codec snappy and in splits. Each file has ~10 million records. When I run the MapReduceIndexerTool, most of the mapper tasks fail with OutOfMemoryError; the error stack follows. On some of the smaller files there is no error (the files are not even in size, but the maximum is 320 MB). I googled a bit and found a suggestion to increase the dedicated Java heap with -D 'mapred.child.java.opts=-Xmx2G' (https://groups.google.com/a/cloudera.org/forum/#!topic/search-user/5k9apj7FSiY), but with no success. 320 MB doesn't seem too large for Hadoop, does it? Any advice would be appreciated.

The command:
--------------
sudo -u hdfs hadoop jar /usr/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.HdfsFindTool -find \
  hdfs://$NNHOST:8020//user/root/solrindir/tmlogavro -type f \
  -name 'part*.avro' |\
sudo -u hdfs hadoop --config /etc/hadoop/conf.cloudera.yarn \
  jar /usr/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
  --libjars /usr/lib/solr/contrib/mr/search-mr-1.0.0-cdh5.0.0.jar \
  -D 'mapred.child.java.opts=-Xmx2G' \
  --log4j /var/lib/hadoop-hdfs/solr_configs_for_tm_log_morphlines/log4j.properties \
  --morphline-file /var/lib/hadoop-hdfs/solr_configs_for_tm_log_morphlines/morphlines.conf \
  --output-dir hdfs://$NNHOST:8020/user/$USER/solroutdir \
  --update-conflict-resolver org.apache.solr.hadoop.dedup.RetainMostRecentUpdateConflictResolver \
  --verbose --go-live --zk-host $ZKHOST \
  --collection tm_log_avro \
  --shards 32 --input-list -

The error I see in each failed mapper task:
-----------
Error: java.io.IOException: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
    at org.apache.solr.hadoop.SolrRecordWriter.close(SolrRecordWriter.java:334)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.solr.client.solrj.SolrServerException: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:223)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.solr.client.solrj.SolrServer.commit(SolrServer.java:168)
    at org.apache.solr.hadoop.BatchWriter.close(BatchWriter.java:199)
    at org.apache.solr.hadoop.SolrRecordWriter.close(SolrRecordWriter.java:322)
    ... 8 more
Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:155)
    ... 12 more
Caused by: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
    at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2726)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2897)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2872)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:550)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:95)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1256)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1233)
    at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:157)
    at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1947)
    at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
    ... 12 more
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
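Since under YARN the container memory limit matters as much as the JVM heap (exit code 143 is the container being killed), the effective per-container memory settings can be read from the client config. This is only a sketch; the paths assume the same /etc/hadoop/conf.cloudera.yarn client config referenced in the command above:
-----------
# Per-task container sizes and JVM opts currently in effect
grep -A1 -E 'memory.mb|java.opts' /etc/hadoop/conf.cloudera.yarn/mapred-site.xml
# Total memory the NodeManager can hand out per node
grep -A1 'yarn.nodemanager.resource.memory-mb' /etc/hadoop/conf.cloudera.yarn/yarn-site.xml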
Labels:
- Apache Hadoop
- Apache Solr
- Apache YARN
- HDFS
- Security