<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: MorphlineSolrSink GC overhead limit Exceeded in Flume Sink in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35494#M13318</link>
    <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hello&amp;nbsp;pdvorak!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much. I don't know why I was trying to pass the memory parameters as a -Dproperty value (brain freeze, I guess).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Your suggestion was great and it worked perfectly!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks again! Kind regards,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;rafa&lt;/P&gt;</description>
    <pubDate>Tue, 22 Dec 2015 21:29:03 GMT</pubDate>
    <dc:creator>rilarios</dc:creator>
    <dc:date>2015-12-22T21:29:03Z</dc:date>
    <item>
      <title>MorphlineSolrSink GC overhead limit Exceeded in Flume Sink</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35477#M13316</link>
      <description>&lt;P&gt;Good morning everyone,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I've been trying to get the syslog-&amp;gt;Solr example described&lt;/SPAN&gt; &lt;A href="http://cloudera.github.io/cdk/docs/current/cdk-morphlines/index.html" target="_self"&gt;here&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;to work. It is a fairly simple example using:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;syslog source&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;memory channel&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;MorphlineSolrSink&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The Flume configuration file I used is:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;a1.sources = r1
a1.channels = c1
a1.sinks = k1

#source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = xxxx
a1.sources.r1.host = xxx.xxx.xxx.xxx

#sink
a1.sinks.k1.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
a1.sinks.k1.morphlineFile = /route/to/the/morphline.conf

#channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
#a1.channels.c1.transactionCapacity = 10000
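# Note (not part of the original example): if transactionCapacity is enabled, the memory
# channel requires it to be no larger than capacity. The value below is only an illustrative
# assumption that satisfies this with capacity = 1000:
#a1.channels.c1.transactionCapacity = 100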

#connect
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The morphline file I used was the same as the one provided in the example above:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;morphlines : [
  {
    id : morphline1

    importCommands : ["com.cloudera.**", "org.kitesdk.**", "org.apache.solr.**"]

    commands : [
      {
        readLine {
          charset : UTF-8
        }
      }

      {
        grok {
          dictionaryFiles : [/route/to/the/morphline/grok-dictonaries]
          expressions : {
            message : """&amp;lt;%{POSINT:priority}&amp;gt;%{SYSLOGTIMESTAMP:timestamp} %{SYSLOGHOST:hostname} %{DATA&amp;amp;colon;program}(?:\[%{POSINT:pid}\])?: %{GREEDYDATA&amp;amp;colon;msg}"""
          }
        }
      }


      {
        convertTimestamp {
          field : timestamp
          inputFormats : [ "yyyy-MM-dd'T'HH:mm:ss'Z'", "MMM d HH:mm:ss" ]
          inputTimezone : America/Bogota
          outputFormat : "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'"
          outputTimezone : UTC
        }
      }

      {
        sanitizeUnknownSolrFields {
          # Location from which to fetch Solr schema
          solrLocator: {
            collection: syslogs
            zkHost: "zkEnsembleAddresses"
          }
        }
      }

      # log the record at INFO level to SLF4J
      { logInfo { format : "output record: {}", args : ["@{}"] } }

      {
        loadSolr {
          solrLocator : {
            collection: syslogs
            zkHost: "zkEnsembleAddresses"
          }
        }
      }
    ]
  }
]&lt;/PRE&gt;&lt;P&gt;&lt;SPAN&gt;The morphline file specifies the following chain of commands:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;- read the line&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;- grok the message with an expression&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;- convert the timestamp&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;- drop fields unknown to the Solr schema&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;- put the data in Solr&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, when I try to start the Flume agent, it always throws this error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;java.lang.OutOfMemoryError: GC overhead limit exceeded&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The error always appears right after this message:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;INFO api.MorphlineContext: Importing commands&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Flume never gets to start the Solr sink. It seems that Flume does not have enough memory to start the sink, so I&amp;nbsp;modified the&amp;nbsp;&lt;STRONG&gt;&lt;FONT face="courier new,courier"&gt;/etc/flume-ng/conf/flume-env.sh&lt;/FONT&gt;&lt;/STRONG&gt; file and uncommented the JAVA_OPTS line. The uncommented line was:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;export JAVA_OPTS="-Xms2048m -Xmx204800m -Dcom.sun.management.jmxremote"&lt;/PRE&gt;&lt;P&gt;Basically, I was giving Java 2 GB of initial heap space and a maximum of 200 GB (the machines in the cluster have a lot of memory). The error is still the same :(&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then I changed the command line that starts the Flume agent to try to increase the Java memory:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;flume-ng agent -n a1 -f flume_config.conf -Dproperty="-Xms1024m -Xmx=204800m"&lt;/PRE&gt;&lt;P&gt;And the error keeps appearing. 
I don't really know whether I am using the wrong memory options or setting them in the wrong place, but this problem is making me (more) bald!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any pointers would be very much appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for your support&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Rafa&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PS: just in case, this is the stack trace of the error:&lt;/P&gt;&lt;PRE&gt;15/12/22 10:53:51 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
15/12/22 10:53:51 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:flume_config_e.conf
15/12/22 10:53:51 INFO conf.FlumeConfiguration: Added sinks: k1 Agent: a1
15/12/22 10:53:51 INFO conf.FlumeConfiguration: Processing:k1
15/12/22 10:53:51 INFO conf.FlumeConfiguration: Processing:k1
15/12/22 10:53:51 INFO conf.FlumeConfiguration: Processing:k1
15/12/22 10:53:51 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
15/12/22 10:53:51 INFO node.AbstractConfigurationProvider: Creating channels
15/12/22 10:53:51 INFO channel.DefaultChannelFactory: Creating instance of channel c1 type memory
15/12/22 10:53:51 INFO node.AbstractConfigurationProvider: Created channel c1
15/12/22 10:53:51 INFO source.DefaultSourceFactory: Creating instance of source r1, type exec
15/12/22 10:53:51 INFO sink.DefaultSinkFactory: Creating instance of sink: k1, type: org.apache.flume.sink.solr.morphline.MorphlineSolrSink
15/12/22 10:53:51 INFO node.AbstractConfigurationProvider: Channel c1 connected to [r1, k1]
15/12/22 10:53:51 INFO node.Application: Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3927ce5e counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
15/12/22 10:53:51 INFO node.Application: Starting Channel c1
15/12/22 10:53:51 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
15/12/22 10:53:51 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: c1 started
15/12/22 10:53:51 INFO node.Application: Starting Sink k1
15/12/22 10:53:51 INFO morphline.MorphlineSink: Starting Morphline Sink k1 (MorphlineSolrSink) ...
15/12/22 10:53:51 INFO node.Application: Starting Source r1
15/12/22 10:53:51 INFO source.ExecSource: Exec source starting with command:tail -f /var/logs/flume-ng/flume-cmf-flume-AGENT-sbmdeqpc01.ambientesbc.lab.log
15/12/22 10:53:51 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: k1: Successfully registered new MBean.
15/12/22 10:53:51 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k1 started
15/12/22 10:53:51 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
15/12/22 10:53:51 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: r1 started
15/12/22 10:53:51 INFO source.ExecSource: Command [tail -f /var/logs/flume-ng/flume-cmf-flume-AGENT-sbmdeqpc01.ambientesbc.lab.log] exited with 1
15/12/22 10:53:51 INFO api.MorphlineContext: Importing commands
15/12/22 10:53:55 ERROR lifecycle.LifecycleSupervisor: Unable to start SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@3927ce5e counterGroup:{ name:null counters:{} } } - Exception follows.
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.String.replace(String.java:2021)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath.getClassName(ClassPath.java:403)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath$ClassInfo.&amp;lt;init&amp;gt;(ClassPath.java:193)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath$ResourceInfo.of(ClassPath.java:141)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath$Scanner.scanJar(ClassPath.java:345)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath$Scanner.scanFrom(ClassPath.java:286)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath$Scanner.scan(ClassPath.java:274)
        at org.kitesdk.morphline.shaded.com.google.common.reflect.ClassPath.from(ClassPath.java:82)
        at org.kitesdk.morphline.api.MorphlineContext.getTopLevelClasses(MorphlineContext.java:149)
        at org.kitesdk.morphline.api.MorphlineContext.importCommandBuilders(MorphlineContext.java:91)
        at org.kitesdk.morphline.stdlib.Pipe.&amp;lt;init&amp;gt;(Pipe.java:43)
        at org.kitesdk.morphline.stdlib.PipeBuilder.build(PipeBuilder.java:40)
        at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:126)
        at org.kitesdk.morphline.base.Compiler.compile(Compiler.java:55)
        at org.apache.flume.sink.solr.morphline.MorphlineHandlerImpl.configure(MorphlineHandlerImpl.java:101)
        at org.apache.flume.sink.solr.morphline.MorphlineSink.start(MorphlineSink.java:97)
        at org.apache.flume.sink.DefaultSinkProcessor.start(DefaultSinkProcessor.java:46)
        at org.apache.flume.SinkRunner.start(SinkRunner.java:79)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
15/12/22 10:53:55 INFO morphline.MorphlineSink: Morphline Sink k1 stopping...
15/12/22 10:53:55 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: k1 stopped&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:54:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35477#M13316</guid>
      <dc:creator>rilarios</dc:creator>
      <dc:date>2022-09-16T09:54:31Z</dc:date>
    </item>
    <item>
      <title>Re: MorphlineSolrSink GC overhead limit Exceeded in Flume Sink</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35483#M13317</link>
      <description>&lt;P&gt;Are you using Cloudera Manager to start the Flume agent? If so, you'll want to configure the heap size through Cloudera Manager.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you are not using Cloudera Manager, you will want to specify the following on the command line, not as a -Dproperty value:&lt;/P&gt;&lt;PRE&gt;flume-ng agent -n a1 -f flume_config.conf -Xms1024m -Xmx204800m&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We recommend setting the -Xms and -Xmx values to the same amount so the JVM does not have to resize the heap, which can cause performance issues.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Dec 2015 17:23:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35483#M13317</guid>
      <dc:creator>pdvorak</dc:creator>
      <dc:date>2015-12-22T17:23:16Z</dc:date>
    </item>
    <item>
      <title>Re: MorphlineSolrSink GC overhead limit Exceeded in Flume Sink</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35494#M13318</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hello&amp;nbsp;pdvorak!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you very much. I don't know why I was trying to pass the memory parameters as a -Dproperty value (brain freeze, I guess).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Your suggestion was great and it worked perfectly!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks again! Kind regards,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;rafa&lt;/P&gt;</description>
      <pubDate>Tue, 22 Dec 2015 21:29:03 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/35494#M13318</guid>
      <dc:creator>rilarios</dc:creator>
      <dc:date>2015-12-22T21:29:03Z</dc:date>
    </item>
    <item>
      <title>Re: MorphlineSolrSink GC overhead limit Exceeded in Flume Sink</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/80507#M13319</link>
      <description>&lt;PRE&gt;flume-ng agent -n a1 -f flume_config.conf -Xms1024m -Xmx204800m&lt;/PRE&gt;&lt;P&gt;There shouldn't be an equals sign after -Xmx: the heap size is appended directly to the flag, as in the command above.&lt;/P&gt;</description>
      <pubDate>Mon, 01 Oct 2018 17:08:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MorphlineSolrSink-GC-overhead-limit-Exceeded-in-Flume-Sink/m-p/80507#M13319</guid>
      <dc:creator>angoothachap</dc:creator>
      <dc:date>2018-10-01T17:08:41Z</dc:date>
    </item>
  </channel>
</rss>

