OutOfMemory with GetMongo

Rising Star

Hi,

I am trying to fetch 2,500,000 records from a MongoDB collection using the GetMongo processor, with:

1. These parameters in bootstrap.conf:

# JVM memory settings

java.arg.2=-Xms6144m

java.arg.3=-Xmx6144m

2. The following processor properties:

SSL Context Service No value set

Client Auth NONE

Query No value set
Projection No value set
Sort No value set
Limit No value set
Batch Size No value set

The processor is scheduled with 2 concurrent tasks.

I've tried with 4 GB, then 6 GB, for the memory settings. NiFi failed with OutOfMemory errors in both cases.

With 4 GB: GetMongo processor started at 09:05, error at 09:24

2017-05-19 09:24:40,150 ERROR [Timer-Driven Process Thread-4] o.a.nifi.processors.mongodb.GetMongo GetMongo[id=4ee5171c-1006-115b-5dc0-6ef54c1e9a73] GetMongo[id=4ee5171c-1006-115b-5dc0-6ef54c1e9a73] failed to process due to java.lang.OutOfMemoryError: Java heap space; rolling back session: java.lang.OutOfMemoryError: Java heap space

2017-05-19 09:24:40,167 ERROR [Timer-Driven Process Thread-4] o.a.nifi.processors.mongodb.GetMongo java.lang.OutOfMemoryError : Java heap space

With 6 GB: GetMongo processor started at 09:42, error at 10:28

2017-05-19 10:28:50,336 ERROR [NiFi logging handler] org.apache.nifi.StdErr

2017-05-19 10:28:50,337 ERROR [NiFi logging handler] org.apache.nifi.StdErr Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-2-thread-1"

Do you have any suggestions?

Thanks

@Thierry Vernhet

Try setting the Batch Size property to 1000, and see if that helps.

Rising Star

I've set it to 100, 1000, and 2000. After about 10 minutes, the processor successfully reads the whole collection, but in one shot, whatever the value of the property. Is that normal?
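
A possible explanation, offered as an assumption rather than a confirmed description of GetMongo: in MongoDB, a cursor's batch size sets how many documents each round trip to the server returns, not how many documents the query yields overall. The pure-Python stand-in below (`fake_cursor` and `drain` are hypothetical names; no MongoDB or NiFi code is involved) shows that a consumer which drains the whole cursor ends up with the same documents whatever the batch size.

```python
from typing import Iterator, List, Tuple


def fake_cursor(total_docs: int, batch_size: int, round_trips: List[int]) -> Iterator[int]:
    """Yield every document id, fetching them from a pretend server in batches."""
    fetched = 0
    while fetched < total_docs:
        # One simulated round trip returns at most batch_size documents.
        batch = range(fetched, min(fetched + batch_size, total_docs))
        round_trips[0] += 1
        for doc_id in batch:
            yield doc_id
        fetched += len(batch)


def drain(total_docs: int, batch_size: int) -> Tuple[int, int]:
    """Consume the whole cursor; return (documents received, round trips made)."""
    trips = [0]
    docs = list(fake_cursor(total_docs, batch_size, trips))
    return len(docs), trips[0]


for size in (100, 1000, 2000):
    received, trips = drain(10_000, size)
    print(f"batch size {size}: {received} documents in {trips} round trips")
```

In this model, only the number of round trips changes with the batch size; the consumer still holds every document once the cursor is drained.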

@Thierry Vernhet

So, you are saying it works with Batch Size set, but it does not matter what you set the value to?

Rising Star

Exactly.

The queue after GetMongo holds 2,500,000 events (1.3 GB). The following processor (MergeContent) cannot empty this queue.

I don't understand why.

@Thierry Vernhet

How are the properties set in the MergeContent processor?

Rising Star

@Wynner

Thanks for your feedback.

Here are the properties (I also tried 200 MB earlier, but that didn't work either).

Merge Strategy Bin-Packing Algorithm

Merge Format Binary Concatenation

Attribute Strategy Keep Only Common Attributes

Correlation Attribute Name No value set

Minimum Number of Entries 1

Maximum Number of Entries No value set

Minimum Group Size 20 MB

Maximum Group Size 20 MB

Max Bin Age 5 min

Maximum number of Bins 100

Delimiter Strategy Text

Header No value set

Footer No value set

Demarcator

Compression Level 1

Keep Path false

After an hour, NiFi fails with an OutOfMemoryError:

2017-05-22 12:35:44,770 WARN [NiFi Web Server-22-acceptor-0@2c439296-ServerConnector@ccf1486{HTTP/1.1,[http/1.1]}{0.0.0.0:28080}] o.eclipse.jetty.server.AbstractConnector java.lang.OutOfMemoryError: Java heap space
2017-05-22 12:35:44,784 WARN [NiFi Web Server-21] org.eclipse.jetty.servlet.ServletHandler Error for /nifi-api/flow/controller/bulletins
java.lang.OutOfMemoryError: Java heap space
    at java.lang.StringBuilder.toString(StringBuilder.java:407) ~[na:1.8.0_66]
    at java.net.Inet4Address.numericToTextFormat(Inet4Address.java:373) ~[na:1.8.0_66]
    at java.net.Inet4Address.getHostAddress(Inet4Address.java:328) ~[na:1.8.0_66]
    at org.eclipse.jetty.server.Request.getRemoteAddr(Request.java:1193) ~[na:na]
    at javax.servlet.ServletRequestWrapper.getRemoteAddr(ServletRequestWrapper.java:275) ~[javax.servlet-api-3.1.0.jar:3.1.0]
    at org.apache.nifi.web.filter.RequestLogger.doFilter(RequestLogger.java:62) ~[classes/:na]
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1676) ~[na:na]
    at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:316) ~[spring-security-web-4.0.3.RELEASE.jar:4.0.3.RELEASE]
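
As a rough back-of-the-envelope check on the 20 MB group sizes listed above, the sketch below estimates how many flow files a single size-bounded bin must collect. It assumes the 2,500,000 flow files are roughly equal in size and that GB and MB are 1024-based; `flowfiles_per_bin` is a hypothetical helper, not a NiFi API.

```python
def flowfiles_per_bin(total_flowfiles: int, total_bytes: float, group_bytes: float) -> float:
    """Average number of flow files needed to fill one bin of group_bytes."""
    avg_flowfile_bytes = total_bytes / total_flowfiles
    return group_bytes / avg_flowfile_bytes


GB = 1024 ** 3
MB = 1024 ** 2

# Figures quoted in the thread: 2,500,000 flow files, 1.3 GB queued, 20 MB groups.
per_bin = flowfiles_per_bin(2_500_000, 1.3 * GB, 20 * MB)
bins_needed = (1.3 * GB) / (20 * MB)
print(f"about {per_bin:,.0f} flow files per bin, about {bins_needed:.0f} bins in total")
```

Under these assumptions, each 20 MB bin has to gather tens of thousands of small flow files before it can be merged.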

@Thierry Vernhet

Try putting a larger range between the minimum and maximum group size, like 25 MB and 50 MB.

How much memory have you allocated to the NiFi JVM? The default is 512 MB; it is set in the bootstrap.conf file.

# JVM memory settings

java.arg.2=-Xms512m, change to 2g or 4g if you have the memory available on your system

java.arg.3=-Xmx512m, change to 2g or 4g if you have the memory available on your system

Since you are dealing with 1.3 GB in the MergeContent processor, make sure to allocate at least double that to the NiFi JVM, because the MergeContent processor uses JVM memory to build its merged flow files. In addition, I would set the number of Concurrent Tasks to 3.
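
The "at least double" rule of thumb above works out as follows. This is only arithmetic on the figures quoted in this thread; `heap_args` is a hypothetical helper that formats the result as bootstrap.conf lines, and the factor of 2 is the suggestion above, not a NiFi requirement.

```python
import math
from typing import Tuple


def heap_args(queue_gb: float, factor: float = 2.0) -> Tuple[str, str]:
    """Return bootstrap.conf heap lines sized at factor times the queued data."""
    # Convert GB to MB (1024-based) and round up to a whole megabyte.
    heap_mb = math.ceil(queue_gb * 1024 * factor)
    return (f"java.arg.2=-Xms{heap_mb}m", f"java.arg.3=-Xmx{heap_mb}m")


for line in heap_args(1.3):
    print(line)
```

The 6144m setting from the original question is already above this figure.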

Rising Star

@Wynner

Hi,

The memory settings are already at 8 GB for both.

This morning I tried setting the Minimum and Maximum Number of Entries, and it works!

Thanks for your help, Wynner.

Regards

Parameters set (so that the merge creates about 1000 files):

Minimum Number of Entries 1

Maximum Number of Entries 2500

Minimum Group Size 0 B

Maximum Group Size No value set

Max Bin Age 5 min

Maximum number of Bins 100

Delimiter Strategy Text
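
The "about 1000 files" figure follows from the entry limit. A minimal sketch of that arithmetic, assuming every bin fills to Maximum Number of Entries before it is merged (`merged_file_count` is a hypothetical helper, not part of NiFi):

```python
import math


def merged_file_count(total_flowfiles: int, max_entries: int) -> int:
    """Number of merged files when each bin holds at most max_entries flow files."""
    return math.ceil(total_flowfiles / max_entries)


files = merged_file_count(2_500_000, 2500)
# Spread the 1.3 GB queue (1024-based) evenly over the merged files.
avg_mb = 1.3 * 1024 / files
print(f"{files} merged files, about {avg_mb:.2f} MB each")
```

Each merged file then averages a little over 1 MB, far below the 20 MB groups tried earlier in the thread.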