I am new to NiFi and am using it to automate a process in which I read XML files from one folder, make some changes to those files, and send them to a different folder. Each file is at least 2 GB. I have a simple NiFi setup for this. Below is the script, which reads the file from sys.stdin, removes blank lines, and sends the result on to the next folder.
import sys
ff=sys.stdin.readline()
for line in ff:
    if not line.isspace():
        sys.stdout.write(line)
When I run this script in the ExecuteStreamCommand processor, I don't get anything in the output folder. When I click View under the contents, I see an OutOfMemoryError. Below is a snippet from the logs.
2020-10-29 05:59:07,320 INFO [NiFi Web Server-29] o.a.n.c.queue.AbstractFlowFileQueue Canceling ListFlowFile Request with ID 73cbef19-0175-1000-00b4-d60951dda790
2020-10-29 05:59:20,213 WARN [NiFi Web Server-29] org.eclipse.jetty.server.HttpChannel /nifi-content-viewer/
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
    at java.lang.AbstractStringBuilder.append(Unknown Source)
    at java.lang.StringBuilder.append(Unknown Source)
    at org.apache.commons.io.output.StringBuilderWriter.write(StringBuilderWriter.java:142)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2538)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2516)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:2493)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:2441)
I have tried changing the heap settings in the bootstrap configuration from 512 to 2g and 4g, but in that case NiFi doesn't even start. It works only when both values are 1024m. Below is a snippet from the bootstrap output.
Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -XX:+UseG1GC -Djava.protocol.handler.pkgs=sun.net.www.protocol -Dnifi.properties.file.path=C:\Users\bawag\Desktop\NIFI-1~1.2-B\NIFI-1~1.2\.\conf\nifi.properties -Dnifi.bootstrap.listen.port=57979 -Dapp=NiFi -Dorg.apache.nifi.bootstrap.config.log.dir=C:\Users\bawag\Desktop\NIFI-1~1.2-B\NIFI-1~1.2\bin\..\\logs org.apache.nifi.NiFi
2020-10-29 06:09:39,521 INFO [NiFi logging handler] org.apache.nifi.StdOut Error occurred during initialization of VM
2020-10-29 06:09:39,522 INFO [NiFi logging handler] org.apache.nifi.StdOut Could not reserve enough space for 2097152KB object heap
2020-10-29 06:09:39,765 WARN [main] org.apache.nifi.bootstrap.Command Failed to set permissions so that only the owner can read pid file C:\Users\bawag\Desktop\NIFI-1~1.2-B\NIFI-1~1.2\bin\..\run\nifi.pid; this may allows others to have access to the key needed to communicate with NiFi. Permissions should be changed so that only the owner can read this file
Under system diagnostics in Nifi, it reads Heap(62%).
I would really appreciate it if someone could help me resolve this issue. How do I proceed?
@Kaur It appears that your NiFi node does not have enough system RAM to support the 2g and 4g settings. I suggest increasing the node's specification to at least 8 GB or 16 GB of system RAM and testing the bootstrap configuration with heap settings of 2g and 4g, or 4g and 8g, respectively.
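Separately from the heap size, note that the script in the question calls sys.stdin.readline(), which returns only the first line, so the loop then walks over that line's characters rather than over the file's lines. Below is a minimal sketch of a streaming version, assuming the goal is to drop whitespace-only lines and pass everything else through unchanged. The function name copy_non_blank_lines is hypothetical, chosen for illustration.

```python
import sys


def copy_non_blank_lines(src, dst):
    """Copy src to dst one line at a time, skipping whitespace-only lines.

    Hypothetical helper for illustration; src and dst are text streams.
    """
    for line in src:  # iterating a file object reads it lazily, line by line
        if not line.isspace():
            dst.write(line)


if __name__ == "__main__":
    # ExecuteStreamCommand pipes the FlowFile content to stdin and
    # captures whatever the command writes to stdout.
    copy_non_blank_lines(sys.stdin, sys.stdout)
```

Because only one line is held at a time, the script's own memory use should stay small as long as the XML contains line breaks. A file with no line breaks at all would still be read as one very long line.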
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.