Hello,
I am new to NiFi and am using it to automate a process in which I read XML files from one folder, make some changes to those files, and send them to a different folder. Each file is at least 2 GB. I have a simple NiFi setup for this. Below is the script, in which I just read the file from sys.stdin, remove blank lines, and then send the file on to the next folder.
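import sys

# Iterate over stdin directly so every line is read, one at a time
# (readline() returns only the first line, and looping over it yields characters)
for line in sys.stdin:
    # Skip lines that contain only whitespace
    if not line.isspace():
        sys.stdout.write(line)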
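To make the intent clearer, here is a minimal sketch of the filtering I am after, written so it can be checked outside NiFi. The function name strip_blank_lines and the sample data are hypothetical and only for illustration; inside ExecuteStreamCommand I assume src and dst would be sys.stdin and sys.stdout.

```python
import io


def strip_blank_lines(src, dst):
    """Copy src to dst line by line, dropping whitespace-only lines."""
    kept = 0
    for line in src:  # file-like objects yield one line at a time
        if not line.isspace():
            dst.write(line)
            kept += 1
    return kept


# Small in-memory stand-in for a large XML file (hypothetical sample)
sample = io.StringIO("<a>\n\n  <b/>\n   \n</a>\n")
out = io.StringIO()
count = strip_blank_lines(sample, out)
print(count)
print(out.getvalue(), end="")
```

Because the loop handles one line at a time, the sketch should not need to hold the whole file in memory, which matters for files of 2 GB or more.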
When I run this script in an ExecuteStreamCommand processor, I don't get anything in the output folder. When I click View under the contents, I see an out-of-memory error. Below is a snippet from the logs.
2020-10-29 05:59:07,320 INFO [NiFi Web Server-29] o.a.n.c.queue.AbstractFlowFileQueue Canceling ListFlowFile Request with ID 73cbef19-0175-1000-00b4-d60951dda790
2020-10-29 05:59:20,213 WARN [NiFi Web Server-29] org.eclipse.jetty.server.HttpChannel /nifi-content-viewer/
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at org.apache.commons.io.output.StringBuilderWriter.write(StringBuilderWriter.java:142)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2538)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2516)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2493)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:2441)
I have tried changing the heap settings in the bootstrap configuration from 512m to 2g and 4g, but in that case NiFi doesn't even start. It works only when both values are 1024m. Below is a snippet from the bootstrap output with the 2g setting.
Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -XX:+UseG1GC -Djava.protocol.handler.pkgs=sun.net.www.protocol -Dnifi.properties.file.path=C:\Users\bawag\Desktop\NIFI-1~1.2-B\NIFI-1~1.2\.\conf\nifi.properties -Dnifi.bootstrap.listen.port=57979 -Dapp=NiFi -Dorg.apache.nifi.bootstrap.config.log.dir=C:\Users\bawag\Desktop\NIFI-1~1.2-B\NIFI-1~1.2\bin\..\\logs org.apache.nifi.NiFi
2020-10-29 06:09:39,521 INFO [NiFi logging handler] org.apache.nifi.StdOut Error occurred during initialization of VM
2020-10-29 06:09:39,522 INFO [NiFi logging handler] org.apache.nifi.StdOut Could not reserve enough space for 2097152KB object heap
2020-10-29 06:09:39,765 WARN [main] org.apache.nifi.bootstrap.Command Failed to set permissions so that only the owner can read pid file C:\Users\bawag\Desktop\NIFI-1~1.2-B\NIFI-1~1.2\bin\..\run\nifi.pid; this may allows others to have access to the key needed to communicate with NiFi. Permissions should be changed so that only the owner can read this file
Under System Diagnostics in NiFi, it reads Heap (62%).
I would really appreciate it if someone could help me resolve this issue. How do I proceed?