Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

SQL queries via custom groovy script causing unexpected OutOfMemory errors

Highlighted

SQL queries via custom groovy script causing unexpected OutOfMemory errors

New Contributor

Hi,

We have a NiFi usecase including SQL queries on remote postgreSQL database using JDBC connection.

The goal is getting rows from database, then writing them into outputStream as CSV with delimiter specified as an argument. We have benchmarked two solutions :

-> ExecuteSQL + ExtractAvro + ConvertRecord processors

-> Custom groovy script found there : http://funnifi.blogspot.com/2016/04/sql-in-nifi-with-executescript.html adapted to our usecase

import org.apache.nifi.controller.ControllerService
import groovy.sql.Sql
import java.nio.charset.StandardCharsets
def lookup = context.controllerServiceLookup
def dbServiceName = 'OUR_DB_SERVICE_NAME' 
def dbcpServiceId = lookup.getControllerServiceIdentifiers(ControllerService).find { 
cs -> lookup.getControllerServiceName(cs) == dbServiceName
}
def conn = lookup.getControllerService(dbcpServiceId)?.getConnection()
def flowFile = session.get()
if(!flowFile) return
try {
flowFile = session.write(flowFile, {inputStream, outputStream ->
def sql = new Sql(conn)
sql.rows(flowFile.getAttribute('REQUETE')).eachWithIndex { row, idx ->
outputStream.write((row.values().join(flowFile.getAttribute('DELIMITER')) + "\n").getBytes())
} 
}as StreamCallback)
  session.transfer(flowFile, REL_SUCCESS)
} catch(e) {
log.error('Scripting error', e)
session.transfer(flowFile, REL_FAILURE)
}
conn?.close() 

The groovy script is faster than native ExecuteScript + ExtractAvro + ConvertRecord NiFi processors, but causes OutOfMemory errors while the first solution does not, and we have not been able to find out why this occurs yet.

Does someone have an idea about what in the script may be responsible of this behaviour (OOM), and how to avoid it?

Thanks