Created 04-21-2017 07:23 PM
Hi,
I would like to extract a big table (MySQL, more than 3 million rows) and write it to a file in HDFS.
What would be the best way to do it?
I tried the following processors:
- ExecuteSQL: memory error
- QueryDatabaseTable: memory error
- GenerateTableFetch: failed to invoke @OnScheduled method due to java.lang.RuntimeException
I have 20 GB of memory.
Can I set parameters so that the extraction is split into more than one flow of data, and then merge the pieces in NiFi before loading them to HDFS?
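For example, I had something like the following flow in mind (the table, column, and path names are just placeholders, and Partition Size would be the number of rows covered by each generated query):

    GenerateTableFetch   Table Name: my_big_table, Maximum-value Columns: id, Partition Size: 100000
      -> ExecuteSQL      runs each generated SELECT, one Avro FlowFile per partition
      -> MergeContent    (optional) combines the partition files into larger bundles
      -> PutHDFS         Directory: /data/mysql/my_big_table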
Thank you.
Created 05-05-2017 08:03 PM
What are your JVM memory settings? The default is 512 MB, which will likely result in an OOM error with a large query result set. It is best to give NiFi as much memory as possible if you plan to do a lot of in-memory work, such as handling large result sets in this case.
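In a standalone NiFi install the heap is set in conf/bootstrap.conf; the defaults look roughly like this (the exact argument numbers can vary between versions):

    java.arg.2=-Xms512m
    java.arg.3=-Xmx512m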
Created 05-08-2017 01:20 PM
Yes, the settings are 512 MB in NiFi for nifi.initial_mem and nifi.max_mem.
Is there a rule of thumb for choosing the best values for these parameters, e.g. 1/2 of the amount of RAM?
Created 05-08-2017 01:26 PM
Ah, 512 MB is probably too low for your use case. If you don't have many other services running on the node, I would suggest starting with 80% of the node's memory.
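For a node with 20 GB of RAM that works out to roughly 16 GB, so something like the following in conf/bootstrap.conf (NiFi needs a restart after changing these):

    java.arg.2=-Xms16g
    java.arg.3=-Xmx16g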
Created 05-08-2017 02:38 PM
Thank you @Ward Bekker
Created 05-08-2017 02:28 PM
I like 12-16 GB for NiFi. That's a nice chunk of RAM.