Member since
03-06-2024
5
Posts
3
Kudos Received
0
Solutions
04-08-2024
06:55 AM
I have NiFi 1.25 on an Ubuntu EC2 instance (m6i.2xlarge = 8 vCPU / 32 GB RAM). bootstrap.conf is set as follows: java.arg.2=-Xms2g java.arg.3=-Xmx24g java.arg.13=-XX:+UseG1GC (not sure this one is even needed). ExecuteSQL is configured as shown. The flow pulls about 50-60 tables from MySQL, creates flow files from them, uploads them to S3, and then COPYs them into Redshift. ExecuteSQL seems to be consuming a lot of CPU and RAM. Is java.arg.3=-Xmx24g the upper limit for the RAM? And how can I control the CPU?
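On the two closing questions, here is a minimal sketch against a stock NiFi 1.x install (the property names are the standard bootstrap.conf ones; the values are illustrative):

```properties
# bootstrap.conf -- -Xmx caps the JVM *heap* only. NiFi's total process
# memory can exceed it (metaspace, direct buffers, threads), so leave
# headroom below the instance's 32 GB.
java.arg.2=-Xms2g
java.arg.3=-Xmx24g
# G1GC is optional; it is already the default collector on Java 11+.
java.arg.13=-XX:+UseG1GC
```

CPU is not capped in bootstrap.conf; it is governed by how many threads NiFi is allowed to run. Lowering Maximum Timer Driven Thread Count (hamburger menu > Controller Settings) and the Concurrent Tasks value on the ExecuteSQL processor's Scheduling tab reduces parallel work and therefore CPU usage.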
Labels:
- Apache NiFi
03-11-2024
02:09 AM
1 Kudo
I have ExecuteSQL running against MySQL as the source, and the source table is structured as follows: `disable_min_bid` is defined as tinyint(1) DEFAULT '0', but after running ExecuteSQL the Avro files show the data type and the data converted to boolean (true/false). Is there any way to prevent this conversion? This is how my ExecuteSQL is configured right now:
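One likely cause, assuming the MySQL Connector/J JDBC driver is in use: by default it maps TINYINT(1) columns to a JDBC BIT/Boolean type, which is what Avro then records as true/false. Appending tinyInt1isBit=false to the Database Connection URL in the DBCPConnectionPool keeps the column as an integer (host and database names below are placeholders):

```properties
# DBCPConnectionPool "Database Connection URL" -- my-host/my_db are placeholders.
# tinyInt1isBit=false makes Connector/J report TINYINT(1) as an integer
# instead of a boolean, so the Avro output keeps the numeric value.
jdbc:mysql://my-host:3306/my_db?tinyInt1isBit=false
```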
Labels:
- Apache NiFi
03-06-2024
10:32 PM
1 Kudo
That's the Redshift connection: That's the MySQL connection: I understand that INSERTs on Redshift can be slow, which is why people use the COPY command, but 100 rows in 5 minutes is odd. I know BigQuery, for example, has a dedicated processor for loading data, but I couldn't find anything similar for Redshift, which is why I'm using PutDatabaseRecord. Edit: after loading the 100 rows, I exported the same 100 rows as an INSERT SQL statement and ran it directly on Redshift; it took 1.1 seconds. Edit 2: since I'm the kind of guy who loves doing his job and finding solutions, I came across reWriteBatchedInserts=true;reWriteBatchedInsertsSize=102400; in a Reddit post. This works much better now, but I couldn't find any documentation on these parameters to tune them, and it's still a bit slow. I'm not sure this is the real solution, so I'm still waiting for your input as well. Also, if I wanted to change the flow to drop the Avro flow files into S3 and then execute a COPY command from NiFi, is that possible?
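On the last question: yes, that is a common pattern with NiFi (PutS3Object for the Avro files, then a processor such as PutSQL or ExecuteSQL against Redshift to run the COPY). A sketch of the COPY statement, with a hypothetical bucket, table, and IAM role:

```sql
-- All names below are placeholders for this sketch.
-- FORMAT AS AVRO 'auto' maps Avro record fields to table columns by name.
COPY my_schema.my_table
FROM 's3://my-bucket/nifi-exports/my_table/'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
FORMAT AS AVRO 'auto';
```

COPY loads all objects under the given prefix in one parallel operation, which is why it outperforms row-wise INSERTs at any meaningful volume.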
03-06-2024
06:10 AM
Hi there, I am experiencing really slow ingestion with the PutDatabaseRecord processor. The flow is simple: query data from MySQL 5.6 and ingest it into Redshift. Currently I'm just testing it, so I list only specific tables from those two databases and then query 'select * from $db.tablename', but ingestion via PutDatabaseRecord is terribly slow (100 rows in 5 minutes). I am using an AvroReader inside PutDatabaseRecord but didn't change any of its settings; I just make sure the flow files stay small (100 rows). Does AvroReader split the 100 rows into 100 SQL statements? How do I fix this to get better performance?
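The symptom (100 rows in 5 minutes, but a single hand-built INSERT finishing in about a second) matches per-row statement round trips rather than a batched load. As an illustration only, with sqlite3 standing in for Redshift and a made-up table, batching the parameters into one call amortizes the per-statement overhead:

```python
# Illustration of per-row INSERTs vs. a batched insert.
# sqlite3 is only a stand-in database; the table name and row shape
# are invented for this sketch.
import sqlite3
import time

rows = [(i, f"name_{i}") for i in range(10_000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")

# One statement per row: one round trip each (what a driver does
# when batch rewriting is disabled).
t0 = time.perf_counter()
for r in rows:
    conn.execute("INSERT INTO t VALUES (?, ?)", r)
per_row = time.perf_counter() - t0

conn.execute("DELETE FROM t")

# Batched: parameters are sent together, amortizing the overhead.
t0 = time.perf_counter()
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
batched = time.perf_counter() - t0

print(f"per-row: {per_row:.3f}s  batched: {batched:.3f}s")
```

Against a networked warehouse like Redshift the gap is far larger than in this local sketch, because every unbatched statement pays a network round trip and a commit.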
Labels:
- Apache NiFi