Member since: 08-22-2018
Posts: 12
Kudos Received: 0
Solutions: 0
10-16-2018
06:47 PM
In our pipeline we use PutDatabaseRecord to insert records into the database, and we have observed a performance bottleneck there: flow files are queuing up in front of PutDatabaseRecord. Its Tasks/Time reads 167,281 / 01:46:52.678, and the DBCP connection pool service is configured with 45 connections. Is it recommended to place MergeContent before PutDatabaseRecord to batch flow files together? What will happen if any one of the flow files fails? Will the whole batch be rolled back? Please let me know. Thanks, Subbu
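To make the question concrete, here is a minimal JDBC sketch (my own illustration, not NiFi's internal code) of the batch semantics I am asking about; the PERSONS table and connection details are made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsertSketch {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//host:1521/service", "user", "pass")) {
            conn.setAutoCommit(false); // the batch runs inside one transaction
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO PERSONS (ID, NAME) VALUES (?, ?)")) {
                for (int i = 0; i < 1000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "name-" + i);
                    ps.addBatch();     // queue the row; no round trip yet
                }
                ps.executeBatch();     // one round trip for the whole batch
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();       // one bad row undoes the entire batch
                throw e;
            }
        }
    }
}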
Labels: Apache NiFi
10-08-2018
06:52 PM
Can you upload or share a sample workflow that uses PutDatabaseRecord to insert the flow file content into the database? Thanks, Subbu
10-06-2018
04:50 AM
Thanks, Matt, for your reply. The whole flow file content has to be inserted into the database. Is there a way to configure JsonTreeRecordReader to treat the whole flow file content as a single value in the prepared statement? Regards, Subbu
09-25-2018
06:19 AM
I have a data pipeline that consumes messages from Kafka and inserts them into an Oracle database. The messages from Kafka are in JSON format. If any error occurs during processing, the whole message (the flow file content) is inserted into an invalid-payload table. A trimmed version of the pipeline is GenerateFlowFile => ReplaceText => PutSQL (clob-insert-test.xml). The table structure is:

CREATE TABLE CLOB_TEST (
  "TRAN_ID" VARCHAR2(36 BYTE) NOT NULL ENABLE,
  "PAYLOAD" CLOB NOT NULL ENABLE,
  CONSTRAINT "CLOB_TEST" PRIMARY KEY ("TRAN_ID")
);

The payload is more than 4000 characters, so building the INSERT statement with ReplaceText fails with:

SQL Error: ORA-01704: string literal too long
01704. 00000 - "string literal too long"
*Cause: The string literal is longer than 4000 characters.
*Action: Use a string literal of at most 4000 characters.
         Longer values may only be entered using bind variables.

Is there any other option to bind the CLOB column and insert into the table? Any help will be greatly appreciated. Thanks. screen-shot-2018-09-24-at-111751-am.png
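For reference, a minimal JDBC sketch of what the error text suggests: binding the payload through a prepared statement instead of inlining it as a string literal (connection details are placeholders):

import java.io.StringReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.UUID;

public class ClobInsertSketch {
    public static void main(String[] args) throws Exception {
        String payload = args[0]; // whole flow file content, possibly > 4000 chars
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//host:1521/service", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO CLOB_TEST (TRAN_ID, PAYLOAD) VALUES (?, ?)")) {
            ps.setString(1, UUID.randomUUID().toString());
            // The value is streamed as a bind variable, so the 4000-character
            // string-literal limit (ORA-01704) never applies.
            ps.setCharacterStream(2, new StringReader(payload), payload.length());
            ps.executeUpdate();
        }
    }
}

PutSQL reaches the same bind-variable mechanism through the sql.args.N.type / sql.args.N.value flow file attributes.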
Labels: Apache NiFi
08-24-2018
06:59 PM
We use the JoltTransformJSON processor in our data pipeline. The Jolt specification contains two operations (shift and default): the shift operation translates JSON fields from the input message into database fields, and the default operation copies a flow file attribute into a database field. Performance was good when we had only the shift operation, but adding the default operation decreases it. The Transform Cache Size is set to 10000, yet we still see the performance issue.

consumeKafka -> JoltTransformJSON -> putDatabaseRecord

Jolt specification:

[{
  "operation": "shift",
  "spec": {
    "studentName": "STUDENT_NAME",
    "Age": "AGE",
    "address_city": "CITY",
    "address1": "ADDRESS1",
    "zipcode": "POSTLCODE",
    "id": "ID"
  }
}, {
  "operation": "default",
  "spec": {
    "PRTN_NBR": "${kafka.partition}"
  }
}]

Input message:

[{"studentName":"Foo2","Age":"12","address_city":"newyork","address1":"North avenue","zipcode":"123213","id":"103"}]

Please find attached a summary of Total Task Duration and FlowFiles in 5 min. Any suggestions or other alternatives? Thanks in advance.
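To isolate whether the spec itself is the bottleneck, one could micro-test it outside NiFi with the Jolt library; a sketch, assuming the com.bazaarvoice.jolt jolt-core and json-utils artifacts, with a literal standing in for ${kafka.partition} (NiFi evaluates that attribute before the transform runs) and a single record as input:

import com.bazaarvoice.jolt.Chainr;
import com.bazaarvoice.jolt.JsonUtils;

public class JoltSpecBench {
    public static void main(String[] args) {
        String spec = "[{\"operation\":\"shift\",\"spec\":{"
                + "\"studentName\":\"STUDENT_NAME\",\"Age\":\"AGE\","
                + "\"address_city\":\"CITY\",\"address1\":\"ADDRESS1\","
                + "\"zipcode\":\"POSTLCODE\",\"id\":\"ID\"}},"
                + "{\"operation\":\"default\",\"spec\":{\"PRTN_NBR\":\"0\"}}]";
        String input = "{\"studentName\":\"Foo2\",\"Age\":\"12\","
                + "\"address_city\":\"newyork\",\"address1\":\"North avenue\","
                + "\"zipcode\":\"123213\",\"id\":\"103\"}";

        // Compile the two-operation chain once, then time repeated transforms.
        Chainr chainr = Chainr.fromSpec(JsonUtils.jsonToList(spec));
        Object parsed = JsonUtils.jsonToObject(input);

        long start = System.nanoTime();
        Object out = null;
        for (int i = 0; i < 100_000; i++) {
            out = chainr.transform(parsed);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(JsonUtils.toJsonString(out));
        System.out.println("100k transforms in " + elapsedMs + " ms");
    }
}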
Labels: Apache NiFi
08-24-2018
05:30 AM
Thank you very much for the solution; it worked like a charm. I applied the DDL in Postgres:

CREATE SEQUENCE id_seq START 101;

The pipeline then populated the id column from the sequence's nextval:

[{"studentName":"Foo","Age":"12","address_city":"newyork","address1":"North avenue","zipcode":"123213","id":"101"},
 {"studentName":"Foo1","Age":"12","address_city":"newyork","address1":"North avenue","zipcode":"123213","id":"102"},
 {"studentName":"Foo2","Age":"12","address_city":"newyork","address1":"North avenue","zipcode":"123213","id":"103"}]

Thanks again!!!
08-22-2018
03:59 PM
Thanks for the suggestion. I will try the solution in the blog post and report back with my comments.
08-22-2018
03:55 PM
Thanks for the reply. The requirement for the data pipeline is guaranteed data delivery. nextInt() is not guaranteed to be unique across a cluster, which would result in a unique-constraint violation, so the solution may not be optimal for our use case. Thanks again for your suggestion.
08-22-2018
01:09 PM
I have a data pipeline that consumes messages from Kafka and inserts them into an Oracle database: consumeKafka -> JoltTransformJSON -> putDatabaseRecord. The Oracle table structure is:

CREATE TABLE Persons (
  ID NUMBER NOT NULL ENABLE,
  LastName varchar(255) NOT NULL,
  FirstName varchar(255),
  Age int,
  CONSTRAINT "Persons_PK" PRIMARY KEY ("ID")
);

To insert a new record into the Persons table, we have to use a sequence's nextval function, but the JSON payload does not contain a value for the ID column. Is there any option in putDatabaseRecord, or any other processor, to put seq_person.nextval into the insert statement?
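For reference, here is the statement I would like to see generated, sketched as plain JDBC (seq_person is a sequence I would create alongside the table; connection details are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SequenceInsertSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//host:1521/service", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO Persons (ID, LastName, FirstName, Age) "
                 + "VALUES (seq_person.nextval, ?, ?, ?)")) {
            ps.setString(1, "Foo");     // LastName
            ps.setString(2, "Bar");     // FirstName
            ps.setInt(3, 12);           // Age
            ps.executeUpdate();
        }
    }
}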
Labels: Apache NiFi