Created on 11-28-2024 04:49 AM - edited 11-28-2024 04:51 AM
I have a text file which needs to be fetched from a remote server through SFTP. The file will have 26k+ rows with pipe (|) separated values like below:
I have used the below processors to do the data manipulation, but I am facing a “Processing halted. Out of memory heap” error after SplitJson.
Please help with a better approach to handle this case.
Created 11-28-2024 06:22 AM
Hi,
It seems like you are running out of heap memory when adding new attributes through the EvaluateJsonPath processor. Attributes are stored in the heap, so you should be careful not to store large data in flowfile attributes when you are going to have that many flowfiles, in order to avoid running into this kind of issue.
Can you please elaborate on what you are trying to accomplish after converting Avro to JSON? To me it doesn't make sense, because you are merging towards the end, which means you might not even get the attribute you are extracting, depending on how you set the Attribute Strategy in the MergeRecord processor.
Created 11-30-2024 11:42 AM
Hi @Vikas-Nifi,
I think you can avoid a lot of overhead, such as writing the data to the DB just to do the transformation and assign the fixed widths (unless you actually need to store the data in the DB). You can use processors like QueryRecord and UpdateRecord to do the needed transformations in bulk instead of one record at a time and one field at a time. In QueryRecord you can use SQL-like functions based on the Apache Calcite SQL syntax to transform or derive new columns, just as if you were writing a MySQL query (see the side note after the example below). In UpdateRecord you can use NiFi RecordPath to traverse fields and apply functions in bulk rather than one record at a time. There is also a FreeFormTextRecordSetWriter service that you can use to write a custom output format. For example, in the following dataflow I'm using a ConvertRecord processor with a CSVReader and a FreeFormTextRecordSetWriter to produce the desired output:
The GenerateFlowFile processor is used to create the input CSV in a flowfile:
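Its Custom Text property holds plain comma-separated text along these lines (the values here are just an illustration that matches the output shown further down):

DATE,CARD_TYPE,CUST_NAME,PAYMENT_AMOUNT,PAYMENT_TYPE
2024-11-29,Visa,Test1,0.01,Credit Card
2024-11-29,Master,Test2,10.0,Credit Card
2024-11-29,American Express,Test3,500.0,Credit Card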
The ConvertRecord processor is configured as follows:
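Essentially it just points the record reader and writer at the two controller services mentioned above:

Record Reader: CSVReader
Record Writer: FreeFormTextRecordSetWriter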
For the CSVReader you can use the default configuration.
The FreeFormTextRecordSetWriter is configured as follows:
In the Text property you use the column/field names as listed in the input and provided to the reader. You can also use NiFi Expression Language to do proper formatting and transformation of the written data, as follows:
${DATE:replace('-',''):append(${CARD_TYPE}):padRight(28,' ')}${CUST_NAME:padRight(20,' ')}${PAYMENT_AMOUNT:padRight(10,' ')}${PAYMENT_TYPE:padRight(10,' ')}
This will produce the following output:
20241129Visa                Test1               0.01      Credit Card
20241129Master              Test2               10.0      Credit Card
20241129American Express    Test3               500.0     Credit Card
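As a side note, if you also need to derive or reshape columns in bulk before writing, you can put a QueryRecord processor in front of the ConvertRecord and use its Calcite-based SQL. A minimal sketch, assuming the same column names as the sample input above (PAYMENT_METHOD is just an illustrative alias, not something from your flow):

SELECT
    CUST_NAME,
    CARD_TYPE || ' ' || PAYMENT_TYPE AS PAYMENT_METHOD,
    PAYMENT_AMOUNT
FROM FLOWFILE

Each dynamic property you add to QueryRecord becomes its own outgoing relationship carrying the result of that query, so you can route the reshaped records straight into the ConvertRecord step.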
I know this is not 100% what you need, but it should give you an idea of what you need to do to get the desired output.
Hope that helps and if it does, please accept the solution.
Let me know if you have any other questions.
Thanks