Created 10-28-2016 01:24 PM
Hi
I have three tables in MySQL: Employee_Info, Address, and Phone. Employee_Info is the primary table; Address and Phone are related to it through foreign keys.
I want to create a single JSON file that contains the employee records, with the addresses and phones as arrays inside each record.
How do I do this? I can read the data using ExecuteSQL, but how do I access the separate flow files at the same time so I can merge them into one, and how do I then do the merging? I am not familiar with Jolt. Can the JoltTransformJSON processor be used somehow, or do I simply write a new custom processor?
Regards
Arsalan
Created 10-28-2016 05:00 PM
Do they need to be separate fetches? If you use a single ExecuteSQL with JOINs for the foreign keys, you can get a single result set (in Avro), then use ConvertAvroToJSON to convert to a single JSON object.
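One thing to keep in mind with the JOIN approach is that the result set is flat: each employee appears once per address/phone combination, so the rows still have to be grouped to get arrays inside each record. Below is a minimal sketch of that grouping in plain Python (for example, as the logic inside a scripting step). The column names `emp_id`, `name`, `address`, and `phone` are assumptions for illustration, not the actual schema.

```python
import json
from collections import OrderedDict


def nest_joined_rows(rows):
    """Group flat JOIN rows into one record per employee with address/phone arrays."""
    employees = OrderedDict()
    for row in rows:
        # Hypothetical column names; adjust to the real schema.
        emp = employees.setdefault(row["emp_id"], {
            "emp_id": row["emp_id"],
            "name": row["name"],
            "addresses": [],
            "phones": [],
        })
        # A JOIN across both child tables repeats each address once per phone,
        # so skip duplicates and NULLs (from LEFT JOINs with no match).
        if row.get("address") is not None and row["address"] not in emp["addresses"]:
            emp["addresses"].append(row["address"])
        if row.get("phone") is not None and row["phone"] not in emp["phones"]:
            emp["phones"].append(row["phone"])
    return list(employees.values())


# Sample rows as they might look after converting the joined result to JSON.
rows = [
    {"emp_id": 1, "name": "Ann", "address": "1 Main St", "phone": "555-0100"},
    {"emp_id": 1, "name": "Ann", "address": "1 Main St", "phone": "555-0101"},
    {"emp_id": 2, "name": "Bob", "address": "9 Oak Ave", "phone": None},
]
nested = nest_joined_rows(rows)
print(json.dumps(nested, indent=2))
```

The duplicate check is a simple list lookup, which is fine for a handful of addresses and phones per employee; a set keyed on the child table's primary key would be the safer choice if two distinct rows can hold identical values.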
If they must be in different flow files, there is currently no "MergeJSON" processor, although that would be a great contribution if you're interested in writing a full processor. An alternative is to use ExecuteScript or InvokeScriptedProcessor. In either case, keep in mind that NiFi employs a flow-based paradigm, so merging arbitrary incoming flow files can be tricky. This is done in some splitting processors (such as SplitText) by setting "fragment.id", "fragment.count", and "fragment.index" attributes on the flow files, so a downstream "merging" processor can handle these micro-batches by merging together all files with the same fragment.id. I've got an example of this kind of merging processor as a pull request for NIFI-2735. If you're using a scripting processor and just want to solve this one specific issue, you could assume that you will only get 3 incoming flow files and merge them accordingly. This is fragile but could work for your use case.
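To illustrate the fragment-based idea, here is a rough sketch in plain Python of how a merging step could buffer flow files until every fragment with the same fragment.id has arrived. It models a flow file as an attribute dict plus a JSON string and does not use the NiFi API; the `FragmentMerger` class and its `offer` method are hypothetical names, and it assumes the three flow files were tagged with the fragment attributes somewhere upstream.

```python
import json
from collections import defaultdict


class FragmentMerger:
    """Buffers JSON fragments and merges them once a fragment set is complete."""

    def __init__(self):
        # fragment.id -> {fragment.index: parsed JSON object}
        self._pending = defaultdict(dict)

    def offer(self, attributes, content):
        """Accept one fragment; return merged JSON text when the set is complete, else None."""
        frag_id = attributes["fragment.id"]
        index = int(attributes["fragment.index"])
        count = int(attributes["fragment.count"])

        bucket = self._pending[frag_id]
        bucket[index] = json.loads(content)
        if len(bucket) < count:
            return None  # still waiting for the remaining fragments

        del self._pending[frag_id]
        merged = {}
        # Combine in fragment.index order; later fragments win on key clashes.
        for i in sorted(bucket):
            merged.update(bucket[i])
        return json.dumps(merged)


merger = FragmentMerger()
base = {"fragment.id": "emp-1", "fragment.count": "3"}
first = merger.offer(dict(base, **{"fragment.index": "0"}),
                     '{"emp_id": 1, "name": "Ann"}')
second = merger.offer(dict(base, **{"fragment.index": "2"}),
                      '{"phones": ["555-0100"]}')
third = merger.offer(dict(base, **{"fragment.index": "1"}),
                     '{"addresses": ["1 Main St"]}')
print(third)
```

Fragments may arrive in any order, which is why the sketch keys on fragment.index instead of arrival order. A real processor would also need a timeout or eviction policy for fragment sets that never complete.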
Created 10-28-2016 05:14 PM
Could something be done with JoltTransformJSON?
Or could you augment the one main flow with an attribute update that reads from DistributedMapCache, if it's a small lookup: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.PutDistributed...
I like the idea of a MergeJSON processor, though.