Member since
08-21-2024
5
Posts
2
Kudos Received
0
Solutions
09-18-2024
04:39 PM
1 Kudo
Hi, let's start with the last thing you said, because I'm curious how one record ends up overriding the other. I assume you are using ExcelReader, correct? If so, I tried creating an Excel file with two sheets that share the same schema, and both sheets have the same records, as follows: Sheet1: Address1, Sheet2: Address2. I read the file using the FetchFile processor, then passed the content to a ConvertRecord processor where the reader is ExcelReader and the writer is JsonRecordSetWriter, configured as follows. ExcelReader: I'm passing an Avro schema to assign the proper field names and types: {
"namespace": "nifi",
"name": "user",
"type": "record",
"fields": [
{ "name": "ID", "type": "int" },
{ "name": "NAME", "type": "string" },
{ "name": "ADDRESS", "type": "string" }
]
} For JsonRecordSetWriter I used the default settings with no changes. Here is how my output looked, which accounts for all the records from both sheets, including the duplicate (1, sam, TX): [ {
"ID" : 1,
"NAME" : "sam",
"ADDRESS" : "TX"
}, {
"ID" : 2,
"NAME" : "Ali",
"ADDRESS" : "WA"
}, {
"ID" : 1,
"NAME" : "sam",
"ADDRESS" : "TX"
}, {
"ID" : 2,
"NAME" : "Ali",
"ADDRESS" : "FL"
} ] So I'm curious what is happening in your case that causes the record to be overwritten. Maybe if we figure out this problem we can solve it, as you said, by using JOLT.

I think Fork/Join Enrichment will still work, especially since you are reading the info you are trying to merge from the same file, but the flow is going to be different and you might need two ExcelReaders, one for each sheet, which means you need to read the Excel file twice. At a high level, the flow will look like the following: You can avoid having two ExcelReaders by passing the sheet info as a FlowFile attribute. However, you still need to read the file twice, once for each sheet, and then join the info using whichever joining strategy works best for you. Hope that helps.
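To make the difference between "all records kept" and "one record overrides the other" concrete, here is a minimal Python sketch. It is not NiFi code and does not model ExcelReader or the enrichment processors. It assumes the first two output records above came from Sheet1 and the last two from Sheet2, and the function names and the `ADDRESS1`/`ADDRESS2` field names are hypothetical, chosen only for illustration.

```python
# Rows per sheet, assuming the output order above is Sheet1 followed by Sheet2.
sheet1 = [
    {"ID": 1, "NAME": "sam", "ADDRESS": "TX"},
    {"ID": 2, "NAME": "Ali", "ADDRESS": "WA"},
]
sheet2 = [
    {"ID": 1, "NAME": "sam", "ADDRESS": "TX"},
    {"ID": 2, "NAME": "Ali", "ADDRESS": "FL"},
]


def append_all(*sheets):
    """Concatenate rows from every sheet; nothing is dropped, duplicates included."""
    return [row for sheet in sheets for row in sheet]


def keyed_overwrite(*sheets, key="ID"):
    """Index rows by key; a later row with the same key replaces the earlier one."""
    merged = {}
    for sheet in sheets:
        for row in sheet:
            merged[row[key]] = row
    return list(merged.values())


def join_addresses(left, right, key="ID"):
    """Join two sheets by key, keeping each sheet's address in its own field."""
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key], {})
        joined.append({
            "ID": row["ID"],
            "NAME": row["NAME"],
            "ADDRESS1": row["ADDRESS"],          # hypothetical output field
            "ADDRESS2": match.get("ADDRESS"),    # None when the right sheet has no match
        })
    return joined


if __name__ == "__main__":
    print(len(append_all(sheet1, sheet2)))   # four rows, as in the ConvertRecord output
    print(keyed_overwrite(sheet1, sheet2))   # two rows; Ali's WA address is lost
    print(join_addresses(sheet1, sheet2))    # two rows; both addresses preserved
```

If records are being overwritten, something in the flow is probably behaving like `keyed_overwrite`, whereas a join keyed on ID is closer to what a merge of the two sheets needs.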
09-18-2024
03:27 PM
1 Kudo
@Crags You cannot have both of your NiFi-Registries linked to the same Git repository. NiFi-Registry only pushes to the Git repository. The only time NiFi-Registry would ever read from the Git repository is on startup. So if you used two NiFi-Registries and were committing changes from both, you could cause issues with what is getting committed to your Git repo.

What is more common is to have a single NiFi-Registry that is used by multiple NiFi deployments. QA NiFi builds some flow, and when that flow is ready for production, it is committed to the NiFi-Registry. That flow can then be imported from that single NiFi-Registry to the canvas of your Prod NiFi. Now both NiFi instances are tracking the same flow in the same registry. You then start making local changes to that same version-controlled Process Group (PG) in your QA NiFi. The PG will indicate that you have local changes. You then have a couple of choices for how to use your shared NiFi-Registry:

1. Wait until you have finished making and testing all your changes in QA before committing the next version to the shared registry. At that point your Prod NiFi PG will indicate that a newer version is available in the shared NiFi-Registry. You can then update your Prod to that new version.

2. Incrementally commit updated versions of the PG to the shared registry. Your Prod will show a new version available each time, so you will want a process for deciding which versions are Prod ready, to control when Prod is actually changed to a new version.

About the UUID linkage... Your NiFi can have one or more defined registry clients, and each of those registry clients gets an assigned UUID on the NiFi instance (it will not be the same UUID on every NiFi that sets up the same registry client). NiFi stores everything on the canvas locally in the flow.json.gz file so it can be reloaded into the NiFi heap on startup. When you start version control on a PG, the flow (which gets a UUID) is added to a NiFi-Registry bucket (which has a UUID).
Locally on the NiFi, within the flow.json.gz, there is now a reference to a specific NiFi-Registry client (by its UUID), a specific bucket (by its UUID), and a specific flow (by its UUID). Now consider the scenario of a shared NiFi-Registry: the registry client on each NiFi will have a different UUID even though it connects to the same shared NiFi-Registry. So, using the registry client, you import that flow from NiFi-Registry to the NiFi canvas. Every component created from the imported flow will get assigned UUIDs (they will not match the UUIDs assigned on the other NiFi). Those differences in UUIDs are not tracked as changes. This is why, if you stop version control, you can't start version control again and connect it back to an existing flow stored in NiFi-Registry. You also can't delete the registry client and re-create it, as it too would get a different UUID (NiFi blocks removing a registry client if any PGs are currently using it for version control, for this reason).

---------

Another option is to have a separate NiFi-Registry for each environment. When you are ready to move a flow from NiFi-Registry 1 (QA) to NiFi-Registry 2 (Prod), go into your QA NiFi-Registry, locate the flow, and from the "actions" menu select export version, then select the version you want to export. You can then go to your Prod NiFi-Registry and "import new flow". Once it is imported, you can go to your Prod NiFi and load that flow onto the canvas. Later, when you are ready in QA with a new version to push to Prod, you can again export the Prod-ready version. On the Prod NiFi-Registry, you can select the existing flow and from the "actions" menu select "import new version". This will allow you to add this flow as the next version in Prod. After doing so, the version-controlled PG(s) on your Prod NiFi tracking against that flow will report that a new version is available. This second option gives you better control over which changes make it to your Prod deployment.
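The export/import promotion between two registries can be pictured with a small in-memory model. This is only an illustrative sketch: the `Registry` class, its method names, and the flow name `etl` are hypothetical and are not the NiFi-Registry API or its REST endpoints. It assumes version numbers start at 1 and that each registry numbers its own versions independently.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy stand-in for one NiFi-Registry: flow name -> ordered list of snapshots."""
    name: str
    flows: dict = field(default_factory=dict)

    def commit(self, flow, content):
        """Store a snapshot as the next version and return its version number."""
        versions = self.flows.setdefault(flow, [])
        versions.append(content)
        return len(versions)

    def export_version(self, flow, version):
        """Return the snapshot stored under the given version number."""
        return self.flows[flow][version - 1]

    def import_new_flow(self, flow, content):
        """First import into this registry; the flow must not exist yet."""
        if flow in self.flows:
            raise ValueError(f"{flow} already exists in {self.name}")
        return self.commit(flow, content)

    def import_new_version(self, flow, content):
        """Later imports; the snapshot becomes the next version of an existing flow."""
        if flow not in self.flows:
            raise KeyError(f"{flow} not found in {self.name}")
        return self.commit(flow, content)

    def latest(self, flow):
        """Highest version number currently stored for the flow."""
        return len(self.flows[flow])


if __name__ == "__main__":
    qa, prod = Registry("QA"), Registry("Prod")
    for snapshot in ("draft", "tested", "tweak"):
        qa.commit("etl", snapshot)

    # Promote QA version 2 as a brand-new flow in Prod.
    tracked = prod.import_new_flow("etl", qa.export_version("etl", 2))

    # Later, promote a newer Prod-ready QA version as the next Prod version.
    qa.commit("etl", "release")
    prod.import_new_version("etl", qa.export_version("etl", 4))

    # A Prod PG still on the first imported version now has an update available.
    print(prod.latest("etl") > tracked)
```

The point of the model is that Prod only ever sees the versions someone chose to export, so in-progress QA commits never show up as "new version available" in Prod.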
You could also script REST API calls to automate these steps if you wanted.

------

Please help our community thrive. If you found that any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
08-22-2024
12:46 AM
Hey @SAMSAL - thanks for the detailed reply, and yep you hit the nail on the head ha ha. I've used a few of those links already, and as you said I was close, but not quite there. I grasp the concept, it's just the syntax of everything that was throwing me off. That being said, the JSLT example you posted does seem to make more sense to me also, so that's definitely something I'll look into. I spoke to a guy called Paul Lakus in the NiFi slack channel and he basically gave me almost exactly the same solution that you have posted and so I managed to work out how it was being done, but still a huge learning curve for me heh. Anyway this now seems to be working as intended, which I'm extremely grateful for so thank you for your input and the steer towards JSLT - definitely something to look into. Thanks!!