Created 02-01-2023 03:49 PM
I am using a ForkEnrichment - JoinEnrichment combo to enrich some data. Within the JoinEnrichment processor, the "wrapper" and "sql" options for the "Join Strategy" both work fine. However, the "insert enrichment fields" option doesn't. I prefer this option because I want my enrichment data to be a nested dataset off one of the root fields. The field exists in the original data, and is properly referenced in the "Insertion Record Path" field.
So... I'm scratching my head trying to figure out why this isn't working.
Has anyone experienced this, or does anyone have any ideas?
Created 02-01-2023 07:42 PM
Hi,
Can you provide sample data along with the expected output? You have to be careful with the wrapper and insertion strategies, because records are matched by their index in the two FlowFiles. That means the data has to be in the same order on both sides if you are processing multiple records at a time.
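To illustrate the ordering concern, here is a minimal Python sketch (not NiFi code) that contrasts pairing records by position with pairing them by key. The second record and its values are made up for illustration, and the function names `join_by_index` and `join_by_key` are hypothetical.

```python
# First record uses the keys from this thread; the second one is invented for illustration.
originals = [
    {"hex": "a39fb1", "flight": "UAL798"},
    {"hex": "a00001", "flight": "EXAMPLE1"},
]
# Enrichment records arrive in a different order than the originals.
enrichments = [
    {"icao24": "a00001", "model": "EXAMPLE-MODEL"},
    {"icao24": "a39fb1", "model": "737-824"},
]


def join_by_index(originals, enrichments):
    """Pair records purely by position, the way an index-based join would."""
    return [dict(o, aircraft=e) for o, e in zip(originals, enrichments)]


def join_by_key(originals, enrichments):
    """Pair records by matching hex to icao24, regardless of order."""
    lookup = {e["icao24"]: e for e in enrichments}
    return [dict(o, aircraft=lookup.get(o["hex"])) for o in originals]


by_index = join_by_index(originals, enrichments)
by_key = join_by_key(originals, enrichments)

# Print each original key next to the key of the enrichment it was paired with.
for label, joined in (("index", by_index), ("key", by_key)):
    for record in joined:
        print(label, record["hex"], record["aircraft"]["icao24"])
```

With the enrichment records out of order, the positional join pairs "a39fb1" with the "a00001" aircraft, while the keyed join pairs each flight with its own aircraft.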
Created 02-02-2023 09:12 AM
Hi @SAMSAL ,
Ugh! That is a little disconcerting to hear about the "wrapper" and "insertion" strategies. For now I have resorted to using the "wrapper" strategy, followed by a Jolt transform to get the data into the format I want. However, if you think SQL is a more bulletproof option, I'm open to that, provided you know how to write a SQL query that uses RecordPath to put the data into the structure I need.
Here is an example dataset, which gives me the exact same outcome. The incoming flow file is OpenSky flight data, and the primary key is "hex". Using an UpdateRecord processor, I add the field
"aircraft": "placeholder"
because the JoinEnrichment processor requires that the field already exist before inserting the enrichment data.
The second dataset is my enrichment data, which is a record in a large dataset called UnitedAirlinesFleet, and "icao24" is the foreign key. So "hex" and "icao24" are the two keys.
I want to do something very simple, which is to insert the enrichment data into flight data as a record where the field name is "aircraft", as shown in the third dataset below. I believe this is precisely what the insertion strategy is designed to do.
I really appreciate you SAMSAL and your expertise. Your suggestion to use the ForkEnrichment and JoinEnrichment was really helpful. I subsequently found a Cloudera video on this topic and that also helped a lot with how to structure the flow. Thank you.
{
  "hex" : "a39fb1",
  "flight" : "UAL798 ",
  "alt_baro" : 23000,
  "alt_geom" : 23200,
  "gs" : 415.2,
  "track" : 275.9,
  "baro_rate" : 2432,
  "squawk" : "1467",
  "emergency" : "none",
  "category" : "A3",
  "nav_qnh" : 1013.6,
  "nav_altitude_mcp" : 26016,
  "nav_heading" : 270.0,
  "lat" : 39.969681,
  "lon" : -105.610657,
  "nic" : 8,
  "rc" : 186,
  "seen_pos" : 1.1,
  "version" : 2,
  "nic_baro" : 1,
  "nac_p" : 9,
  "nac_v" : 1,
  "sil" : 3,
  "sil_type" : "perhour",
  "gva" : 2,
  "sda" : 2,
  "mlat" : [ ],
  "tisb" : [ ],
  "messages" : 876,
  "seen" : 1.1,
  "rssi" : -1.8,
  "aircraft" : "placeholder"
}
{
  "icao24" : "a39fb1",
  "registration" : "N33203",
  "manufacturericao" : "BOEING",
  "manufacturername" : "Boeing",
  "model" : "737-824",
  "typecode" : "B738",
  "serialnumber" : 30613,
  "linenumber" : null,
  "icaoaircrafttype" : "L2J",
  "operator" : null,
  "operatorcallsign" : "UNITED",
  "operatoricao" : "UAL",
  "operatoriata" : null,
  "owner" : "Wells Fargo Trust Co Na Trustee",
  "testreg" : null,
  "registered" : null,
  "reguntil" : "2023-07-31",
  "status" : null,
  "built" : "2000-01-01",
  "firstflightdate" : null,
  "seatconfiguration" : null,
  "engines" : "CFM INTL. CFM56 SERIES",
  "modes" : false,
  "adsb" : false,
  "acars" : false,
  "notes" : null,
  "categoryDescription" : "No ADS-B Emitter Category Information"
}
{
  "hex" : "a39fb1",
  "flight" : "UAL798 ",
  "alt_baro" : 23000,
  "alt_geom" : 23200,
  "gs" : 415.2,
  "track" : 275.9,
  "baro_rate" : 2432,
  "squawk" : "1467",
  "emergency" : "none",
  "category" : "A3",
  "nav_qnh" : 1013.6,
  "nav_altitude_mcp" : 26016,
  "nav_heading" : 270.0,
  "lat" : 39.969681,
  "lon" : -105.610657,
  "nic" : 8,
  "rc" : 186,
  "seen_pos" : 1.1,
  "version" : 2,
  "nic_baro" : 1,
  "nac_p" : 9,
  "nac_v" : 1,
  "sil" : 3,
  "sil_type" : "perhour",
  "gva" : 2,
  "sda" : 2,
  "mlat" : [ ],
  "tisb" : [ ],
  "messages" : 876,
  "seen" : 1.1,
  "rssi" : -1.8,
  "aircraft" : {
    "icao24" : "a39fb1",
    "registration" : "N33203",
    "manufacturericao" : "BOEING",
    "manufacturername" : "Boeing",
    "model" : "737-824",
    "typecode" : "B738",
    "serialnumber" : 30613,
    "linenumber" : null,
    "icaoaircrafttype" : "L2J",
    "operator" : null,
    "operatorcallsign" : "UNITED",
    "operatoricao" : "UAL",
    "operatoriata" : null,
    "owner" : "Wells Fargo Trust Co Na Trustee",
    "testreg" : null,
    "registered" : null,
    "reguntil" : "2023-07-31",
    "status" : null,
    "built" : "2000-01-01",
    "firstflightdate" : null,
    "seatconfiguration" : null,
    "engines" : "CFM INTL. CFM56 SERIES",
    "modes" : false,
    "adsb" : false,
    "acars" : false,
    "notes" : null,
    "categoryDescription" : "No ADS-B Emitter Category Information"
  }
}
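For anyone weighing the SQL route, here is a rough Python sketch of the idea, using SQLite purely as a stand-in. It is not NiFi's SQL engine, and the dialect inside JoinEnrichment may differ. As far as I understand the JoinEnrichment documentation, the SQL strategy exposes the two FlowFiles as tables named original and enrichment, so the sketch uses those names. Only a few fields from the records above are kept, and the nesting step is plain Python standing in for whatever follow-up transform (Jolt, for example) would do it in the flow.

```python
import sqlite3

# Trimmed copies of the two records from the post (only a few fields kept).
original = {"hex": "a39fb1", "flight": "UAL798 ", "alt_baro": 23000}
enrichment = {"icao24": "a39fb1", "registration": "N33203", "model": "737-824"}

# SQLite is only a stand-in here; the table names mirror the assumed NiFi names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE original (hex TEXT, flight TEXT, alt_baro INTEGER)")
conn.execute("CREATE TABLE enrichment (icao24 TEXT, registration TEXT, model TEXT)")
conn.execute("INSERT INTO original VALUES (:hex, :flight, :alt_baro)", original)
conn.execute(
    "INSERT INTO enrichment VALUES (:icao24, :registration, :model)", enrichment
)

# Join on the keys (hex = icao24) rather than on record position.
rows = conn.execute(
    "SELECT o.hex, o.flight, o.alt_baro, e.icao24, e.registration, e.model "
    "FROM original o LEFT OUTER JOIN enrichment e ON o.hex = e.icao24"
).fetchall()
conn.close()

# The SQL result is flat, so nest the enrichment columns under "aircraft" afterwards.
result = [
    {
        "hex": hex_code,
        "flight": flight,
        "alt_baro": alt_baro,
        "aircraft": {"icao24": icao24, "registration": registration, "model": model},
    }
    for hex_code, flight, alt_baro, icao24, registration, model in rows
]

print(result[0])
```

Because the join condition uses the keys, the order of records in either dataset no longer matters. With the LEFT OUTER JOIN, a flight with no matching aircraft would still come through, with null values in the nested fields.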