Member since: 05-20-2022
Posts: 40
Kudos Received: 4
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 602 | 10-17-2022 11:48 AM |
 | 805 | 08-03-2022 10:13 AM |
02-02-2023
07:03 PM
It seems this question gets asked a lot in this forum, but there aren't any good responses, so hopefully somebody can help me out here. I am attempting to use the InvokeHTTP processor to POST a query to a REST endpoint. I currently use the InvokeHTTP processor to access data on this data service using the GET method, and that always works fine. However, there is a specific service endpoint that only accepts POST requests. I've verified this service is functional using curl, but when I attempt to use the NiFi InvokeHTTP processor, I get no response. Below are some screenshots of my processor configuration. If I understand the documentation correctly, I need to specify in the "Attributes to Send" field the flowfile attributes I want to send as parameters. Since this processor is the first processor in the flow, it doesn't have any attributes yet, so I add a dynamic attribute named "data" (shown at the bottom). Then I reference that attribute in the "Attributes to Send" field. This seems correct per the documentation, but it doesn't work. In fact, I get no response at all, not even a 400-series response. What am I missing here? This seems like it should be really simple, yet it is bedeviling me.
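One thing that may be worth checking, as far as I understand the InvokeHTTP documentation: "Attributes to Send" forwards matching attributes as HTTP request headers, while the body of a POST is taken from the incoming flow file content. The sketch below is plain Python rather than NiFi, with a hypothetical URL and payload, and it only builds the two requests (nothing is sent) to show where the data ends up in each case.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and query value; neither comes from the original post.
URL = "https://example.com/service/query"
payload = urlencode({"data": "some query"}).encode("utf-8")

# Data passed as a header: the server receives a POST with no body at all.
as_header = Request(URL, method="POST", headers={"data": "some query"})

# Data passed as the body: roughly what `curl -d` does for a form POST.
as_body = Request(
    URL,
    data=payload,
    method="POST",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print("header variant body:", as_header.data)
print("body variant body:  ", as_body.data)
```

If the endpoint reads its query from the request body, the first variant would look like an empty POST to the server, which could be one explanation for the missing response.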
Labels: Apache NiFi
02-02-2023
09:12 AM
Hi @SAMSAL, Ugh! That is a little disconcerting to hear about the "wrapper" and "insertion" strategies. I resorted to using the "wrapper" strategy, followed by a Jolt transform to get the data into the format I want. However, if you think using SQL is a more bulletproof option, then I'm open to that if you know how to write SQL that uses RecordPath to put the data into the structure I need. Here is an example dataset, which gives me the exact same outcome. The incoming flow file is OpenSky flight data, and the primary key is "hex". Using an UpdateRecord processor, I add the field "aircraft": "placeholder" because the JoinEnrichment processor requires that the field already exist before inserting the enrichment data. The second dataset is my enrichment data, which is a record in a large dataset called UnitedAirlinesFleet, and "icao24" is the foreign key. So "hex" and "icao24" are the two keys. I want to do something very simple, which is to insert the enrichment data into the flight data as a record where the field name is "aircraft", as shown in the third dataset below. I believe this is precisely what the insertion strategy is designed to do. I really appreciate you, SAMSAL, and your expertise. Your suggestion to use ForkEnrichment and JoinEnrichment was really helpful. I subsequently found a Cloudera video on this topic, and that also helped a lot with how to structure the flow. Thank you. {
"hex" : "a39fb1",
"flight" : "UAL798 ",
"alt_baro" : 23000,
"alt_geom" : 23200,
"gs" : 415.2,
"track" : 275.9,
"baro_rate" : 2432,
"squawk" : "1467",
"emergency" : "none",
"category" : "A3",
"nav_qnh" : 1013.6,
"nav_altitude_mcp" : 26016,
"nav_heading" : 270.0,
"lat" : 39.969681,
"lon" : -105.610657,
"nic" : 8,
"rc" : 186,
"seen_pos" : 1.1,
"version" : 2,
"nic_baro" : 1,
"nac_p" : 9,
"nac_v" : 1,
"sil" : 3,
"sil_type" : "perhour",
"gva" : 2,
"sda" : 2,
"mlat" : [ ],
"tisb" : [ ],
"messages" : 876,
"seen" : 1.1,
"rssi" : -1.8,
"aircraft" : "placeholder"
} {
"icao24" : "a39fb1",
"registration" : "N33203",
"manufacturericao" : "BOEING",
"manufacturername" : "Boeing",
"model" : "737-824",
"typecode" : "B738",
"serialnumber" : 30613,
"linenumber" : null,
"icaoaircrafttype" : "L2J",
"operator" : null,
"operatorcallsign" : "UNITED",
"operatoricao" : "UAL",
"operatoriata" : null,
"owner" : "Wells Fargo Trust Co Na Trustee",
"testreg" : null,
"registered" : null,
"reguntil" : "2023-07-31",
"status" : null,
"built" : "2000-01-01",
"firstflightdate" : null,
"seatconfiguration" : null,
"engines" : "CFM INTL. CFM56 SERIES",
"modes" : false,
"adsb" : false,
"acars" : false,
"notes" : null,
"categoryDescription" : "No ADS-B Emitter Category Information"
} {
"hex" : "a39fb1",
"flight" : "UAL798 ",
"alt_baro" : 23000,
"alt_geom" : 23200,
"gs" : 415.2,
"track" : 275.9,
"baro_rate" : 2432,
"squawk" : "1467",
"emergency" : "none",
"category" : "A3",
"nav_qnh" : 1013.6,
"nav_altitude_mcp" : 26016,
"nav_heading" : 270.0,
"lat" : 39.969681,
"lon" : -105.610657,
"nic" : 8,
"rc" : 186,
"seen_pos" : 1.1,
"version" : 2,
"nic_baro" : 1,
"nac_p" : 9,
"nac_v" : 1,
"sil" : 3,
"sil_type" : "perhour",
"gva" : 2,
"sda" : 2,
"mlat" : [ ],
"tisb" : [ ],
"messages" : 876,
"seen" : 1.1,
"rssi" : -1.8,
"aircraft" : {
"icao24" : "a39fb1",
"registration" : "N33203",
"manufacturericao" : "BOEING",
"manufacturername" : "Boeing",
"model" : "737-824",
"typecode" : "B738",
"serialnumber" : 30613,
"linenumber" : null,
"icaoaircrafttype" : "L2J",
"operator" : null,
"operatorcallsign" : "UNITED",
"operatoricao" : "UAL",
"operatoriata" : null,
"owner" : "Wells Fargo Trust Co Na Trustee",
"testreg" : null,
"registered" : null,
"reguntil" : "2023-07-31",
"status" : null,
"built" : "2000-01-01",
"firstflightdate" : null,
"seatconfiguration" : null,
"engines" : "CFM INTL. CFM56 SERIES",
"modes" : false,
"adsb" : false,
"acars" : false,
"notes" : null,
"categoryDescription" : "No ADS-B Emitter Category Information"
}
}
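To make the intended result concrete, here is a minimal Python sketch of the join described above: match "hex" against "icao24" and nest the matching fleet record under "aircraft". It only illustrates the desired output and is not how JoinEnrichment works internally; the function name and the trimmed sample records are assumptions for illustration.

```python
def insert_enrichment(flight, fleet, field="aircraft"):
    """Return a copy of `flight` with the matching fleet record nested under `field`."""
    # Index the enrichment records by their key ("icao24").
    by_icao24 = {record["icao24"]: record for record in fleet}
    enriched = dict(flight)
    # Replace the placeholder with the matching record (None if there is no match).
    enriched[field] = by_icao24.get(flight["hex"])
    return enriched


# Trimmed versions of the datasets shown above.
flight = {"hex": "a39fb1", "flight": "UAL798 ", "aircraft": "placeholder"}
fleet = [{"icao24": "a39fb1", "registration": "N33203", "model": "737-824"}]

result = insert_enrichment(flight, fleet)
print(result)
```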
02-01-2023
03:49 PM
I am using a ForkEnrichment - JoinEnrichment combo to enrich some data. Within the JoinEnrichment processor, the "wrapper" and "sql" options for the "Join Strategy" both work fine. However, the "insert enrichment fields" option doesn't. I prefer this option because I want my enrichment data to be a nested dataset off one of the root fields. The field exists in the original data, and is properly referenced in the "Insertion Record Path" field. So... I'm scratching my head trying to figure out why this isn't working. Anyone experience this or have any ideas?
Tags: JoinEnrichment, NiFi
Labels: Apache NiFi
01-31-2023
11:01 AM
Thank you @SAMSAL. I will investigate these processors; I had no idea they existed. Thank you!
01-31-2023
09:56 AM
Great questions, @SAMSAL. I would like to do something like storing the JSON in a parameter context variable and then using a Lookup service to retrieve the corresponding record. I think of it as an in-memory table that I can use to perform inner joins with flow files. Alternatively, I could read the table from a remote source (using InvokeHTTP) and then load it into a DistributedMapCacheLookupService, but I'm not familiar with this approach, so I'd have to do some research. I appreciate your time. Thank you.
01-30-2023
09:59 PM
I have a flow file that I want to enrich with the contents of a list of JSON values based on the "id"; basically, perform an inner join of the flow file with this look-up dataset. However, I need the look-up dataset to be in memory, so that I can query it using the "id" and append the results to the current flow file. Here is an example: Incoming flow file: {
"id": "abc123",
"fname": "The",
"lname": "Rock"
} Contents of the look-up data set: [
{
"id": "abc123",
"dob": "03/09/1977",
"phone": "987-654-0001"
},
{
"id": "def765",
"dob": "04/08/1976",
"phone": "789-654-0001"
},
{
"id": "hij765",
"dob": "05/06/1975",
"phone": "456-654-0001"
}
] Enriched flow file: {
"id": "abc123",
"fname": "The",
"lname": "Rock",
"dob": "03/09/1977",
"phone": "987-654-0001"
} I need to be able to look up the correct record in the look-up dataset based on the "id", then append the values to the current flow file. The key here is that the look-up dataset must reside in memory (it can't be a file or a database). Thanks for reading; I look forward to hearing your ideas.
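The join itself can be sketched in a few lines of Python, using the example data above. This is only an illustration of the desired behavior (an in-memory table keyed by "id" and an inner join against it), not a NiFi configuration; the `enrich` helper is a hypothetical name.

```python
# The look-up dataset from the example, held in memory.
lookup_rows = [
    {"id": "abc123", "dob": "03/09/1977", "phone": "987-654-0001"},
    {"id": "def765", "dob": "04/08/1976", "phone": "789-654-0001"},
    {"id": "hij765", "dob": "05/06/1975", "phone": "456-654-0001"},
]

# Index once by "id" so each flow file needs only a dictionary access.
lookup = {row["id"]: row for row in lookup_rows}


def enrich(record, table):
    """Inner join: merge the matching row into the record, or return None."""
    match = table.get(record["id"])
    if match is None:
        return None  # no match, so an inner join drops the record
    return {**record, **match}


enriched = enrich({"id": "abc123", "fname": "The", "lname": "Rock"}, lookup)
print(enriched)
```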
Labels: Apache NiFi
10-17-2022
11:48 AM
@MarioFRS Ah, you are correct. I overlooked that additional nested Product. That complicates things because the upper Product can be an array or not an array, and it contains another ProductDetails.Product which also may or may not be an array. Ugh! To solve this I had to resort to chaining multiple transforms to account for all the possible formats that can exist. I may not have all the possible permutations you'll have, but you can follow the basic structure of these transforms and create new ones with additional levels if necessary. It is important to note that the inner-most portions of the nested data need to be transformed first, so the order of the transforms starts with the inner-most data and ends with the outer-most data. Keep that in mind if you need to add more transforms to account for additional permutations of the data. Give this a try and let me know how it works for you. [
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"Product": "MANY"
}
}
}
}
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"*": {
"*": {
"*": {
"*": {
"Product": "MANY"
}
}
}
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"*": {
"*": {
"Product": "MANY"
}
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"*": {
"Product": "MANY"
}
}
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"ClassSummary": "MANY"
}
}
}
},
{
"operation": "cardinality",
"spec": {
"*": {
"ClassSummary": "MANY"
}
}
}
]
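For anyone who wants to check the expected result outside of NiFi, the effect of the chain above can be approximated in Python. This is not Jolt and does not follow the wildcard depths of the spec; it is a rough sketch that wraps every "Product" and "ClassSummary" value in a list wherever it appears, handling the inner-most data first. The sample input is a made-up, heavily trimmed structure.

```python
def force_arrays(node, keys=("Product", "ClassSummary")):
    """Wrap the values of the given keys in a list, at any nesting depth."""
    if isinstance(node, list):
        return [force_arrays(item, keys) for item in node]
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            # Recurse first so the inner-most data is transformed first.
            value = force_arrays(value, keys)
            if key in keys and not isinstance(value, list):
                value = [value]
            out[key] = value
        return out
    return node


# Hypothetical sample: a Product that contains another ProductDetails.Product.
sample = {
    "Classes": {
        "ClassSummary": {
            "ProductDetails": {
                "Product": {"id": 1, "ProductDetails": {"Product": {"id": 2}}}
            }
        }
    }
}

result = force_arrays(sample)
print(result)
```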
10-14-2022
04:23 PM
Hey @MarioFRS, just following up to make sure things are good to go. If you are all set, please accept the solution so it can help others in the future.
10-13-2022
02:16 PM
Based on this screenshot, it appears you are using the wrong Jolt transform. Try using the newest version that I sent earlier this morning. I've also attached screenshots which show the Jolt transform works with your data. I trimmed your data a little for brevity, but the structure is still the same.
10-13-2022
12:26 PM
Hmm. I tried all the use cases (single object, array, no array) and this transform works with all the example data that you've posted. Can you send a screenshot of what you mean? Perhaps I don't understand the problem.
10-13-2022
08:52 AM
Thanks @SAMSAL, I was tracking that previous post but since it was so old I was hoping newer versions of NiFi could address this. This is such a simple thing with XML and the EvaluateXPath processor, I am just surprised that we can't do something similar with JSON. Oh well, 2 processors it is.
10-13-2022
08:33 AM
This one works. Give it a try... [
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"ClassSummary": "MANY",
"*": {
"*": {
"Product": "MANY"
}
}
}
}
}
}
]
10-13-2022
01:40 AM
"Product" is nested several layers deep, so the first thing you need to do is use enough wildcard (*) levels to match the depth at which "Product" is located. As a simplified example, I trimmed and flattened your JSON data a little, and the transform works as expected. Where I ran into a challenge was with the "ClassSummary" array, because I don't know how to account for it in the structure. Hopefully this gives you something to work with, and this web site may give you some ideas on how to take it further. I'm very curious to see the result. https://jolt-demo.appspot.com/#ritwickgupta [
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"Product": "MANY"
}
}
}
}
] {
"costc": 9638,
"sum_amount": 543,
"Classes": {
"class": 1102,
"sum_amount": 117,
"sum_invitation": 0,
"ProductDetails": {
"Product": {
"id": 7992160,
"artnr": 32212,
"sum_amount": 16,
"sum_invitation": 0,
"value": null
}
}
},
"date": "2022-10-06"
}
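To illustrate how the wildcard depth lines up with the data, here is a small Python emulation of this one spec. It is a simplified sketch, not the Jolt implementation: it only understands literal keys, "*", and "MANY", and it ignores arrays in the input. The function name is an assumption.

```python
def apply_cardinality(data, spec):
    """Walk `data` and `spec` together; wrap a value in a list where spec says MANY."""
    if not isinstance(data, dict):
        return data
    result = {}
    for key, value in data.items():
        # A literal key in the spec wins; otherwise fall back to the wildcard.
        rule = spec.get(key, spec.get("*"))
        if rule == "MANY":
            result[key] = value if isinstance(value, list) else [value]
        elif isinstance(rule, dict):
            result[key] = apply_cardinality(value, rule)
        else:
            result[key] = value
    return result


spec = {"*": {"*": {"Product": "MANY"}}}
data = {
    "costc": 9638,
    "Classes": {
        "class": 1102,
        "ProductDetails": {"Product": {"id": 7992160, "artnr": 32212}},
    },
    "date": "2022-10-06",
}

result = apply_cardinality(data, spec)
print(result["Classes"]["ProductDetails"]["Product"])
```

The first "*" matches "Classes", the second matches "ProductDetails", and only then is "Product" reached, which is why the number of wildcard levels has to equal the nesting depth.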
10-12-2022
04:27 PM
I need to use wildcard syntax in a JSON path to return the color of the bike as a scalar value. However, when I use wildcard syntax with the EvaluateJsonPath processor, it always returns an array. For example:
This path returns a scalar value: $.store.bicycle.color ==> red
This path returns an array: $..bicycle.color ==> ['red']
Is it possible to use the EvaluateJsonPath processor with wildcards and get a scalar value in return? Or is it possible to extract the value from the array without using an additional processor? {
"store": {
"bicycle": {
"color": "red",
"price": 19.95
}
},
"expensive": 10
}
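The difference between the two paths can be mimicked in plain Python. The sketch below hand-rolls both lookups instead of using a JSONPath library, so the helper names are assumptions: a definite path follows exactly one route and yields one value, while a deep scan may match any number of nodes and therefore yields a list that has to be unwrapped.

```python
import json

doc = json.loads(
    '{"store": {"bicycle": {"color": "red", "price": 19.95}}, "expensive": 10}'
)


def definite(data, *keys):
    """Follow one exact path, like $.store.bicycle.color."""
    for key in keys:
        data = data[key]
    return data


def descend(data, key):
    """Collect every value stored under `key` at any depth, like $..key."""
    found = []
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                found.append(v)
            found.extend(descend(v, key))
    elif isinstance(data, list):
        for item in data:
            found.extend(descend(item, key))
    return found


scalar = definite(doc, "store", "bicycle", "color")
matches = [bike["color"] for bike in descend(doc, "bicycle")]
first = matches[0] if matches else None  # explicit unwrapping step

print(scalar, matches, first)
```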
Labels: Apache NiFi
10-12-2022
03:37 PM
1 Kudo
Try this... [
{
"operation": "cardinality",
"spec": {
"*": {
"*": {
"Product": "MANY"
}
}
}
}
]
10-06-2022
11:42 AM
Brilliant! That does the trick. Thanks!
10-06-2022
08:20 AM
Are you suggesting it is possible to perform a complete swap of the flow file content using a ReplaceText processor? For example, if the flow file = "ababababac", can I use a single ReplaceText to create <xml>ababababac</xml>? From everything I've read in the docs, you can't use the Expression Language to get at the whole of the flow file content, e.g. ${flowFile}. So instead I use 2 ReplaceText processors in succession; the first one performs a "Prepend", and the second one performs an "Append". This technique just bookends (brackets) the flow file content. For example:
Base64Encode ==> ReplaceText (prepend) ==> ReplaceText (append)
ababababac ==> <xml>ababababac ==> <xml>ababababac</xml>
But if you know of a way to do this with a single processor, then I'd love to hear your suggestions. Thanks!
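One single-step idea is a regular expression that captures the entire content and re-emits it between the tags. The Python sketch below shows only the regex technique; whether ReplaceText can be configured to evaluate a pattern like this against the entire text is an assumption to verify against the processor documentation.

```python
import re


def wrap_in_xml(content: str) -> str:
    """Capture the whole content in group 1 and wrap it in <xml>...</xml>."""
    # (?s) lets "." match newlines; \A and \Z anchor to the whole string,
    # so the pattern matches exactly once.
    return re.sub(r"(?s)\A(.*)\Z", r"<xml>\1</xml>", content, count=1)


wrapped = wrap_in_xml("ababababac")
print(wrapped)
```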
10-04-2022
04:49 PM
How can I use the output of the Base64EncodeContent (or the Base64Encode function) as input for an XML template, where the flow file content (not an attribute) is the XML with the encoded value included? There doesn't seem to be a NiFi reader that reads the encoded flow file content, so I'm not sure how best to use this processor aside from using a couple of ReplaceText processors to bracket XML substrings around it. I appreciate any input. Thanks for reading.
Example:
Initial flow file: "The quick brown fox jumped over the lazy dog"
Encoded flow file: VGhlIHF1aWNrIGJyb3duIGZveCBqdW1wZWQgb3ZlciB0aGUgbGF6eSBkb2c=
XML template:
<?xml version="1.0" encoding="UTF-8" ?>
<a xmlns="some/name/space/goes/here">
<b>This is a template</b>
<c>ENCODED DATA GOES HERE</c>
</a>
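The end result being asked for can be sketched in Python as follows. This is only an illustration of the desired output, not a NiFi flow: it encodes the content and drops it into the template above, with a `{encoded}` placeholder (an assumption for illustration) standing in for "ENCODED DATA GOES HERE".

```python
import base64

# The template from the post, with a format placeholder in the <c> element.
TEMPLATE = """<?xml version="1.0" encoding="UTF-8" ?>
<a xmlns="some/name/space/goes/here">
<b>This is a template</b>
<c>{encoded}</c>
</a>"""


def wrap_encoded(content: bytes) -> str:
    """Base64-encode the content and insert it into the XML template."""
    # The Base64 alphabet contains no XML special characters, so no escaping is needed.
    encoded = base64.b64encode(content).decode("ascii")
    return TEMPLATE.format(encoded=encoded)


xml = wrap_encoded(b"The quick brown fox jumped over the lazy dog")
print(xml)
```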
Labels: Apache NiFi
09-27-2022
12:30 PM
When converting XML to JSON, and using an Avro schema to specify what that JSON should look like, how does NiFi handle instances of a single XML child element versus multiple child elements when doing the conversion? For example: Given the following XML: <employees>
<employee>
<name>John Doe</name>
<addresses>
<work>
<number>123</number>
<street>5th Avenue</street>
<city>New York</city>
<state>NY</state>
<zip>10020</zip>
</work>
<home>
<number>456</number>
<street>Elm Street</street>
<city>Queens</city>
<state>NY</state>
<zip>10023</zip>
</home>
</addresses>
</employee>
<employee>
<name>Bob Smith</name>
<addresses>
<home>
<number>987</number>
<street>Oak Road</street>
<city>Staten Island</city>
<state>NY</state>
<zip>10030</zip>
</home>
</addresses>
</employee>
</employees> The first employee, John Doe, has two addresses, which NiFi converts to a JSON array; good to go. However, the second employee, Bob Smith, works from home, so he has only one address. Because the "addresses" field is an array, Bob Smith's addresses need to be a one-element array. When I use an Avro schema during the write operation, the ConvertRecord processor throws an error when it encounters a record like Bob Smith's rather than writing it as a single-value array. How do I configure NiFi to use the schema to define the JSON output so I can ensure Bob Smith's addresses are captured as a single-value array? Thanks for the support!
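The desired output shape can be shown with a short Python sketch that always builds "addresses" as a list, regardless of how many child elements are present. It uses a trimmed copy of the XML above; the output layout (each address carrying a "type" key with the element name) is an assumption for illustration, not what ConvertRecord produces.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the example document.
XML = """<employees>
  <employee>
    <name>John Doe</name>
    <addresses>
      <work><city>New York</city><zip>10020</zip></work>
      <home><city>Queens</city><zip>10023</zip></home>
    </addresses>
  </employee>
  <employee>
    <name>Bob Smith</name>
    <addresses>
      <home><city>Staten Island</city><zip>10030</zip></home>
    </addresses>
  </employee>
</employees>"""


def employee_to_record(employee):
    """Build a record whose "addresses" field is always a list."""
    addresses = []
    for address in employee.find("addresses"):
        fields = {child.tag: child.text for child in address}
        addresses.append({"type": address.tag, **fields})
    return {"name": employee.findtext("name"), "addresses": addresses}


records = [employee_to_record(e) for e in ET.fromstring(XML).findall("employee")]
print(json.dumps(records, indent=2))
```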
Tags: conversion, json, NiFi
Labels: Apache NiFi
09-27-2022
11:26 AM
Brilliant! Exactly what I was looking for. Although it seems a little peculiar to me that we need to rely on a Jolt transform for this operation and not the UpdateRecord processor. Particularly since NiFi makes it a point to discuss Arrays and Maps in the documentation. Thanks for the Jolt transform because I spent a lot of time trying to get the Jolt transform to work and couldn't quite figure it out. Now I see what I was doing wrong.
09-26-2022
07:26 PM
If I have a flow file with the following JSON, how can I 1) evaluate "addresses" to determine if it is of type Array or type Map, and 2) if it is of type Map, convert "addresses" into an array using native NiFi capabilities (i.e. no string parsing)? {
"name": "John Doe",
"addresses": {
"work": {
"number": "123",
"street": "5th Avenue",
"city": "New York",
"state": "NY",
"zip": "10020"
}
}
} This is what I need it to look like: {
"name": "John Doe",
"addresses": [{
"work": {
"number": "123",
"street": "5th Avenue",
"city": "New York",
"state": "NY",
"zip": "10020"
}
}]
} I appreciate the input and support! Thank you.
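Expressed in Python, the two steps look like the sketch below: a type check on "addresses", followed by wrapping the map in a one-element list. This only illustrates the logic being asked for and is not a NiFi configuration; the function name is hypothetical.

```python
def normalize_addresses(record):
    """Return the record with "addresses" as a list, wrapping a single map if needed."""
    addresses = record.get("addresses")
    if isinstance(addresses, dict):  # the "Map" case
        return {**record, "addresses": [addresses]}
    return record  # already an array (or absent): leave unchanged


single = {
    "name": "John Doe",
    "addresses": {"work": {"number": "123", "street": "5th Avenue", "zip": "10020"}},
}

normalized = normalize_addresses(single)
print(normalized)
```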
Labels: Apache NiFi
09-21-2022
09:44 PM
Does NiFi offer a way to generate XSD schemas from an XML similar to the way it generates Avro schemas from JSON?
Labels: Apache NiFi
09-19-2022
10:03 PM
Thanks @SAMSAL and @araujo for the responses. RouteOnAttribute is what I am using presently, but it gets unwieldy after just a couple of route options. Looks like I'm just gonna need to build a custom validator using the ExecuteScript processor. Hopefully that scales.
09-18-2022
05:28 PM
Thank you SAMSAL for the reply. Ordinarily you would be correct; however, the ValidateXML processor does things differently. If my flow file has an attribute named "schema.name" and I use the following Expression Language:
${schema.name:prepend('/opt/nifi/schemas/xsd/'):append('.xsd')}
...then I get the error below. It seems the ValidateXML processor doesn't actually support dynamic run-time assignment of variables. Even using the variable registry doesn't solve the problem, because the path/filename variable needs to resolve at design time.
Perform Validation. Component is invalid: 'Schema File' validated against '/opt/nifi/schemas/xsd/.xsd' is invalid because The specified resource(s) do not exist or could not be accessed: [/opt/nifi/schemas/xsd/.xsd]
Hopefully there is something I'm missing; otherwise I'll have to use ExecuteScript to build my own validation routine.
@ChuckE wrote: I have about 25-30 XML message types and each message type has its own XSD. I need to validate each message against its respective XSD. When using the ValidateXML processor, is there any way to dynamically assign the appropriate XSD to a flow file based on an attribute value? I don't see the purpose/benefit of using so-called variables when said variables aren't even variable--they are STATIC! Why does this processor ONLY use variable_registry variables and not attribute values like every other processor in NiFi?
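The path in the error message is exactly what the expression yields when "schema.name" is empty, which is presumably the situation when the processor checks its configuration with no flow file at hand. The Python sketch below mirrors the prepend/append logic to show this; the function name and the "order" schema name are hypothetical.

```python
def resolve_schema_path(attributes, base="/opt/nifi/schemas/xsd/"):
    """Mirror ${schema.name:prepend(base):append('.xsd')} for a dict of attributes."""
    name = attributes.get("schema.name", "")  # a missing attribute resolves to empty
    return f"{base}{name}.xsd"


# With no flow file attributes, the result is the path from the error message.
at_design_time = resolve_schema_path({})

# With a flow file attribute present, the path resolves as intended.
at_run_time = resolve_schema_path({"schema.name": "order"})

print(at_design_time)
print(at_run_time)
```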
09-16-2022
10:14 PM
I have about 25-30 XML message types and each message type has its own XSD. I need to validate each message against its respective XSD. When using the ValidateXML processor, is there any way to dynamically assign the appropriate XSD to a flow file based on an attribute value? I don't see the purpose/benefit of using so-called variables when said variables aren't even variable--they are STATIC! Why does this processor ONLY use variable_registry variables and not attribute values like every other processor in NiFi?
Labels: Apache NiFi
09-14-2022
10:22 AM
There seems to be an inconsistency in the JOLT Transform Processors, but maybe I'm just missing something, and hopefully someone can shed some light on my confusion. Below is a really simple JSON dataset and an equally simple JOLT transform, which works as expected on the Jolt-Transform-Demo site (https://jolt-demo.appspot.com/). Incoming JSON: {
"loaf1": {
"type": "white"
},
"loaf2": {
"type": "wheat"
}
} JOLT spec: [{
"operation": "shift",
"spec": {
"*": "bread.&"
}
}] As you can see, I've declared the operation as a "shift", and there is only a single "spec". When I use the JoltTransformJSON processor and set the Jolt Transformation DSL = "shift", the specification fails to validate. Why? If I remove the square brackets "[ ]" from the spec, the specification validates successfully, but it doesn't properly transform the data and returns "null". Why? But if I leave the square brackets in the spec and change the Jolt Transformation DSL to "chain", then the validation works and it correctly transforms the data. Why does this need to be set to "chain" when there is only a single spec in the specification? I appreciate any insight into the behavior of the JOLT processors.
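A possible explanation, which I have not verified against every NiFi version, is that the array of {"operation", "spec"} entries is the chain format, while a single-operation setting such as "shift" expects only the contents of "spec". The Python sketch below shows the two shapes side by side and emulates what this particular shift does; the emulation handles only a spec of the form {"*": "target.&"} and is not the Jolt implementation.

```python
import json

incoming = {"loaf1": {"type": "white"}, "loaf2": {"type": "wheat"}}

# Chain form: a list of steps, each naming its operation.
chain_spec = [{"operation": "shift", "spec": {"*": "bread.&"}}]

# Bare form: only the contents of "spec" (assumed shape for a non-chain setting).
bare_shift_spec = chain_spec[0]["spec"]


def shift_top_level(data, spec):
    """Tiny emulation of a shift spec of the form {"*": "target.&"} only."""
    target = spec["*"]
    out = {}
    for key, value in data.items():
        # "&" stands for the key that the wildcard matched.
        path = target.replace("&", key).split(".")
        node = out
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return out


shifted = shift_top_level(incoming, bare_shift_spec)
print(json.dumps(shifted))
```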
Labels: Apache NiFi
08-03-2022
07:59 PM
1 Kudo
I create an XMLRecordSetWriter in the Controller Services, then using a ConvertRecord processor I'm able to read the xml record and then immediately write it out with a new root tag, which I can then pass to my next processor. I discovered this when I was reading the documentation for the XMLRecordSetWriter. Very first line in the documentation. 😃 https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-record-serialization-services-nar/1.7.0/org.apache.nifi.xml.XMLRecordSetWriter/index.html
08-03-2022
10:13 AM
1 Kudo
I've since discovered a super easy way to resolve this. Simply using the XMLRecordSetWriter does EXACTLY what I was looking for.
08-02-2022
10:41 PM
1 Kudo
This seems like a good idea. I'll give this a try and test the performance against the XSLT transform. I've never used the JOLT processors before so this will be a good opportunity to experiment with one. Thanks for the idea.