Created on 04-26-2018 01:02 PM - edited 08-18-2019 02:12 AM
I created a workflow in NiFi 1.5.0 that reads an XML file from HDFS. After splitting the file into separate <Transaction> elements, I want to read an attribute's value and then branch on that value.
My original XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<Log>
  <Transaction Type="1" TrainingModeFlag="true">
    <StoreID>240041</StoreID>
    ...
  </Transaction>
</Log>
My workflow splits the XML via SplitXML at depth 1, so after this processor I have this sub-xml:
<?xml version="1.0" encoding="UTF-8"?>
<Transaction Type="1" TrainingModeFlag="true">
  <StoreID>240041</StoreID>
  ...
</Transaction>
I want to extract the value of the Type attribute of the Transaction tag, but it doesn't work for me. After my EvaluateXPath processor runs, the new content attribute shows "Empty string set" instead of the value 1.
Using the XPath //@Type works, but I need the exact path, as the Type attribute can occur in sub-elements.
Can someone help?
Created 04-26-2018 02:46 PM
I can't reproduce this. I used GenerateFlowFile with your input XML (adding two Transactions) -> SplitXML (level 1) and got the same "sub-xml" you did. I then used the same settings for EvaluateXPath, and my content attribute has the correct value of 1. The only way I got it to show "Empty string set" was by using /Transaction/@type as the XPath (note the wrong case for Type/type). Is it possible there's a typo or case-sensitivity issue between your input XML and the XPath?
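To illustrate the case-sensitivity point outside NiFi, here is a minimal sketch using Python's standard-library ElementTree rather than the EvaluateXPath processor. It parses the split sub-xml from the question (with the elided content left out) and looks up the attribute with both spellings. It only shows that XML attribute names are case-sensitive; it does not reproduce NiFi's behaviour.

```python
import xml.etree.ElementTree as ET

# The split "sub-xml" from the question; the elided "..." content is omitted.
SUB_XML = (
    '<Transaction Type="1" TrainingModeFlag="true">'
    "<StoreID>240041</StoreID>"
    "</Transaction>"
)

root = ET.fromstring(SUB_XML)

# Attribute names are case-sensitive: "Type" matches, "type" does not.
correct_case = root.get("Type")
wrong_case = root.get("type")

print(repr(correct_case))
print(repr(wrong_case))
```

A lookup with the wrong case does not raise an error here; it simply finds nothing, which is comparable to the empty result described above.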
Created 04-26-2018 02:51 PM
Thanks for your fast answer! I checked the settings and there are definitely no upper/lower case problems. I just saw that the NiFi version is 1.4, not 1.5. Is there a problem with this processor in that version?
Created 04-26-2018 03:04 PM
I also tried generating the XML with the GenerateFlowFile processor, but the problem is still the same. (I thought it might have something to do with the XML I read from HDFS, but that doesn't seem to be the case.)
Created 09-06-2018 02:27 AM
Maybe try without the string() function around it?! I'm not sure, since I used the transform above and it worked...
Created on 04-26-2018 03:06 PM - edited 08-18-2019 02:12 AM
Here is my workflow so far:
Created on 09-02-2018 05:10 AM - edited 08-18-2019 02:11 AM
Same problem here. Valid XPath expressions produce empty strings in the EvaluateXPath processor. In the attached screenshots only //@categoryId and //@UniqueType seem to work. I am using NiFi 1.7.1 and jdk1.8.0_181.jdk. Any insights would be appreciated!
Created on 09-28-2024 07:03 PM - edited 09-28-2024 07:04 PM
Thanks to the second OP for identifying the root cause in the NiFi Jira.
For people researching this today, the cause was the implicit/default namespace declared in the root node (the 'xmlns' attribute on that element, written without a prefix). In the case of the second poster, their XML started with:
<data xmlns="http://www.media-saturn.com/msx" xmlns: ...
The `/data/item//uniqueID` path he was searching for belongs, more accurately, to the "http://www.media-saturn.com/msx" namespace, which means he needed to specify that namespace as part of his XPath expression.
The pathless "//@UniqueType" worked because unprefixed attributes do not inherit the default namespace, so an attribute search matches no matter which namespace the enclosing elements are in.
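The effect of a default namespace can be reproduced outside NiFi. The following is a minimal sketch using Python's standard-library ElementTree, not NiFi's XPath engine. The document is hypothetical: only the root element and its namespace URI come from the fragment above, while the item/uniqueID layout and the value 12345 are assumptions made for illustration. ElementTree writes a qualified name as `{uri}name`.

```python
import xml.etree.ElementTree as ET

NS = "http://www.media-saturn.com/msx"

# Hypothetical document: the structure below the root is an assumption.
DOC = (
    f'<data xmlns="{NS}">'
    '<item><uniqueID UniqueType="ProdID">12345</uniqueID></item>'
    "</data>"
)

root = ET.fromstring(DOC)

# Unqualified element name: no match, because <item> lives in the default namespace.
unqualified = root.findall("item")

# Namespace-qualified element name: matches.
qualified = root.findall(f"{{{NS}}}item")

# Attribute-only search: matches, because the unprefixed attribute has no namespace.
by_attribute = root.findall(".//*[@UniqueType]")

print(len(unqualified), len(qualified), len(by_attribute))
```

The unqualified search fails silently with an empty result rather than an error, which matches the symptom reported in this thread.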
I'm using NiFi 2.0.0 M4 today and I'm pleased to report that it appears to support the XPath 3.0/3.1 notation where the namespace can be specified inline with the query. It's not particularly elegant, but it works. You prefix the namespace URI with the capital letter 'Q' and wrap the URI in curly brackets, namely:
Q{http://www.media-saturn.com/msx}<single-level selector>
To implement his expression, "/data/item//uniqueID[UniqueType='ProdID']/text()" which currently returns an Empty String set for Key 'ProdID4', you would use:
/Q{http://www.media-saturn.com/msx}data/Q{http://www.media-saturn.com/msx}item//uniqueID[UniqueType='ProdID']/text()
I have a suspicion that the second Namespace reference (to 'item' in this case) is not required, since once you've selected/are navigating down the 'data' path of the correct Namespace, you're not likely to jump to another Namespace? My research indicates that Attributes do not seem to accept Namespace referencing - but again, once you've successfully selected your path I suspect it becomes a moot point.
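This suspicion can be checked outside NiFi. The sketch below again uses Python's ElementTree (with `{uri}name` notation instead of `Q{uri}name`) on a hypothetical document modelled on the fragment above; the item/uniqueID layout is an assumption. In this library at least, descendants inherit the default namespace declared on the root, so each element step needs its own qualifier, while the unprefixed attribute needs none. Whether NiFi's XPath engine behaves identically is not confirmed here.

```python
import xml.etree.ElementTree as ET

NS = "http://www.media-saturn.com/msx"

# Hypothetical document: the structure below the root is an assumption.
DOC = (
    f'<data xmlns="{NS}">'
    '<item><uniqueID UniqueType="ProdID">12345</uniqueID></item>'
    "</data>"
)

root = ET.fromstring(DOC)

# Only the first step is qualified: the unqualified second step finds nothing.
first_only = root.find(f"{{{NS}}}item/uniqueID")

# Every element step is qualified: the element is found.
every_step = root.find(f"{{{NS}}}item/{{{NS}}}uniqueID")

# The unprefixed attribute is read by its plain name, with no namespace.
attribute_value = every_step.get("UniqueType")

print(first_only, every_step.text, attribute_value)
```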
Aside,
[1] It would be nice if the NiFi documentation specified the version of XPath implemented by the processor.
[2] Even better would be a drop-down within the processor that lets a developer select the desired XPath version.