How can I remove XML Tag Name attributes in NiFi?

Contributor

Hi,

Is there a way for NiFi to remove the attributes inside an XML tag? An example would be a great help. Thanks.

 

Example: 

<particular>

   <name>Nicholas</name>

   <height>183</height>

   <weight>75</weight>

</particular>

<particular>

   <name>Debbie</name>

   <height>163</height>

   <weight>45</weight>

</particular>

 

to something like: 

<particular>

   <name>Nicholas</name>

   <height unit = "meter">183</height>

   <weight unit = "kilogram" unit = "lb">75</weight>

</particular>

<particular>

   <name>Debbie</name>

   <height unit = "meter">163</height>

   <weight unit = "kilogram" unit = "lb">45</weight>

</particular>

1 ACCEPTED SOLUTION

Master Mentor

@techNerd 

Based on your example, you can modify your XML using the ReplaceText [1] processor.
The processor can be configured with a Java regex that matches the string you want to remove, with the replacement value set to an empty string.


Your regex pattern in this case would need to match on (don't forget the spaces):

( unit = ".*?")

The parentheses mark a Java regex capture group.
.*? is a non-greedy match on zero or more of any character, up to the very next quote. This lets the pattern match any of the unit strings (meter, kilogram, and lb) in your example.

This will result in the following:

<particular>
   <name>Nicholas</name>
   <height unit = "meter">183</height>
   <weight unit = "kilogram" unit = "lb">75</weight>
</particular>
<particular>
   <name>Debbie</name>
   <height unit = "meter">163</height>
   <weight unit = "kilogram" unit = "lb">45</weight>
</particular>

being converted to:

<particular>
   <name>Nicholas</name>
   <height>183</height>
   <weight>75</weight>
</particular>
<particular>
   <name>Debbie</name>
   <height>163</height>
   <weight>45</weight>
</particular>
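
To check the pattern outside NiFi, here is a minimal Java sketch that applies the same regex with an empty replacement. It only illustrates the regex itself and is not how ReplaceText is implemented; the class and method names are made up for this example, and only the pattern comes from the post above.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of NiFi: applies the regex from the post
// with an empty replacement string.
class StripUnitAttributes {
    // A leading space, then unit = "...", with a non-greedy match up to the next quote.
    private static final Pattern UNIT_ATTR = Pattern.compile("( unit = \".*?\")");

    static String strip(String xml) {
        // Replace every match with an empty string.
        return UNIT_ATTR.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String input = String.join("\n",
            "<particular>",
            "   <name>Nicholas</name>",
            "   <height unit = \"meter\">183</height>",
            "   <weight unit = \"kilogram\" unit = \"lb\">75</weight>",
            "</particular>");
        System.out.println(strip(input));
    }
}
```

Because the pattern is literal about spacing, an attribute written as unit="meter" (no spaces around the equals sign) would not match; the pattern would need adjusting for that.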

 

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.13.2/org.apach...


If you found this addressed your query, please take a moment to log in and click "Accept" on this solution.
Thank you,

Matt


5 REPLIES

Master Mentor

@techNerd 

I am looking at your example and am not clear on what you are "removing".
It looks like you are adding text in your example.

Thanks,

Matt

Contributor

Hi @MattWho

 

Apologies, I made a mistake in the example. It should be as follows:

 

Input: 

<particular>

   <name>Nicholas</name>

   <height unit = "meter">183</height>

   <weight unit = "kilogram" unit = "lb">75</weight>

</particular>

<particular>

   <name>Debbie</name>

   <height unit = "meter">163</height>

   <weight unit = "kilogram" unit = "lb">45</weight>

</particular>

 

Output: 

<particular>

   <name>Nicholas</name>

   <height>183</height>

   <weight>75</weight>

</particular>

<particular>

   <name>Debbie</name>

   <height>163</height>

   <weight>45</weight>

</particular>

 

Is there a NiFi processor that can remove the attributes (unit, etc.) inside the XML tags, as in the example above? An example showing the processor configuration, with an explanation, would be a great help.

 

Appreciate your help. Thanks. 

 

Contributor

Hi @MattWho 

 

Thanks for your help. Much appreciated.

New Contributor

A TransformXml processor with this XSLT might be more future-proof:

 

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
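
The stylesheet works because xsl:copy on an element copies the element itself but not its attributes, so every attribute is dropped, not just unit. The Java sketch below runs the same XSLT through the JDK's javax.xml.transform API to try it outside NiFi; it does not show how TransformXml is configured. The class name, the people wrapper element, and the sample input are assumptions for illustration. Note that an XML parser needs well-formed input: a single root element and no repeated attribute names on one element, so the weight element from the thread (two unit attributes) would be rejected as written.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of NiFi: applies the XSLT from the post above.
class StripAttributesXslt {
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
        "  <xsl:output omit-xml-declaration=\"yes\" indent=\"yes\"/>",
        "  <xsl:strip-space elements=\"*\"/>",
        "  <xsl:template match=\"node()\">",
        "    <xsl:copy>",
        "      <xsl:apply-templates />",
        "    </xsl:copy>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Malformed input or stylesheet ends up here.
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        // Assumed well-formed input: one root element, one unit attribute per element.
        String input = "<people><particular><name>Nicholas</name>"
            + "<height unit=\"meter\">183</height>"
            + "<weight unit=\"kilogram\">75</weight></particular></people>";
        System.out.println(transform(input));
    }
}
```

Compared with the regex approach, this does not depend on spacing or attribute names, at the cost of requiring well-formed XML.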