Replace XML tag from nested xml in Apache Nifi

Explorer

I have XML Data like this:

 

<?xml version="1.0" encoding="UTF-8"?>
<XYZ>
<ID>1</ID>
<Name>abc</Name>
<JKL>
<ID>10</ID>
<Type>A</Type>
</JKL>
<JKL>
<ID>11</ID>
<Type>B</Type>
</JKL>
</XYZ>
<XYZ>
<ID>2</ID>
<Name>def</Name>
<JKL>
<ID>10</ID>
<Type>A</Type>
</JKL>
<JKL>
<ID>11</ID>
<Type>B</Type>
</JKL>
</XYZ>

 

I want to rename only the outer ID and Name tags (the direct children of XYZ) and leave the nested tags unchanged, like this:

 

<?xml version="1.0" encoding="UTF-8"?>
<XYZ>
<XYZID>1</XYZID>
<XYZName>abc</XYZName>
<JKL>
<ID>10</ID>
<Type>A</Type>
</JKL>
<JKL>
<ID>11</ID>
<Type>B</Type>
</JKL>
</XYZ>
<XYZ>
<XYZID>2</XYZID>
<XYZName>def</XYZName>
<JKL>
<ID>10</ID>
<Type>A</Type>
</JKL>
<JKL>
<ID>11</ID>
<Type>B</Type>
</JKL>
</XYZ>

 

I tried replacing the tag with a regular expression, but it changes every 'ID' tag in the XML to 'XYZID'. I only want to change the outer tag names.

Any suggestions?

Any help with this issue would be greatly appreciated.

Thank You!

5 REPLIES

Master Guru

@CodeLa 

You can accomplish this with the ReplaceText processor by using a multi-line approach in your Java regular expression (regex).

MattWho_0-1633373052071.png

Search Value:

MattWho_1-1633373078054.png

Replacement Value:

MattWho_3-1633373139498.png


The downside to this approach is that you need to configure this processor with an Evaluation Mode of "Entire text" and a Line-by-Line Evaluation Mode of "All", and make sure the configured Maximum Buffer Size is large enough to fit the entire text. The whole FlowFile content is then buffered in memory, which means higher heap utilization while this processor is executing against your FlowFile.
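
The exact Search Value and Replacement Value are only visible in the screenshots above, so the pattern below is an assumption and not the regex from this post. It is a minimal sketch of the multi-line idea: anchor on the opening `<XYZ>` tag so that only the `<ID>` and `<Name>` that immediately follow it are renamed. It is written in Python for quick testing and uses only constructs that Java regex also supports. The name `rename_outer_fields` is hypothetical.

```python
import re

# Hypothetical pattern (not the one from the screenshots): match <ID> and
# <Name> only when they directly follow the opening <XYZ> tag. \s* spans the
# line breaks between tags, which is why the whole text must be evaluated.
OUTER_FIELDS = re.compile(r"<XYZ>(\s*)<ID>(.*?)</ID>(\s*)<Name>(.*?)</Name>")


def rename_outer_fields(xml_text):
    """Rename the outer ID/Name tags; nested <JKL><ID> tags are not matched."""
    return OUTER_FIELDS.sub(
        r"<XYZ>\1<XYZID>\2</XYZID>\3<XYZName>\4</XYZName>", xml_text
    )


sample = """<XYZ>
<ID>1</ID>
<Name>abc</Name>
<JKL>
<ID>10</ID>
<Type>A</Type>
</JKL>
</XYZ>
<XYZ>
<ID>2</ID>
<Name>def</Name>
<JKL>
<ID>10</ID>
<Type>A</Type>
</JKL>
</XYZ>"""

result = rename_outer_fields(sample)
print(result)
```

In ReplaceText the back-references in the Replacement Value would be written as `$1` to `$4` instead of `\1` to `\4`. The sketch assumes `<ID>` and `<Name>` always appear in that order directly after `<XYZ>`; any other layout will not match.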

 


Thank you,

Matt



Explorer

Hi Matt,

 

Thanks for the reply, but it's not changing anything in my current scenario. Can you please help? I have an XML file that I split and then pass to ReplaceText, following your solution, but it didn't help.

Master Guru

@CodeLa 

I set up a dataflow using the exact example you shared:

MattWho_0-1633461237755.png


After the ReplaceText, I see the content is now:

MattWho_1-1633461331773.png

 

Can you share the sample XML file that is not working?


How is your split being done?

Thanks,

Matt

Explorer

Hi Matt,

 

Thanks for the solution. Maybe the issue is something else. I have multiple records of the same kind nested inside another header element, which I first split and then pass to ReplaceText.

Master Guru

@CodeLa

It is difficult for me to help determine the issue without seeing your complete dataflow.
My guess is that somewhere in the process of splitting your XML files, the structure has changed in such a way that the Java regex I provided no longer matches.
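
To illustrate how a structural change can defeat a regex, here is a small sketch under stated assumptions. The `STRICT` pattern, the `seq` attribute, and the reordered record are all hypothetical and not taken from the actual dataflow. The second half shows a parser-based alternative that renames only direct children of the record element, so it does not depend on whitespace, attribute, or ordering details. It is standalone Python, not a NiFi processor configuration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical strict pattern: expects <ID> right after a bare <XYZ> tag.
STRICT = re.compile(r"<XYZ>\s*<ID>(.*?)</ID>")


def prefix_direct_children(record_xml, names=("ID", "Name")):
    """Prefix matching direct children with the parent's tag name."""
    root = ET.fromstring(record_xml)
    for child in root:  # direct children only; nested <JKL><ID> is not visited
        if child.tag in names:
            child.tag = root.tag + child.tag
    return ET.tostring(root, encoding="unicode")


# Assumed example of a record whose layout changed after a split:
# an attribute on <XYZ> and <Name> placed before <ID>.
changed = '<XYZ seq="1"><Name>abc</Name><ID>1</ID><JKL><ID>10</ID></JKL></XYZ>'

regex_matches = STRICT.search(changed) is not None
renamed = prefix_direct_children(changed)

print(regex_matches)
print(renamed)
```

The parser version needs each record to be a well-formed document with a single root element, so it would apply to the individual split records rather than to the original file with several top-level `<XYZ>` elements.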