Replace first occurrence of text with ReplaceText in Apache NiFi

I need to process some XML files that contain duplicated field names, as shown in the sample below (Latitude, Longitude and Height). I need to rename the first occurrence to Latitude1, Longitude1, Height1 and the second to Latitude2, Longitude2, Height2, so that I can convert the result to JSON afterwards.

What would be the best approach to change these fields?

<?xml version="1.0" encoding="UTF-8"?><vwConnectionLog><ID>18500380</ID><JobID>1336467</JobID><Active>0</Active><StartTimeConnection>2019-12-01 12:09:12</StartTimeConnection><StartTimeSending>2019-12-01 12:09:17</StartTimeSending><EndTimeConnection>2019-12-01 12:25:23</EndTimeConnection><FirstSiteCode>HORE</FirstSiteCode><FirstRefStatID>84</FirstRefStatID><FirstDistanceToStat>41913</FirstDistanceToStat><LastSiteCode>VLC2</LastSiteCode><LastRefStatID>29</LastRefStatID><LastDistanceToStat>35429</LastDistanceToStat><Latitude>0.7820122964688779</Latitude><Longitude>0.4227112816015234</Longitude><Height>234.8100</Height><JobGUID>f787b88a-38bc-4158-ab44-0081a52f5295</JobGUID><Time>2019-12-01 12:17:17</Time><ActRefStationID>84</ActRefStationID><ActRefStationCode>HORE</ActRefStationCode><ActNMEARefStationID>806</ActNMEARefStationID><Satellites>-1</Satellites><SatellitesUsed>22</SatellitesUsed><PositionFix>4</PositionFix><HDOP>1.7999999523162842</HDOP><Event>Rover state changed</Event><Latitude>0.7820122964688779</Latitude><Longitude>0.4227112816015234</Longitude><Height>234.81</Height><Auxiliaries/><SatellitesGPS>-1</SatellitesGPS><SatellitesGlo>-1</SatellitesGlo><FixedSatellites>-1</FixedSatellites><FixedSatellitesGPS>-1</FixedSatellitesGPS><FixedSatellitesGLO>-1</FixedSatellitesGLO><UsedSatellitesFKP>-1</UsedSatellitesFKP><UsedSatellitesFKPGps>-1</UsedSatellitesFKPGps><UsedSatellitesFKPGlo>-1</UsedSatellitesFKPGlo><SatellitesBDS>-1</SatellitesBDS><FixedSatellitesBDS>-1</FixedSatellitesBDS><UsedSatellitesFKPBds>-1</UsedSatellitesFKPBds><SatellitesGAL>-1</SatellitesGAL><FixedSatellitesGAL>-1</FixedSatellitesGAL><SatellitesQZSS>-1</SatellitesQZSS><FixedSatellitesQZSS>-1</FixedSatellitesQZSS><RoverUserName>pop15064</RoverUserName><RoverUserCompany>template</RoverUserCompany><RoverUserDetail>pop15064 
pop15064</RoverUserDetail><RoverUserClientHost>14693</RoverUserClientHost><HeartbeatDisconnectTime>300</HeartbeatDisconnectTime><SubscriptionId>19202</SubscriptionId><RTProductName>RO_VRS_3.1</RTProductName><MessageType>Virtual RS RTCM 3.x (Extended)</MessageType><RTCMVersion>4</RTCMVersion><EndOfMessage>Nothing</EndOfMessage><RefStationID>-1</RefStationID><Connection>NTRIP-Client</Connection><HostName>Proxy</HostName><PortNr>2101</PortNr><NtripMntp>RO_VRS_3.1</NtripMntp><FilePath/><FilePathActive>0</FilePathActive><Authentication>Ntrip</Authentication><CellsSitesType>Automatic cells</CellsSitesType><AutoSelectCellSite>1</AutoSelectCellSite><DistanceForChanging>1000</DistanceForChanging><SatSystem>3</SatSystem><SendNullAntenna>Yes</SendNullAntenna><ErrString/><VerboseInfoString/><MaxDistProvCorr>100000</MaxDistProvCorr><FallbackDistance>3000</FallbackDistance><FallbackOnNWOff>8</FallbackOnNWOff><FallbackOnDist>16</FallbackOnDist><Fallforward>128</Fallforward><UseMaxDistProvCorr>4</UseMaxDistProvCorr><RoverCredentialName>pop15064</RoverCredentialName></vwConnectionLog>
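
To confirm which element names repeat in a record like the one above, a minimal sketch using only the Python standard library may help. It runs outside NiFi; the function name `repeated_child_tags` and the shortened sample record are assumptions made for illustration, not part of the original flow.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def repeated_child_tags(xml_text):
    """Return child element names that occur more than once, in document order."""
    root = ET.fromstring(xml_text)
    counts = Counter(child.tag for child in root)
    repeated = []
    for child in root:
        if counts[child.tag] > 1 and child.tag not in repeated:
            repeated.append(child.tag)
    return repeated

# Shortened, hypothetical record with the same shape as the sample above
sample = (
    "<vwConnectionLog><ID>18500380</ID>"
    "<Latitude>0.78</Latitude><Longitude>0.42</Longitude><Height>234.8100</Height>"
    "<Event>Rover state changed</Event>"
    "<Latitude>0.78</Latitude><Longitude>0.42</Longitude><Height>234.81</Height>"
    "</vwConnectionLog>"
)

print(repeated_child_tags(sample))
```

Names reported here are the ones that would collide as keys once the record is flattened into a JSON object.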

 

 

 

Here is the process up to the point where I got stuck. I have the original XML files and I split them to separate the child XML documents that are the target of my analysis, but these contain duplicate fields. I didn't find a way to use ReplaceText to change only the first matching element and append an ID to make it unique. Another approach I found was the Python script below, which did the job, but I wonder whether there is a way to do it in a single process.

nifixml.png

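from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

# Element names that appear twice in each record
FIELDS = ("Latitude", "Longitude", "Height")

class PyStreamCallback(StreamCallback):
	def __init__(self, flowfile):
		self.ff = flowfile

	def process(self, inputStream, outputStream):
		text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
		for name in FIELDS:
			for n in (1, 2):
				# count=1 renames only the first tag that is still unnumbered,
				# so the second pass picks up the second occurrence
				text = text.replace("<%s>" % name, "<%s%d>" % (name, n), 1)
				text = text.replace("</%s>" % name, "</%s%d>" % (name, n), 1)
		outputStream.write(bytearray(text.encode('utf-8')))

# session and REL_SUCCESS are provided by the ExecuteScript processor
flowFile = session.get()
if flowFile is not None:
	flowFile = session.write(flowFile, PyStreamCallback(flowFile))
	session.transfer(flowFile, REL_SUCCESS)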
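
For testing the renaming logic outside NiFi, here is a minimal plain-Python sketch that numbers every occurrence of the given tags in a single regex pass. The function name `number_repeated_tags` is hypothetical, and the sketch assumes the tags carry no attributes, are not self-closing, and are not nested inside themselves, which holds for the sample record but may not hold for other data.

```python
import re
from collections import defaultdict

def number_repeated_tags(xml_text, names):
    """Append 1, 2, ... to each occurrence of the given element names."""
    # Matches <Name> or </Name> for any of the listed names
    pattern = re.compile(r"<(/?)(%s)>" % "|".join(re.escape(n) for n in names))
    opened = defaultdict(int)  # opening tags seen so far, per name
    closed = defaultdict(int)  # closing tags seen so far, per name

    def rename(match):
        slash, name = match.group(1), match.group(2)
        counter = closed if slash else opened
        counter[name] += 1
        return "<%s%s%d>" % (slash, name, counter[name])

    return pattern.sub(rename, xml_text)

# Hypothetical miniature record for illustration
record = ("<r><Latitude>1</Latitude><Height>2</Height>"
          "<Latitude>3</Latitude><Height>4</Height></r>")
print(number_repeated_tags(record, ["Latitude", "Longitude", "Height"]))
```

Because opening and closing tags are counted separately, the nth opening tag and the nth closing tag receive the same suffix, so the output stays well-formed under the assumptions stated above.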