loop xml attributes

We have the XML configuration below as input. It supplies the to-path value that determines where files are put.

<configuration verbose="false" debugMode="false">
	<dataFlows> 		
		<dataFlow>
			<properties> 				
				<dept>salary</dept>
				<version>1.0</version>			
			</properties>
			<filePattern>salary_*.gz</filePattern> 	
			<to>				
				<path>d_${dept}/${version}/csv</path> 			
			</to> 		
		</dataFlow> 
		<dataFlow>
			<properties> 				
				<dept>pension</dept>
				<version>2.2</version>			
			</properties>
			<filePattern>pension_*.gz</filePattern> 	
			<to>				
				<path>d_${dept}/${version}/csv</path> 			
			</to> 		
		</dataFlow> 	
	</dataFlows> 
</configuration>

I followed the nifi-lookupattribute-and-updateattributes approach to handle a single <dataFlow>. Thanks to @jfrazee.

Does anyone know how to loop over the <dataFlow> elements? I will have hundreds of <dataFlow> elements in a single XML config file. I have included just 2 in the example above.

1 ACCEPTED SOLUTION

Hi @Pavan Challa

If I understand your use case correctly, I think I have come up with a Groovy script that does the job.

It loops through the dataFlow elements, tests whether filePattern matches, then resolves path with Expression Language. Please check whether this Gist works for you:

https://gist.github.com/ijokarumawak/a4ef40b49b45cecf3c43b56493683725

I had to change filePattern to be a regular expression:

<filePattern>salary_*.gz</filePattern>
/* Added a dot before the star */
<filePattern>salary_.*.gz</filePattern>
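
To illustrate why the dot is needed, here is a minimal standalone Python sketch (not the Gist's Groovy code; the helper name `matches` and the sample file names are made up for the example). It shows that the original glob-style pattern does not match when read as a regular expression. It also shows that, strictly speaking, the dot before `gz` would have to be escaped as `\.` to match only a literal dot, and that Python's `fnmatch.translate` is one way to keep glob patterns in the config instead.

```python
import fnmatch
import re


def matches(pattern, filename):
    """Return True if the whole filename matches the regular expression."""
    return re.fullmatch(pattern, filename) is not None


# Read as a regex, "salary_*.gz" means: "salary", zero or more "_",
# exactly one arbitrary character, then "gz".
print(matches(r"salary_*.gz", "salary_2017.gz"))    # glob read as regex

# ".*" matches any run of characters, so the regex form matches.
print(matches(r"salary_.*.gz", "salary_2017.gz"))

# The unescaped "." before "gz" matches any character, not only a dot.
print(matches(r"salary_.*.gz", "salary_2017xgz"))
print(matches(r"salary_.*\.gz", "salary_2017xgz"))  # escaped dot is stricter

# Alternative: keep the glob in the config and translate it to a regex.
glob_regex = fnmatch.translate("salary_*.gz")
print(re.match(glob_regex, "salary_2017.gz") is not None)
```

Whether to store regular expressions or globs in the config is a design choice; the Gist above expects regular expressions.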

Hope this helps.

4 REPLIES

Hello @Pavan Challa

The SplitXml processor will probably be helpful. Specify a split depth of 2 and you'll get FlowFiles that each contain a single 'dataFlow' element as their content.

@kkawamura, thanks for your response. In my case, the FlowFiles are not XML files. They will be txt or gz files. So I will be using LookupAttribute (backed by XMLFileLookupService) to read the XML config file shown above and get the to-path, then put the FlowFiles (*.txt or *.gz) into that destination path (for example, /d_salary/1.0/csv or /d_pension/2.2/csv). So I need a way to loop over my XML config file.

I forgot to add one more element to the XML config file. Each dataFlow will have something like the following:

<filePattern>salary_*.gz</filePattern>

<filePattern>pension_*.gz</filePattern>

I will correct the XML file above. So a file matching salary_*.gz should be moved to /d_salary/1.0/csv, and a file matching pension_*.gz should be moved to /d_pension/2.2/csv.

Any help is much appreciated.

Excuse me @Pavan Challa, I should have read the related question more carefully.

So, what you'd like to do is loop through the 'dataFlow' elements to find the one whose 'filePattern' matches the name of the incoming file? If so, that might be too much to do with XMLFileLookupService. I'd write a script with ExecuteScript that parses the XML file and does the matching.
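
As a rough illustration of the parse-and-match idea, here is a minimal standalone Python sketch. It is not NiFi ExecuteScript code and does not use NiFi Expression Language: the function name `resolve_to_path`, the sample file names, and the simple `${name}` substitution are assumptions made for the example, and the filePattern values are written as regular expressions.

```python
import re
import xml.etree.ElementTree as ET

# Sample config modeled on the question; filePattern values are regexes here.
CONFIG = r"""
<configuration verbose="false" debugMode="false">
  <dataFlows>
    <dataFlow>
      <properties><dept>salary</dept><version>1.0</version></properties>
      <filePattern>salary_.*\.gz</filePattern>
      <to><path>d_${dept}/${version}/csv</path></to>
    </dataFlow>
    <dataFlow>
      <properties><dept>pension</dept><version>2.2</version></properties>
      <filePattern>pension_.*\.gz</filePattern>
      <to><path>d_${dept}/${version}/csv</path></to>
    </dataFlow>
  </dataFlows>
</configuration>
"""


def resolve_to_path(config_xml, filename):
    """Return the resolved to-path of the first dataFlow whose filePattern
    matches filename, or None if no dataFlow matches."""
    root = ET.fromstring(config_xml)
    for flow in root.iter("dataFlow"):
        pattern = flow.findtext("filePattern")
        if pattern is None or re.fullmatch(pattern.strip(), filename) is None:
            continue
        # Collect <properties> children as a name -> value mapping.
        props_el = flow.find("properties")
        props = {} if props_el is None else {
            p.tag: (p.text or "").strip() for p in props_el
        }
        template = flow.findtext("to/path", default="").strip()
        # Replace each ${name} with its property; leave unknown names as-is.
        return re.sub(
            r"\$\{(\w+)\}",
            lambda m: props.get(m.group(1), m.group(0)),
            template,
        )
    return None


print(resolve_to_path(CONFIG, "salary_201701.gz"))
print(resolve_to_path(CONFIG, "pension_201701.gz"))
print(resolve_to_path(CONFIG, "other.txt"))
```

The loop stops at the first matching dataFlow, so with hundreds of entries the order in the config file decides which one wins if patterns overlap.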
