Naming split files incrementally in NiFi for a particular table, then resetting for another table

Rising Star

I am doing the following in NiFi: fetching data from tables in Hive and then routing the flowfiles based on size. If a flowfile is larger than 2 GB, I split it into multiple flowfiles of 2 GB each. I want to use UpdateAttribute to name those splits like TableName_001_001, TableName_001_002, TableName_001_003 for a particular flowfile or table.

When the next flowfile comes through the split, its pieces should be named in the same way.

Is there any way to do this with the existing processors?

1 ACCEPTED SOLUTION

Master Guru
@aman mittal

If you split the flowfile into smaller chunks with any processor other than SplitRecord, each resulting flowfile will have a fragment.index attribute.

Since you already have the table name as a flowfile attribute, you can combine the two attributes (the table name and fragment.index) to create the new attribute you need.

I'm assuming tab_name is the table name attribute. Add a new property in the UpdateAttribute processor that combines it with fragment.index.

In addition, if you want to keep this attribute unique, you can append a timestamp value at the end, like this:

new_attribute

${tab_name}_${fragment.index}_${now():toNumber()}

The new attribute value is built dynamically from the fragment.index and tab_name attribute values.
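To make the result of that expression concrete, here is a small Python sketch that imitates it outside NiFi. It is only an illustration, not NiFi code: the function name split_name, the now_ms parameter, and the sample table name orders are made up for the example.

```python
import time


def split_name(tab_name, fragment_index, with_timestamp=False, now_ms=None):
    """Imitate ${tab_name}_${fragment.index}, optionally with a timestamp suffix."""
    name = f"{tab_name}_{fragment_index}"
    if with_timestamp:
        # ${now():toNumber()} yields milliseconds since the epoch;
        # now_ms is a hypothetical override so the example is repeatable.
        stamp = int(time.time() * 1000) if now_ms is None else now_ms
        name = f"{name}_{stamp}"
    return name


print(split_name("orders", 3))
print(split_name("orders", 3, with_timestamp=True, now_ms=1528000000000))
```

Note that the timestamp suffix makes names unique but also makes them unpredictable, so it only fits if downstream steps do not rely on a fixed name.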

Rising Star

Thanks! I managed to do it the same way.

But there is one thing that needs to be tweaked. I am using the SelectHiveQL processor to fetch the data from Hive and then routing the files to the split processor only if the file size is greater than 2 GB. All the unmatched files, i.e. files smaller than 2 GB, will not have a fragment.index attribute, as they are not passed through the SplitText processor.

So for those files you need to add the suffix (tableName_001) to the table name attribute with the UpdateAttribute processor, because ${fragment.index} cannot be used when it is null.

Master Guru
@aman mittal

For this case, use the ifElse function to check whether the fragment.index attribute has a value. If it does, use that value of fragment.index; if it does not (i.e. the file size is less than 2 GB), fall back to your default value of 1.

UpdateAttribute configuration:

new_attribute

${tab_name}_${fragment.index:isEmpty():ifElse('1','${fragment.index}')}

See the NiFi Expression Language documentation for more details on the ifElse function.
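The fallback logic of the expression above can be sketched in Python as follows. This is only an approximation for illustration: the attributes are modelled as a plain dict, and the helper names index_or_default and new_attribute are invented for the example.

```python
def index_or_default(attributes, default="1"):
    """Return fragment.index when it has a value, otherwise the default."""
    value = attributes.get("fragment.index")
    # isEmpty() is true for a missing, empty, or whitespace-only value.
    if value is None or value.strip() == "":
        return default
    return value


def new_attribute(attributes):
    """Imitate ${tab_name}_ followed by the index or its default."""
    return f"{attributes['tab_name']}_{index_or_default(attributes)}"


# A split flowfile carries fragment.index; an unsplit one does not.
print(new_attribute({"tab_name": "orders", "fragment.index": "4"}))
print(new_attribute({"tab_name": "orders"}))
```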

Rising Star

Appreciate your response!

I did the same using the Advanced section of UpdateAttribute to create rules and actions, since I need leading zeros so that the fragment index is appended as three digits, like 001, 020, 100 ...

I have created three rules:

if the fragment index is less than 10, then fragment.index = 00${fragment.index}

if it is more than 9 and less than 100, then fragment.index = 0${fragment.index}

else fragment.index = ${fragment.index}
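As a quick sanity check of those three rules, here is a Python sketch of the same decision logic. It is an illustration only, not the UpdateAttribute rule configuration itself; the function name pad_by_rules is made up, and it assumes the index is a non-negative whole number.

```python
def pad_by_rules(fragment_index):
    """Apply the three rules: prepend 00, prepend 0, or leave unchanged."""
    n = int(fragment_index)
    if n < 10:
        return f"00{n}"   # rule 1: single digit
    if n < 100:
        return f"0{n}"    # rule 2: more than 9 and less than 100
    return str(n)         # rule 3: already three digits or more


for sample in ("1", "20", "100"):
    print(sample, "->", pad_by_rules(sample))
```

For indexes from 0 to 999 this gives the same result as Python's str.zfill(3), which is a handy way to check the rules against each other.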

Master Guru
@aman mittal

Great, good to know!

Another way is to check the length of the fragment.index attribute and then use nested ifElse statements to decide whether to prepend 00 or 0. That expression becomes complex, though, so using the Advanced properties is a good approach.

If the answer addressed your question, take a moment to click the Accept button below. That would be a great help to community users in finding solutions quickly for these kinds of issues.
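The length-based alternative can be pictured with the Python sketch below, where nested conditional expressions stand in for the nested ifElse calls. It is a hedged illustration rather than a NiFi expression; the function name pad_by_length is invented, and it assumes fragment.index is a string of digits.

```python
def pad_by_length(fragment_index):
    """Choose the prefix from the length of the index string."""
    length = len(fragment_index)
    # Outer condition handles one digit, inner condition handles two digits.
    return (
        "00" + fragment_index
        if length == 1
        else ("0" + fragment_index if length == 2 else fragment_index)
    )


for sample in ("7", "42", "123"):
    print(sample, "->", pad_by_length(sample))
```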