Naming split files incrementally in NiFi for a particular table, then resetting for another table
Labels: Apache NiFi
Created ‎06-06-2018 06:46 AM
I am doing the following in NiFi: fetching data from tables in Hive and then routing the flow files based on size. If a flow file is larger than 2 GB, it is split into multiple flow files of 2 GB each. I want to use UpdateAttribute to name those splits like TableName_001_001, TableName_001_002, TableName_001_003 for a particular flow file or table.
When the next flow file comes into the split, its splits should be named in the same way.
Is there any way to do this with the existing processors?
Created ‎06-06-2018 11:19 AM
If you are using any processor other than SplitRecord to split the flowfile into smaller chunks, each resulting flowfile will have a fragment.index attribute.
Since you also have the table name as an attribute on the flowfile, you can combine these two attributes (the table name and fragment.index) to create the new attribute you need.
I'm assuming tab_name is the table name attribute. Add a new property in the UpdateAttribute processor whose value combines tab_name and fragment.index.
In addition, if you want to keep these attribute values unique, you can add the timestamp value at the end, like this:
new_attribute
${tab_name}_${fragment.index}_${now():toNumber()}
The new attribute value is built dynamically from the fragment.index and tab_name attribute values.
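To preview what names this expression would produce, the logic can be sketched outside NiFi in Python. This is only an illustration of the naming rule, not NiFi code: the function name split_name and the sample table names are hypothetical, and the timestamp suffix assumes that now():toNumber() yields milliseconds since the epoch.

```python
import time


def split_name(tab_name: str, fragment_index: int, unique: bool = False) -> str:
    """Mimic ${tab_name}_${fragment.index}, optionally with a timestamp suffix."""
    name = f"{tab_name}_{fragment_index}"
    if unique:
        # Stand-in for ${now():toNumber()} (assumed: epoch milliseconds)
        name += f"_{int(time.time() * 1000)}"
    return name


# The index comes from each split, so it starts over for every parent flowfile.
for table in ("customers", "orders"):  # hypothetical table names
    for index in (1, 2, 3):
        print(split_name(table, index))
```

Because fragment.index is assigned per split operation, the counter effectively resets for each table without any extra state.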
Created ‎06-07-2018 04:57 AM
Thanks! I managed to do it the same way.
But there is one thing that needs to be tweaked. I am using the SelectHiveQL processor to fetch the data from Hive and then routing files to the split processor only if the file size is greater than 2 GB. The unmatched files, i.e. files smaller than 2 GB, will not have a fragment.index attribute because they are not passed through the SplitText processor.
So for those files you need to add the suffix (tableName_001) to the table name attribute using the UpdateAttribute processor, because ${fragment.index} is null and cannot be used.
Created on ‎06-07-2018 10:52 AM - edited ‎08-17-2019 08:13 PM
For this case, use the ifElse function to check whether the fragment.index attribute has a value. If it does, use that value; if it does not (i.e. the file size is less than 2 GB), use your default value of 1.
UpdateAttribute configs:
new_attribute
${tab_name}_${fragment.index:isEmpty():ifElse('1','${fragment.index}')}
Refer to the NiFi Expression Language documentation for more details on the ifElse function.
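The fallback behaviour of that expression can be sketched in Python as a quick sanity check. This is a rough model under stated assumptions, not NiFi code: the function name new_attribute is hypothetical, and the emptiness check assumes that isEmpty() is true for a missing attribute, an empty string, or a whitespace-only value.

```python
from typing import Optional


def new_attribute(tab_name: str, fragment_index: Optional[str]) -> str:
    """Mimic ${tab_name}_${fragment.index:isEmpty():ifElse('1', ...)}."""
    # Assumed isEmpty() semantics: missing, empty, or whitespace-only
    if fragment_index is None or fragment_index.strip() == "":
        suffix = "1"  # default for files that never went through the split
    else:
        suffix = fragment_index
    return f"{tab_name}_{suffix}"


print(new_attribute("customers", None))  # unsplit file, falls back to 1
print(new_attribute("customers", "4"))   # split file, keeps its own index
```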
Created ‎06-07-2018 11:06 AM
Appreciate your response!
I did the same using the Advanced section of UpdateAttribute to create rules and actions, since I need leading zeros so that the fragment index is appended as three digits, like 001, 020, 100 ...
I created three rules: if the fragment index is less than 10, set fragment.index = 00${fragment.index}
if it is more than 9 and less than 100, set fragment.index = 0${fragment.index}
otherwise, set fragment.index = ${fragment.index}
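The three rules above amount to left-padding the index to three digits. A minimal Python sketch of the same logic follows, handy for checking the expected values before building the rules. The function name pad_index is hypothetical, and this mirrors the rules rather than anything NiFi runs.

```python
def pad_index(fragment_index: int) -> str:
    """Apply the three rules: prepend 00, prepend 0, or leave unchanged."""
    if fragment_index < 10:
        return f"00{fragment_index}"
    if fragment_index < 100:  # more than 9 and less than 100
        return f"0{fragment_index}"
    return str(fragment_index)


for index in (1, 20, 100):
    print(pad_index(index))
```

For non-negative indexes this gives the same result as Python's str(index).zfill(3), which makes a convenient cross-check.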
Created ‎06-07-2018 11:25 AM
Great, good to know!
Another way is to check the length of the fragment.index attribute and use nested ifElse statements to decide whether to prepend 00 or 0, but the expression becomes complex, so using the Advanced section is a good approach.
If the answer addressed your question, take a moment to click the Accept button below. That would help Community users find solutions quickly for these kinds of issues.
