Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11301 | 04-15-2020 05:01 PM |
| | 7195 | 10-15-2019 08:12 PM |
| | 3171 | 10-12-2019 08:29 PM |
| | 11673 | 09-21-2019 10:04 AM |
| | 4407 | 09-19-2019 07:11 AM |
03-09-2018
12:33 PM
1 Kudo
@Pavan M You can use a MergeContent processor before the PutEmail processor and configure its number-of-entries properties to 100; the processor will then wait until 100 flowfiles have been collected and emit only one flowfile after merging them. Feed the merged relationship to the PutEmail processor. In this case we wait until 100 flowfiles have reached MergeContent, it merges them all into one flowfile, and PutEmail receives only that one flowfile, so you get one email notification instead of 100.

If you are not sure about the number of flowfiles, make use of the Max Bin Age property: it forces the processor to merge whatever flowfiles it has collected once the bin ages out. Set Maximum Number of Entries and Maximum Group Size according to your flowfile count and the total size of all flowfiles.

Flow:-
1.PutSQL
2.MergeContent
3.PutEmail

Please refer to the links below for configuring the MergeContent processor:
https://community.hortonworks.com/questions/149047/nifi-how-to-handle-with-mergecontent-processor.html
https://community.hortonworks.com/questions/64337/apache-nifi-merge-content.html

(or)

If you want only one email when the first flowfile reaches the destination, instead of waiting for MergeContent to merge all 100 flowfiles, use a ControlRate processor with these configurations:

Rate Control Criteria: flowfile count
Maximum Rate: 1
Rate Controlled Attribute: No value set
Time Duration: 10 min //change this according to the time your PutSQL processor takes to write records to the destination

Now we release one flowfile from the ControlRate processor every 10 minutes. Then, on the queue before the ControlRate processor, change the FlowFile Expiration to some small value like 10 sec. With these configurations the first flowfile is released from ControlRate and reaches the PutEmail processor, so we get one notification; the remaining 99 flowfiles wait in the queue before ControlRate and expire 10 seconds after they arrive there. With this approach you get a notification from the first flowfile once it reaches the destination.

Flow:-
1.PutSQL
2.ControlRate
3.PutEmail
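The MergeContent waiting behavior described above can be sketched in plain Python. This is a simulation, not NiFi code; the names `max_entries` and `max_bin_age_s` only mirror the processor properties:

```python
import time

def merge_bins(flowfiles, max_entries=100, max_bin_age_s=600, now=time.monotonic):
    """Simulate MergeContent: emit one merged batch per max_entries flowfiles,
    or flush a partially filled bin once it is older than max_bin_age_s."""
    bins, current, opened = [], [], now()
    for ff in flowfiles:
        current.append(ff)
        if len(current) >= max_entries or now() - opened >= max_bin_age_s:
            bins.append(current)
            current, opened = [], now()
    if current:  # leftover flowfiles; in NiFi these would wait for Max Bin Age
        bins.append(current)
    return bins
```

With 250 inputs and `max_entries=100`, this yields batches of 100, 100, and 50: each full batch stands in for the single flowfile PutEmail would see.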
03-08-2018
09:28 PM
@User 805 Awesome, thank you very much for sharing the flow with us 🙂
03-08-2018
12:16 PM
1 Kudo
@Sami Ahmad The output of the QueryDatabaseTable processor is always in Avro format, so you need to use a PutHiveStreaming processor after QueryDatabaseTable: PutHiveStreaming expects its incoming data to be in Avro format, and that is exactly what QueryDatabaseTable produces.

Flow:-
1.QueryDatabaseTable
2.PutHiveStreaming
3.LogAttribute

Please refer to the links below regarding table creation for the PutHiveStreaming processor:
https://community.hortonworks.com/questions/59411/how-to-use-puthivestreaming.html
https://community.hortonworks.com/articles/52856/stream-data-into-hive-like-a-king-using-nifi.html
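If you ever want a quick sanity check that a flowfile's content really is Avro, you can inspect its first bytes: an Avro object container file always starts with the 4-byte magic `Obj` followed by 0x01. The helper below is a hypothetical illustration, not part of NiFi:

```python
def looks_like_avro(data: bytes) -> bool:
    """True when the bytes start with the Avro container-file magic 'Obj\\x01'."""
    return data[:4] == b"Obj\x01"
```

For example, dumping a queued flowfile's content from the NiFi UI and running it through this check would confirm QueryDatabaseTable's Avro output before wiring up PutHiveStreaming.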
03-08-2018
12:05 PM
1 Kudo
@User 805 As you are using NiFi 1.5, you can use a ConvertRecord processor with AvroReader as the Record Reader and CSVRecordSetWriter as the Record Writer.

ConvertRecord processor:
Record Reader --> AvroReader //reads the incoming Avro-format flowfile content
Record Writer --> CSVRecordSetWriter //writes the output in CSV format

Flow:-
1.ListDatabaseTables
2.GenerateTableFetch
3.UpdateAttribute (are you changing the filename to UUID?)
4.ConvertRecord
5.PutS3Object

Please refer to the link below to configure AvroReader:
https://community.hortonworks.com/questions/175208/how-to-store-the-output-of-a-query-to-one-text-fil.html?childToView=174434#comment-174434

If you are still facing issues, share a sample of 10 records in CSV (or) JSON format so we can recreate your scenario on our side.
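Conceptually, the reader/writer pair turns structured records into CSV text with a header row. The sketch below shows that record-to-CSV step in plain Python; real ConvertRecord reads Avro binary, not Python dicts, so this is only an analogy:

```python
import csv
import io

def records_to_csv(records):
    """Write a list of dict records as CSV text, header first -- roughly the
    output shape CSVRecordSetWriter produces. Plain-Python illustration only."""
    if not records:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

The first record's keys play the role that the Avro schema's field names play in the real processor.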
03-07-2018
08:17 PM
@Margarita Uk Use a ReplaceText processor after ListFTP and before MergeContent, replacing the flowfile content with the filename.

ReplaceText processor configs:- with the configs above we set the filename as the content of the flowfile.

Then use a MergeContent processor with the config below:- configure it as per your requirements and set the Delimiter Strategy to Text with a Demarcator of a comma and a newline.

Output:-
940630588913985,
940634934689001

In my test I had 2 files from a GenerateFlowFile processor; ReplaceText changed each flowfile's content to its filename, so after MergeContent we have the 2 filenames separated by the comma-and-newline demarcator.

For the comparison, store the merged file of filenames in Hive, and also get the distinct filenames that were actually loaded into your Hive table. Then compare the two by finding the filenames that are present in the first table (merged filenames) but not in the actual Hive table you loaded the data into.
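The final comparison is a set difference. Here is a small sketch of it in Python, working directly on the merged text with the comma-and-newline demarcator (the function name is a hypothetical label, and in practice you would run this logic as a Hive anti-join):

```python
def missing_filenames(merged_text, loaded):
    """Filenames listed in the MergeContent output but absent from the
    set of filenames actually loaded into the Hive table."""
    listed = {line.strip().rstrip(",") for line in merged_text.splitlines()
              if line.strip()}
    return sorted(listed - set(loaded))
```

Feeding it the example output above with only the second file loaded returns the first filename as missing.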
03-07-2018
12:15 PM
2 Kudos
@User 805 You need to auto-terminate the original relationship on the ConvertJSONToSQL processor; right now the original relationship is looped back into the same processor, which is causing all the duplicates in the MySQL database.

How to auto-terminate the original relationship:-
1. Clear the queue first.
2. Delete the original relationship connection.
3. Right-click on the processor, go to the Settings tab, and check the checkbox for the original relationship, as shown in the screenshot below.

If the answer helped to resolve your issue, click on the Accept button below to accept the answer; that would be a great help to community users looking for a quick solution to this kind of issue.
03-06-2018
04:13 AM
1 Kudo
@Manikandan Jeyabal For this case we can use an UpdateAttribute processor to add a sequence number to the flowfiles, then a RouteOnAttribute processor to route each flowfile to one of the four processors based on that sequence number.

UpdateAttribute configs:-
seq
${getStateValue('seq'):plus(1)}

We add a new property in the UpdateAttribute processor that gets the state value of the seq attribute; if no value is present yet it is treated as 0, and we add 1 to it. So when the first flowfile passes through UpdateAttribute it gets seq = 1, the second flowfile gets seq = 2, and so on. Once seq reaches 4 we need to reset it to 1 again (i.e. the fifth flowfile gets seq = 1). To achieve this, use the UpdateAttribute processor's Advanced usage and create a new rule there: right-click on the UpdateAttribute processor, click Advanced, then add a new rule as shown in the screenshot below.

Rules: reset
Conditions:
Rule Name: reset
Expression: ${getStateValue('seq'):equals(4)}
Actions:
Attribute: seq
Value: 1

Now the processor resets the seq attribute value to 1 on every fifth flowfile. Then use a RouteOnAttribute processor to check the seq attribute value and transfer the first flowfile to the first processor, the second to the second processor, and so on.
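The counter-plus-reset rule amounts to a round-robin over 1..4. A quick Python sketch of the sequence the downstream RouteOnAttribute would see (a simulation, not NiFi code):

```python
import itertools

def route_round_robin(flowfiles, n_routes=4):
    """Tag each flowfile with a seq value cycling 1..n_routes, mimicking the
    UpdateAttribute counter with its reset-at-n_routes rule."""
    seqs = itertools.cycle(range(1, n_routes + 1))
    return [(ff, next(seqs)) for ff in flowfiles]
```

The fifth flowfile comes out tagged 1 again, which is exactly what the reset rule guarantees.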
03-05-2018
01:53 PM
2 Kudos
@Manikandan Jeyabal You can use RouteOnAttribute with the fileSize attribute and check whether the size of the flowfile is greater than 1 GB.

Example:-
1. filesize > 1 GB
${fileSize:gt(1073741824)}
In the RouteOnAttribute processor above I added a new property that detects and transfers flowfiles whose size is greater than 1 GB. We need to give the size in bytes.
2. filesize > 100 MB and < 500 MB
${fileSize:gt(104857600):and(${fileSize:lt(524288000)})}
The second property routes flowfiles whose size is greater than 100 MB and less than 500 MB.

In this way you can add new properties in RouteOnAttribute and transfer flowfiles to the corresponding ConvertRecord processors based on file size.
**1 megabyte = 1048576 bytes**
**1 gigabyte = 1024 megabytes**
Here I'm using gt, lt (greater than, less than); depending on your requirements you can use ge, le (greater than or equal to, less than or equal to). Please refer to the link below for expression language usage:
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#lt
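The byte arithmetic behind those two RouteOnAttribute properties can be sketched in Python. The route names below are hypothetical labels, not NiFi relationship names:

```python
MB = 1048576       # 1 megabyte in bytes, matching the convention above
GB = 1024 * MB     # 1 gigabyte

def route_by_size(file_size):
    """Plain-Python version of the two RouteOnAttribute size properties:
    strictly greater than 1 GB, or strictly between 100 MB and 500 MB."""
    if file_size > GB:
        return "over_1gb"
    if 100 * MB < file_size < 500 * MB:
        return "100mb_to_500mb"
    return "unmatched"
```

Swapping the strict comparisons for `>=`/`<=` mirrors changing gt/lt to ge/le in the expression language.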
03-04-2018
10:40 PM
@Gaurang Shah You can schedule the MonitorActivity processor using a cron schedule. Schedule MonitorActivity 2-3 minutes after the QueryDatabaseTable processor (or) base the offset on how long QueryDatabaseTable takes to load the incremental data. If my QueryDatabaseTable processor finishes loading within 1 minute, I schedule MonitorActivity 1 minute after it; and if the Threshold Duration is 1 minute, MonitorActivity needs to run for 2 minutes in a row: the first minute to watch and the second minute to trigger the message.

Example:- QueryDatabaseTable is scheduled at 2:00 AM daily, MonitorActivity is scheduled at 2:01 AM daily, and the Threshold Duration is 1 minute. Then in the MonitorActivity configs I run the processor for 2 minutes, i.e. 2:01 AM (to watch for the file) and 2:02 AM (to trigger the message), because my Threshold Duration is 1 minute. If your Threshold Duration is 5 minutes, run the processor for 6 minutes (5 minutes to watch + 1 minute to trigger the message). Change the Threshold Duration and cron schedule as per your needs.
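The "threshold + 1 minute" rule can be captured in a tiny helper that lists the minutes past the hour MonitorActivity must fire (a hypothetical helper for planning the cron schedule, not NiFi configuration):

```python
def monitor_cron_minutes(start_minute, threshold_min):
    """Minutes past the hour that MonitorActivity should run, following the
    rule above: threshold_min minutes to watch plus 1 minute to trigger.
    start_minute is when QueryDatabaseTable is expected to have finished."""
    return [start_minute + i for i in range(threshold_min + 1)]
```

With a 2:00 AM load, a 1-minute finish, and a 1-minute threshold, this gives minutes 1 and 2, matching the 2:01/2:02 example.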
03-04-2018
07:47 PM
1 Kudo
@Gaurang Shah For this case you can use a MonitorActivity processor by forking the QueryDatabaseTable processor's success relationship: if the load arrives, you create the file on the server as usual; if the load is not present, the MonitorActivity processor triggers the creation of the file you want on the server.

Example:- as you can see in the flow above, I forked the success relationship from QueryDatabaseTable to both the UpdateAttribute and MonitorActivity processors. So if QueryDatabaseTable produces a load, you can create the file on the server; if not, MonitorActivity needs to be configured with how long it should wait for a flowfile before determining that the flow has no load.

Configs:- my MonitorActivity processor waits for 1 minute; if no flowfiles are fed to it in that time, it sends an inactivity message whose flowfile content is:
Lacking activity as of time: 2018/03/04 14:25:06; flow has been inactive for 1 minutes

Once you get this inactive flowfile from MonitorActivity, feed the inactive relationship to the processor that creates the file on the server. If you want to create the file with a specific name, you can change the filename with an UpdateAttribute processor. The Threshold Duration property of MonitorActivity controls how long to wait before determining that the flow is inactive. Now we create a file even if we haven't got any load from the QueryDatabaseTable processor.

MonitorActivity reference:-
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.5.0/org.apache.nifi.processors.standard.MonitorActivity/index.html
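MonitorActivity's core decision is simple: has the flow been quiet longer than the Threshold Duration? A one-function sketch (a simulation of the idea, not NiFi code):

```python
def is_inactive(last_flowfile_time_s, now_s, threshold_s=60):
    """True when no flowfile has been seen for longer than the Threshold
    Duration, i.e. when MonitorActivity would route to its inactive
    relationship and emit the 'Lacking activity' message."""
    return (now_s - last_flowfile_time_s) > threshold_s
```

With the default 60-second threshold, a flowfile seen 61 seconds ago means inactive; one seen 30 seconds ago does not.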