04-04-2018
08:08 PM
@Patrick There may be issues external to a processor's code execution that can result in a stoppage in a dataflow. The MonitorActivity processor can be placed inline anywhere in your flow to monitor FlowFiles passing through it on a configurable interval. The processor will then generate alert-type FlowFiles when activity falls outside of those thresholds. You could then send those generated alert messages to a PutEmail processor to send out a notification of an outage.

Not all failures are unexpected; often a second attempt is successful. Setting up a count loop is a good way of having FlowFiles that are routed to failure retry x number of times in a loop before taking some other action (a sketch of this logic appears at the end of this answer): https://cwiki.apache.org/confluence/download/attachments/57904847/Retry_Count_Loop.xml?version=1&modificationDate=1433271239000&api=v2

You may also want to consider looking at the SiteToSiteBulletinReportingTask. NiFi processors produce bulletins for ERROR and WARN level logs. With this reporting task, you can feed those bulletins back to the same NiFi, or to another NiFi, where they can be handled like any other FlowFile and pushed to whatever database you want.
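Going back to the retry-count loop: here is a minimal sketch of that logic in plain Python, with hypothetical process() and notify_outage() stand-ins. In NiFi the same pattern is built from UpdateAttribute (increment a retry attribute on the failure path) and RouteOnAttribute (route elsewhere once the limit is hit), per the template linked above.

```python
import random

MAX_RETRIES = 3  # the "x" in "try x number of times"

def process(flowfile: dict) -> None:
    # Hypothetical stand-in for a processor that can fail transiently.
    if random.random() < 0.5:
        raise RuntimeError("transient failure")

def notify_outage(flowfile: dict) -> None:
    # Hypothetical stand-in for the PutEmail notification branch.
    print(f"alerting: gave up after {flowfile['retry.count']} attempts")

def handle(flowfile: dict) -> None:
    while True:
        try:
            process(flowfile)
            return  # "success" relationship
        except RuntimeError:
            # UpdateAttribute step: bump the retry counter.
            flowfile["retry.count"] = flowfile.get("retry.count", 0) + 1
            # RouteOnAttribute step: loop back, or take some other action
            # once the limit is exceeded.
            if flowfile["retry.count"] > MAX_RETRIES:
                notify_outage(flowfile)
                return

handle({"filename": "example.txt"})
```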
Thank you, Matt

*** If you found this answer addressed your question, please take a moment to login and click "accept"
03-25-2018
05:28 AM
@Pramod Kalvala NiFi has a CountText processor that counts the lines, non-empty lines, words, and characters in a text FlowFile.

CountText writes the following attributes:
- text.line.count: The number of lines of text present in the FlowFile content
- text.line.nonempty.count: The number of lines of text (with at least one non-whitespace character) present in the original FlowFile
- text.word.count: The number of words present in the original FlowFile
- text.character.count: The number of characters (given the specified character encoding) present in the original FlowFile

Example: suppose the FlowFile content has an empty line as its second line, and we feed that content to CountText with these configs:
- Count Lines: true
- Count Non-Empty Lines: true
- Count Words: true
- Count Characters: true
- Split Words on Symbols: true

A sketch of the resulting counts follows below.
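To make the example concrete, here is a rough sketch in plain Python of the counts CountText would produce for content whose second line is empty. The attribute names match the processor's documented write-attributes; word splitting here is on whitespace only, a simplification of the "Split Words on Symbols" behavior.

```python
import re

def count_text(content: str) -> dict:
    lines = content.split("\n")
    return {
        "text.line.count": len(lines),
        "text.line.nonempty.count": sum(1 for line in lines if line.strip()),
        "text.word.count": len(re.findall(r"\S+", content)),
        "text.character.count": len(content),
    }

print(count_text("first line\n\nthird line"))
# {'text.line.count': 3, 'text.line.nonempty.count': 2,
#  'text.word.count': 4, 'text.character.count': 22}
```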
Output FlowFile attributes: CountText will have added the line, non-empty-line, word, and character counts listed above to the FlowFile.

Alternatively, use the ExecuteStreamCommand processor to run the wc -l command to get the number of lines in the text document.

Or use the QueryRecord processor to count the lines in the FlowFile content. Useful links for QueryRecord:
https://community.hortonworks.com/articles/140183/counting-lines-in-text-files-with-nifi.html
https://community.hortonworks.com/articles/146096/counting-lines-in-text-files-with-nifi-part-2.html

If you are using the QueryDatabaseTable or ExecuteSQL processors, the output FlowFile will have a row.count attribute giving the number of rows fetched from the source.

To convert content to a FlowFile attribute: for this use case, use the ExtractText processor to extract the content and store it as a FlowFile attribute.

ExtractText configs: add a new property with the regex (.*), i.e. capture all the content and keep it in a FlowFile attribute named data. Set Enable DOTALL Mode to true if your FlowFile content has newlines in it (illustrated at the end of this answer). The most important properties are:
- Maximum Buffer Size (1 MB): Specifies the maximum amount of data to buffer (per file) in order to apply the regular expressions. Files larger than the specified maximum will not be fully evaluated.
- Maximum Capture Group Length (1024): Specifies the maximum number of characters a given capture group value can have. Any characters beyond the max will be truncated.

You have to increase these property values in proportion to your FlowFile size to get all of the content into the attribute. That said, it is not recommended to extract all the content and keep it as attributes, as attributes are kept in memory. Please refer to the links below for NiFi best practices and a deeper view:
https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#DeeperView
https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#best-practice
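Here is a quick Python illustration of why Enable DOTALL Mode matters for the (.*) pattern: by default "." does not match newlines, so without DOTALL only the first line of a multi-line FlowFile would land in the attribute.

```python
import re

content = "first line\nsecond line"

# Default mode: "." stops at the newline, capturing only the first line.
print(re.match(r"(.*)", content).group(1))             # first line

# DOTALL mode: "." matches newlines too, capturing the whole content.
print(re.match(r"(.*)", content, re.DOTALL).group(1))  # first line\nsecond line
```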
03-18-2018
02:08 PM
@Pramod Kalvala You can schedule multiple jobs at a single given time, assuming you have the resources available to serve all of them 🙂 As for how to do that, there are multiple scheduling options available in NiFi. If you right-click your "triggering processor" (the very first processor in your job) and click "Configure", you will see a Scheduling tab. Its "Scheduling Strategy" drop-down offers two major options:

Timer Driven
Cron Driven

The Timer Driven strategy executes the processor according to the duration set in "Run Schedule". With a run schedule of one second, for example, the processor will run every second, triggering the next processors in the flow, provided they don't have a schedule of their own.

Cron Driven is the strategy where you can specify a particular time of day, specific day(s), etc., i.e. the schedule on which that processor executes. Say you want to run your job at 1 PM every day: you would set a cron expression like the one sketched at the end of this answer.

You can have any number of jobs scheduled to run at 1 PM by using the same scheduling strategy for all of them, and all of them will run at the same time. All of those jobs are separate and will not interfere with each other unless you instruct them to, and they will run without any issues, provided that you have sufficient resources for all of them.
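For reference, a sketch of the 1 PM example: NiFi's Cron Driven strategy takes a Quartz-style cron expression, whose fields are seconds, minutes, hours, day of month, month, and day of week. An expression that fires at 13:00:00 every day would look like this:

```
0 0 13 * * ?
```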
03-18-2018
04:32 PM
@Rahul Soni Thanks so much for the detailed explanation. We have a custom scheduler which decides the number of jobs to run and triggers the Oozie API, which in turn triggers the particular workflow, such as Sqoop or Kafka. That workflow then fetches the properties for the respective workflow and runs the jobs. We are looking to replace the Sqoop workflow with NiFi. I have been doing some POCs on NiFi lately and realized that it has the capability to handle the load, but I was not sure how it would work for scheduling jobs like Sqoop. Anyway, your answer has given me hope.