Member since: 07-30-2019 | Posts: 3427 | Kudos Received: 1632 | Solutions: 1011
07-11-2017
03:36 PM
@Eric Lloyd I considered that as well at first, but went the other route since I could be sure my byte sequence would be unique no matter what the stack trace looked like. Since you are looking for a line return followed by 20, you may have an issue with the very first line in your file, which has no preceding line return. I would test that to confirm (a rough SplitContent sketch follows). Matt
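If you do go that route, the SplitContent settings would look roughly like this (hex 0A3230 is an assumption: the byte sequence for a line feed followed by the characters "20"; note it can never match at the very first line of the file):

SplitContent
  Byte Sequence Format: Hexadecimal
  Byte Sequence: 0A3230
  Keep Byte Sequence: true
  Byte Sequence Location: Leading

With Keep Byte Sequence set to true and the location set to Leading, the matched bytes are prepended to the following split, so each split after the first starts with the line return and "20".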
07-11-2017
03:17 PM
@Eric Lloyd Must be a by-product of the SplitContent operation. It is reading the last line return before it sees the next byte sequence. If the blank line becomes an issue, you can also remove blank lines using a ReplaceText processor, which will replace any empty line with nothing (a rough property sketch follows). Thanks, Matt
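A minimal sketch of that ReplaceText configuration (the regex is illustrative; adjust it to your data):

ReplaceText
  Evaluation Mode: Line-by-Line
  Search Value: ^\s*$
  Replacement Value: (set to the empty string)
  Replacement Strategy: Regex Replace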
07-11-2017
02:31 PM
@Eric Lloyd Another option (not as nice as the GrokReader) is to use SplitContent instead of the SplitText processor. Here I use the ReplaceText processor to match the date string format that every log line starts with and prepend to it a unique string that I can use later to split the content. I then use the SplitContent processor to split based on that unique string. This means that any stack trace that follows a log line will be captured with the preceding log entry. After that you can do what you want with the resulting splits; I chose to filter out the splits for ERROR or WARN log lines and auto-terminate everything else (a rough property sketch follows the example output below). Here is an example output of one of my log lines with a stack trace:
2017-07-11 10:21:38,087 ERROR [Timer-Driven Process Thread-2] o.a.n.p.attributes.UpdateAttribute
java.lang.StringIndexOutOfBoundsException: String index out of range: 40
at java.lang.String.substring(String.java:1963) ~[na:1.8.0_77]
at org.apache.nifi.attribute.expression.language.evaluation.functions.SubstringEvaluator.evaluate(SubstringEvaluator.java:55) ~[nifi-expression-language-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.attribute.expression.language.Query.evaluate(Query.java:570) ~[nifi-expression-language-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.attribute.expression.language.Query.evaluateExpression(Query.java:388) ~[nifi-expression-language-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.attribute.expression.language.StandardPreparedQuery.evaluateExpressions(StandardPreparedQuery.java:48) ~[nifi-expression-language-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.attribute.expression.language.StandardPropertyValue.evaluateAttributeExpressions(StandardPropertyValue.java:152) ~[nifi-expression-language-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.attribute.expression.language.StandardPropertyValue.evaluateAttributeExpressions(StandardPropertyValue.java:133) ~[nifi-expression-language-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.processors.attributes.UpdateAttribute.executeActions(UpdateAttribute.java:496) ~[nifi-update-attribute-processor-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.processors.attributes.UpdateAttribute.onTrigger(UpdateAttribute.java:377) ~[nifi-update-attribute-processor-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) [nifi-framework-core-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0.2.1.4.0-5.jar:1.1.0.2.1.4.0-5]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_77]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
Thanks, Matt
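A minimal sketch of the two-processor setup described above (the ###SPLIT### marker and the timestamp regex are assumptions; adapt both to your log format):

ReplaceText
  Evaluation Mode: Line-by-Line
  Search Value: ^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})
  Replacement Value: ###SPLIT###$1
  Replacement Strategy: Regex Replace

SplitContent
  Byte Sequence Format: Text
  Byte Sequence: ###SPLIT###
  Keep Byte Sequence: false

With Keep Byte Sequence set to false, the marker is dropped from each split, so every resulting FlowFile starts with the original timestamped log line plus any stack trace that followed it.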
07-11-2017
02:03 PM
@adrian white At 90 MB, I suspect that CSV file has a lot of lines to split. Are you seeing any Out Of Memory errors in your nifi-app.log? To help reduce heap usage here, you may want to try using two SplitText processors in series: the first splitting every 1,000 - 10,000 lines and the second splitting those results into one line each. NiFi FlowFile attributes are kept in heap memory space. NiFi has a mechanism for swapping FlowFile attributes to disk for queues, but this mechanism does not apply to processors. The SplitText processor holds the FlowFile attributes for every new FlowFile it is creating in heap until all resulting split FlowFiles have been created. When splitting creates a huge number of resulting FlowFiles in a single transaction, you can run out of heap space. By splitting the job between multiple SplitText processors in series, you reduce the number of FlowFiles being generated per transaction, thus decreasing heap usage (a minimal sketch follows). Thanks, Matt
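A minimal sketch of that two-stage arrangement (the count of 5,000 is just illustrative; anything in the 1,000 - 10,000 range suggested above should work):

SplitText (stage 1)
  Line Split Count: 5000
  Header Line Count: 0

SplitText (stage 2)
  Line Split Count: 1
  Header Line Count: 0

Stage 1 produces a few hundred FlowFiles per transaction from a file that size, and stage 2 then only ever holds one 5,000-line chunk's worth of single-line FlowFiles in heap at a time.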
07-10-2017
06:12 PM
Just to add more detail to the above answer...
- Granting users the ability to run provenance queries does not then give those users the ability to view details on every piece of data that passes through any processor component on the canvas.
- If you were to monitor the nifi-app.log on each of your nodes, you would likely see that the provenance query is returning events yet none are being displayed. This is because NiFi filters the results based on the "data" resource policies granted to that user.
- Only results for components to which the user has been granted access will be displayed. This is where the /data/{resource}/{uuid} policy mentioned above comes into play (a hedged example follows).
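For illustration, a data policy granting a user read access to one process group's data looks like this in authorizations.xml (both UUIDs here are placeholders):

<policy identifier="<policy-uuid>" resource="/data/process-groups/<component-uuid>" action="R">
  <user identifier="<user-uuid>"/>
</policy>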
07-07-2017
08:22 PM
What is the output of the following:
netstat -ant | grep LISTEN
07-07-2017
05:34 PM
@Sanaz Janbakhsh Good to hear! Can you mark the original answer I posted as accepted to close out this thread? Thanks, Matt
07-07-2017
05:32 PM
@Sanaz Janbakhsh We should try to avoid creating a new "Answer" for every correspondence here. I am not clear on what you mean by "blank page". Have you tried clearing your browser cache? What do you see in your NiFi's nifi-user.log when you try to access the https web address for your NiFi instance?
https://<nifinodename>:<secureport>/nifi
Thanks, Matt
07-07-2017
04:03 PM
@Sanaz Janbakhsh The users.xml and authorizations.xml files are generated on the initial startup of a secured NiFi instance using the configurations specified in the authorizers.xml file. Once these two files exist, any changes made in the authorizers.xml file will not be applied to them; the expectation is that the NiFi UI is used from that point on to add additional users and set additional authorizations. So if the initial authorizers.xml file had incorrect entries, the users.xml and authorizations.xml files created from it will not be correct. You will need to remove these two files and restart so that new users.xml and authorizations.xml files are created based on a correct configuration in the authorizers.xml. The users.xml and authorizations.xml contents you shared above are not correct, and neither is your authorizers.xml. Your authorizers.xml file should look something like this:
<authorizers>
<authorizer>
<identifier>file-provider</identifier>
<class>org.apache.nifi.authorization.FileAuthorizer</class>
<property name="Authorizations File">/var/lib/nifi/conf/authorizations.xml</property>
<property name="Users File">/var/lib/nifi/conf/users.xml</property>
<property name="Initial Admin Identity">CN=admin, OU=NIFI</property>
<property name="Legacy Authorized Users File"></property>
<property name="Node Identity 1">CN=nifinode1, OU=NIFI</property>
<property name="Node Identity 2">CN=nifinode2, OU=NIFI</property>
<property name="Node Identity 3">CN=nifinode3, OU=NIFI</property>
</authorizer>
</authorizers>
Each node in your cluster must have its own "Node Identity" entry.
You must specify an Initial Admin Identity. This will be the only user who can access your NiFi initially. They will be given the authorizations needed to add additional users and assign policies for those new users. Using the above example, the users.xml file that is generated should look like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tenants>
<groups/>
<users>
<user identifier="38e35829-435d-3be4-83b6-784cb560e855" identity="CN=admin, OU=NIFI"/>
<user identifier="22f1b808-a02d-3344-93c1-c944af6b5686" identity="CN=nifinode1, OU=NIFI"/>
<user identifier="ea71911e-b2f3-3975-a459-50c9f8e905d1" identity="CN=nifinode2, OU=NIFI"/>
<user identifier="e63552bb-6e32-346d-8b9d-d82ef1616ce9" identity="CN=nifinode3, OU=NIFI"/>
</users>
</tenants>
And your authorizations.xml that is generated should look like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<authorizations>
<policies>
<policy identifier="ba421219-28f1-3918-bc27-bf5533cb847e" resource="/flow" action="R">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="b56e3b5c-a458-3088-a4a6-30c9ad7ea69d" resource="/data/process-groups/f459ab3e-015c-1000-6a96-d0fd4c9da94c" action="R">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
<user identifier="22f1b808-a02d-3344-93c1-c944af6b5686"/>
<user identifier="ea71911e-b2f3-3975-a459-50c9f8e905d1"/>
<user identifier="e63552bb-6e32-346d-8b9d-d82ef1616ce9"/>
</policy>
<policy identifier="78c6edfa-7c8a-398e-8ffa-716820b5040b" resource="/data/process-groups/f459ab3e-015c-1000-6a96-d0fd4c9da94c" action="W">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
<user identifier="22f1b808-a02d-3344-93c1-c944af6b5686"/>
<user identifier="ea71911e-b2f3-3975-a459-50c9f8e905d1"/>
<user identifier="e63552bb-6e32-346d-8b9d-d82ef1616ce9"/>
</policy>
<policy identifier="b817348f-f27b-3b42-8b8c-040977436b45" resource="/process-groups/f459ab3e-015c-1000-6a96-d0fd4c9da94c" action="R">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="dd8ad42a-4266-3646-a804-f612245edbe3" resource="/process-groups/f459ab3e-015c-1000-6a96-d0fd4c9da94c" action="W">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="efd76cc8-fd81-3cd1-bf21-3065661848bd" resource="/restricted-components" action="W">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="c2f680ff-bec3-336b-8ed2-512321cc7162" resource="/tenants" action="R">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="d3840ff8-f56e-3d2c-8361-bab5cf498107" resource="/tenants" action="W">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="ff398473-528d-3393-85bc-cd6810f47d72" resource="/policies" action="R">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="a55e48e9-691f-3052-ae92-77fffb2858d6" resource="/policies" action="W">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="56f51845-8783-3a14-b22c-9971bf232b17" resource="/controller" action="R">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="ef41b898-79b8-3782-b01a-e54e5bf20661" resource="/controller" action="W">
<user identifier="38e35829-435d-3be4-83b6-784cb560e855"/>
</policy>
<policy identifier="19b83f2b-967e-35d5-8091-f4abc877877b" resource="/proxy" action="W">
<user identifier="22f1b808-a02d-3344-93c1-c944af6b5686"/>
<user identifier="ea71911e-b2f3-3975-a459-50c9f8e905d1"/>
<user identifier="e63552bb-6e32-346d-8b9d-d82ef1616ce9"/>
</policy>
</policies>
</authorizations>
Of course, all the UUIDs that are generated will be different. Thanks, Matt
*** If you found this answer addressed your question, please mark it as accepted.
07-07-2017
01:27 PM
@adrian white The TailFile processor is designed to tail a file and ingest new lines as they are written to that file. In your case you have static files that are not being written to. You will want to use the GetFile processor to ingest these complete files before using the SplitText processor to break them apart into one line per FlowFile (a minimal sketch follows). Thanks,
Matt
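A minimal sketch of that flow (the input directory is a placeholder; set Header Line Count to 1 if your CSV files have a header row):

GetFile
  Input Directory: /path/to/csv/files
  Keep Source File: false

SplitText
  Line Split Count: 1
  Header Line Count: 0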