Member since: 07-30-2019
Posts: 1975
Kudos Received: 1173
Solutions: 545
My Accepted Solutions
Views | Posted
---|---
126 | 04-12-2021 06:35 AM
93 | 04-12-2021 06:16 AM
72 | 04-12-2021 05:49 AM
242 | 03-29-2021 11:35 AM
315 | 03-24-2021 06:35 AM
01-13-2021
08:45 AM
1 Kudo
@Fierymech RouteText does not modify the content of the lines. It only routes lines to the newly produced FlowFiles; the content of those lines remains unchanged. The RouteText processor also does nothing with capture groups, so the entire regex is evaluated against each line.

I took the entire content from your "https://regex101.com/r/pdo6Ca/1" and the entire new regex from the same link and ran this flow. I produced only one source FlowFile, as you can see from the processor showing in = 1. You can see it routed "out" three FlowFiles: one to the connection with the "matched" relationship, which contains only one line since only one line matched the entire regex (regex101 is taking into account your capture groups). If you change the regex and run the same test again, you will see a few more lines match (those with the additional "yes yes ..." in the lines).

I attached the template I used, which you can import into your NiFi. The community does not support .xml files, so I changed the extension to .txt. You will need to change the extension back to .xml before you can import the template into your NiFi.

Hope this helps.
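To illustrate the per-line, whole-expression evaluation described above, here is a rough Java sketch (the pattern and sample lines are illustrative, not the exact ones from your regex101 link): capture groups can exist in the pattern, but they do not change whether a given line routes to "matched".

```java
import java.util.regex.Pattern;

public class LineMatchDemo {
    public static void main(String[] args) {
        // Hypothetical regex containing capture groups -- RouteText ignores the groups
        // and only cares whether the whole expression matches the whole line.
        Pattern p = Pattern.compile("\\s*(\\d+)\\s+(\\S+)\\s+(\\d+)\\s+\\S+\\s+\\S+\\s*");
        String[] lines = {
            "  0   abc-lr1-0   35189  20-Dec 03:43:54",
            "* 2   abc-rr1-0   35185  20-Dec 03:43:54"
        };
        for (String line : lines) {
            // matches() requires the entire line to satisfy the expression,
            // which mirrors RouteText's per-line evaluation.
            System.out.println(p.matcher(line).matches() + " -> " + line);
        }
    }
}
```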
01-13-2021
06:04 AM
1 Kudo
@Fierymech It may be helpful if you shared your RouteText processor configuration. Correct me if I am wrong, but you are looking to have all lines (minus the header lines) placed in a new FlowFile by themselves. Using your example data and the regex you provided:

wwwwww aa cc
# Name foo Since ddd/www dddd
-- --------- ----- --------------- --- --- ---------
0 abc-lr1-0 35189 20-Dec 03:43:54
1 abc-rr2-g 35209 20-Dec 03:43:54
* 2 abc-rr1-0 35185 20-Dec 03:43:54
* 15 abc-lr2-0 34686 20-Dec 03:43:54
16 abc-lr1-0 34631 20-Dec 03:43:54

The above would result in a FlowFile with only lines 0, 1, and 16. The header plus lines 2 and 15 would route to unmatched because of the leading "*", which does not match your regex. The result would be a FlowFile with:

0 abc-lr1-0 35189 20-Dec 03:43:54
1 abc-rr2-g 35209 20-Dec 03:43:54
16 abc-lr1-0 34631 20-Dec 03:43:54

A couple of things to check if what you are seeing is the entire original FlowFile getting routed to the "Original" and "Unmatched" relationships:
1. RouteText processor configuration. If I understand your use case correctly, it should be configured like this:
2. I noticed your sample data has leading and trailing whitespace, so make sure the processor is configured to ignore those.
3. Since your intent is to produce a new FlowFile with only the lines matching the regex, make sure you set the above Routing Strategy.
4. Make sure the correct Matching Strategy is selected. It should be what I have above.
5. Click on the "+" to add a new dynamic property for your regex. The property name becomes a new relationship on the processor where your matching lines will be routed.
6. Since you are evaluating the source FlowFile content line by line, make sure your regex does not have a line return at the end of it. Correct: Incorrect (notice line 2, which indicates a line return at the end of the regex):

When I ran a little test flow using your sample data and regex, I got the desired results:
- The "lines" relationship has one new FlowFile with content of only the 3 matching lines.
- The "unmatched" relationship contains a new FlowFile with content containing all the unmatched lines.
- The "original" relationship contains the original FlowFile that was processed by this processor.

If you don't care about the original or unmatched FlowFiles, you can simply auto-terminate those relationships instead of routing them out of the processor in connections as I did above. A small regex sketch illustrating point 6 follows below.

Hope this helps, Matt
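Here is a minimal Java sketch of point 6 above (the pattern and line are illustrative): an expression that ends in a line return will never match an individual line evaluated by RouteText, because the line itself carries no trailing newline.

```java
import java.util.regex.Pattern;

public class TrailingNewlineDemo {
    public static void main(String[] args) {
        String line = "0 abc-lr1-0 35189 20-Dec 03:43:54";

        // Same expression, with and without an accidental trailing line return.
        Pattern correct = Pattern.compile("\\s*\\d+\\s+\\S+\\s+\\d+\\s+\\S+\\s+\\S+\\s*");
        Pattern incorrect = Pattern.compile("\\s*\\d+\\s+\\S+\\s+\\d+\\s+\\S+\\s+\\S+\\s*\n");

        System.out.println("correct:   " + correct.matcher(line).matches());   // true
        System.out.println("incorrect: " + incorrect.matcher(line).matches()); // false
    }
}
```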
01-12-2021
05:49 AM
@Gcima009 It might be helpful to have more context around your issue. What action and/or NiFi component are you using when the exception occurs? What NiFi version are you using? Can you share the entire error log from the nifi-app.log? Thanks, Matt
01-12-2021
05:34 AM
@Raj123 I am not a Java developer, but NiFi is written in Java and the source code is open source. You would need to look at the code for the CSVReader to see how it handles Avro schema inference. Sorry that I cannot be of more help in this specific query.
01-12-2021
05:21 AM
@CristoE Since this question already has an accepted solution and is specific to a DistCp replacement for HDFS, it would be much better to start an entirely new question in the community. You can always add a link to this question's solution as a reference in your new question post. You would get more visibility that way, and we would not dilute the answer to this question with suggestions related to ADLS rather than HDFS.
01-08-2021
01:12 PM
@Raj123 NiFi offers many "record" based processors that support various record readers and writers. Those record readers have the ability to infer an Avro schema from the incoming records, and the record writer can be configured to write the inferred schema to an attribute on the outgoing FlowFile. There is no specific infer-schema processor for CSV source data; that would require a custom processor (perhaps one that utilizes the existing CSVReader controller service). Typically you would use a record-based processor to manipulate, split, or validate your records, so I am not sure of the value or use case for only wanting to infer the Avro schema. That being said, you can get that inferred schema, for example, by simply using the "ConvertRecord" processor with a "CSVReader" (configured to infer schema) and a "CSVRecordSetWriter" (configured to set the 'avro.schema' attribute). The written FlowFile will be the same as the source FlowFile, but it will have an additional "avro.schema" attribute containing the inferred Avro schema. ConvertRecord: CSVReader: CSVRecordSetWriter: Hope this helps, Matt
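As a rough illustration of the kind of value that ends up in that avro.schema attribute, here is a deliberately naive Java sketch that builds a record schema from a CSV header line, treating every column as a nullable string. This is not NiFi's actual inference logic (the CSVReader also samples values to pick numeric and other types), and the record name used is just a placeholder.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class NaiveCsvSchemaSketch {
    // Builds a trivial Avro record schema JSON from a CSV header, all fields nullable strings.
    static String inferSchema(String headerLine) {
        String fields = Arrays.stream(headerLine.split(","))
                .map(String::trim)
                .map(name -> "{\"name\":\"" + name + "\",\"type\":[\"null\",\"string\"]}")
                .collect(Collectors.joining(","));
        return "{\"type\":\"record\",\"name\":\"nifiRecord\",\"fields\":[" + fields + "]}";
    }

    public static void main(String[] args) {
        // Example header -> one possible inferred schema (all strings in this sketch).
        System.out.println(inferSchema("id,name,amount"));
    }
}
```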
01-05-2021
11:19 AM
1 Kudo
@garoosy You should look into using "ExecuteSQLRecord" instead of "ExecuteSQL" for large-volume data. To be efficient here you would have many records in a single FlowFile; right now you have a single record per FlowFile, which is not going to be very efficient. The only way for "ExecuteSQL" to handle multiple FlowFile executions in a single connection is if the SQL statement used for every FlowFile is identical. In order to do that, the unique values would need to come from FlowFile attributes. You may find these posts helpful:
https://community.cloudera.com/t5/Support-Questions/Nifi-ExectueSQL-how-to-force-a-parameter-to-be-a-string/td-p/240117
https://stackoverflow.com/questions/63330790/using-nifi-executesqlrecord-with-parameterized-sql-statements
If you have threads that never seem to complete (you will see a small number in the upper-right corner of the processor), it is best to get a series of thread dumps (4 - 6) to verify the thread is not progressing. Then you have to determine what the thread is waiting on. Did you try setting a "Max Wait Time" on the processor? It defaults to 0, which means it would wait forever. A parameterized-query sketch follows below. Hope this helps, Matt
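For reference, this is the general idea of keeping one fixed statement and binding per-execution values as parameters, sketched with plain JDBC. The table, columns, and in-memory database URL here are hypothetical; in NiFi the analogous mechanism is a query with ? placeholders and values supplied through FlowFile attributes (see the processor's usage docs for the exact attribute naming in your version).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class ParameterizedQuerySketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical in-memory H2 database so the sketch is self-contained (requires the H2 driver).
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo")) {
            try (Statement ddl = conn.createStatement()) {
                ddl.execute("CREATE TABLE orders (id INT, customer_id VARCHAR(10))");
                ddl.execute("INSERT INTO orders VALUES (1, '1001'), (2, '1002')");
            }
            // One identical SQL statement; only the bound value changes per execution.
            String sql = "SELECT id FROM orders WHERE customer_id = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (String customerId : new String[]{"1001", "1002"}) {
                    ps.setString(1, customerId);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            System.out.println(customerId + " -> order id " + rs.getInt("id"));
                        }
                    }
                }
            }
        }
    }
}
```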
01-05-2021
10:49 AM
@kiranps11 Did you add and start a "DistributedMapCacheServer" controller service running on port 4557? The "DistributedMapCacheClientService" controller service only creates a client that is used to connect to a server you must also create. Keep in mind that the DistributedMapCacheServer does not offer High Availability (HA). Enabling this controller service will start a DistributedMapCacheServer on each node in your NiFi cluster, but those servers do not talk to each other. This is important to understand since you have configured your DMC client to use localhost. This means that each node in your cluster would be using its own DMC server rather than a single DMC server. For an HA solution you should be using an external map cache via one of the other client offerings like "HBase_2_ClientMapCacheService" or "RedisDistributedMapCacheClientService", but this would require you to set up that external HBase or Redis server with HA yourself. Hope this helps, Matt
01-05-2021
10:29 AM
1 Kudo
@Boenu Beginning with Apache NiFi 1.12, a change was implemented that set the default for anonymous access to static resources to false. This was done as part of https://issues.apache.org/jira/browse/NIFI-7170. This change also added the ability to add an additional property to the nifi.properties file to restore the behavior of Apache NiFi 1.11 and older versions:
nifi.security.allow.anonymous.authentication=true
The above restores the previous behavior while work is done to change how NiFi handles access to these static endpoints. The following Jiras cover that work:
https://issues.apache.org/jira/browse/NIFI-7849
https://issues.apache.org/jira/browse/NIFI-7870
Hope this helps, Matt
01-04-2021
08:30 AM
@kalhan While it is possible to have a single ZK cluster to support multiple services, the recommendation is that NiFi have its own dedicated ZK cluster. NiFi cluster stability is dependent on ZK, and many of the NiFi processors that can be used depend on cluster state, which is also stored in ZK. If ZK becomes overburdened it can affect the overall stability and performance of NiFi. If you found any of the answers provided on this query helpful, please select "accept solution" on each of them. Thank you, Matt. Hope this helps.
01-04-2021
08:23 AM
1 Kudo
@adhishankarit As I mentioned, you need an additional unique attribute that you only add on the failure path (the Construct HDFSError UpdateAttribute) before MergeContent:
overall-status = ERROR
Since this attribute (overall-status) is not being set on the success path, the MergeContent "Attribute Strategy" set to "Keep All Unique Attributes" will then set this overall-status attribute on the merged FlowFile produced. Keep All Unique Attributes --> any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. Since you are not setting this attribute on your success-path FlowFiles, it would only be set on merged FlowFiles where one or more FlowFiles traversed the failure flow path. This allows you to capture the overall-status of the zip bundle. Then in your ReplaceText processor you would use a more complex NiFi Expression Language (EL) expression in your replacement value. Something like:
${uniquefile}:${overall-status:isNull():ifElse('success','${overall-status}')}:${message}
This will set "success" if the "overall-status" attribute does not exist on any FlowFiles that were part of the merge; otherwise it will set it to the value in the "overall-status" attribute. A small sketch of the unique-attribute merge behavior follows below. If you found this helpful, please take a moment to click "accept solution" on all responses that helped. Matt
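Here is a small, non-NiFi Java sketch of the merge rule described above (attribute names taken from this thread; this illustrates the described behavior, not NiFi's implementation): an attribute survives the merge only if no two bundled FlowFiles disagree on its value.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class KeepAllUniqueAttributesSketch {
    // An attribute is kept unless two bundled FlowFiles carry the same key with different values.
    static Map<String, String> merge(List<Map<String, String>> flowFileAttributes) {
        Map<String, String> merged = new HashMap<>();
        Set<String> conflicting = new HashSet<>();
        for (Map<String, String> attrs : flowFileAttributes) {
            for (Map.Entry<String, String> e : attrs.entrySet()) {
                String existing = merged.putIfAbsent(e.getKey(), e.getValue());
                if (existing != null && !existing.equals(e.getValue())) {
                    conflicting.add(e.getKey());
                }
            }
        }
        merged.keySet().removeAll(conflicting);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> successPath = Map.of("uniquefile", "part-1.txt");
        Map<String, String> failurePath = Map.of("uniquefile", "part-2.txt", "overall-status", "ERROR");
        // overall-status=ERROR survives (only one FlowFile set it);
        // uniquefile conflicts across the two FlowFiles and is dropped.
        System.out.println(merge(List.of(successPath, failurePath)));
    }
}
```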
12-28-2020
10:33 AM
@adhishankarit Your dataflow screenshot does not reflect the entire dataflow you are trying to describe, which makes this use case hard to follow.
1. Your flow starts with a single zip file?
2. You unzip that file to produce numerous output FlowFiles?
3. You use load-balanced connections to distribute all the produced FlowFiles across all nodes in your cluster?
4. Then you modify the content of each FlowFile using an AttributesToJSON processor (Destination = flowfile-content)? It looks like you route the "success" relationship twice from this processor, which means you have cloned your FlowFiles. Why?
5. One of these connections looks like it uses a load-balanced connection (how is it configured?) to feed a MergeContent. MergeContent cannot merge across multiple nodes (it can only merge FlowFiles on the same node). How is MergeContent configured? Your desired output does not look like JSON, yet you are using an AttributesToJSON processor?
6. Where do the "failure" FlowFiles get introduced into this dataflow?
When you unpack your original FlowFile, each produced FlowFile will have new attributes set on it, including segment.original.filename, fragment.identifier, fragment.count, and fragment.index. These attributes can be used with the "Defragment" merge strategy in MergeContent. So I would avoid cloning FlowFiles post unpack. Process each FlowFile in-line. When you encounter a "failure", set an attribute on those FlowFiles only that states a failure occurred (successfully processed FlowFiles should not have this unique attribute). Then use MergeContent and set Keep All Unique Attributes. This will allow the unique attribute, if it exists on any one FlowFile, to show up on the output merged FlowFile (it will not work if the same attribute exists on multiple FlowFiles with different values). Now after the merge you can modify the content again using a ReplaceText processor configured with Append to add a first line with the overall status of this file from that unique attribute you preserved through the merge. I am also not following this statement: "also noticed that if there is a delay in processing". Hope this helps, Matt
12-28-2020
09:59 AM
@JelenaS You would need to share some screenshots of the policies/permissions you have set on the bucket(s) you have created in your NiFi-Registry.
- Go to "Settings" (wrench icon, upper-right corner within NiFi-Registry).
- Under "BUCKETS", click the pencil icon for the bucket you expect your user to see.
- Your NiFi user which is logged in to NiFi should have write, delete, and read on the bucket.
It would also be helpful to see what "Special Privileges" you have set for each of your NiFi nodes inside NiFi-Registry as well.
- Go to "Settings" (wrench icon, upper-right corner within NiFi-Registry).
- Under "USERS", click the pencil icon for each of your NiFi nodes.
- Each of your NiFi nodes (case sensitive) should have "Can proxy user requests" checked and read on "Can manage buckets".
12-24-2020
07:35 AM
@adhishankarit There is nothing you can pull from the NiFi REST API that is going to tell you about successful outcomes from processor execution on a FlowFile. Depending on data volumes, this also sounds like a resource-expensive endeavor. That being said, NiFi does have a Site-To-Site (S2S) bulletin reporting task. When a processor throws an error it will produce a bulletin, and this reporting task can capture those bulletins and send them via NiFi's S2S protocol to another NiFi instance directly into a dataflow, where you can handle them via dataflow design however you like. The only way you can get INFO-level logs into bulletins is by setting the bulletin level to INFO on all your processors. This only works if you have also configured your NiFi logback.xml so that all NiFi components log at the INFO level as well. Downsides to this:
1. Every processor would display the red bulletin square in the upper-right corner, which makes using bulletins to find components that are actually having issues difficult.
2. This results in a lot of INFO-level logging to the nifi-app.log.
You mention edge nodes. You could set up a TailFile processor that tails the nifi-app.log and sends that log data via FlowFiles to some monitoring NiFi cluster, where another dataflow parses those records via a PartitionRecord processor by log_level and then routes based on that log_level for additional handling/notification processing. Downside here:
1. Since you want to track success, you still need INFO-level logging enabled for all components. This means even this log collection flow is producing log output, so you get large logs, and logs being written even when actual data processing in other flows is not happening.
NiFi does have a master bulletin board which you could hit via the REST API, but this does not get you past the massive logging you may be producing to monitor success. https://nifi.apache.org/docs/nifi-docs/rest-api/index.html Hope this gives you some ideas, Matt
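If you do decide to poll the bulletin board over the REST API, a minimal Java sketch might look like the following. The host is hypothetical, the endpoint path is my recollection of the REST API documentation linked above (verify it against your NiFi version), and a secured NiFi would additionally need a bearer token or client certificate.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BulletinBoardPoll {
    public static void main(String[] args) throws Exception {
        // Hypothetical unsecured NiFi host; adjust host, port, and auth for your environment.
        String url = "http://nifi-host:8080/nifi-api/flow/bulletin-board?limit=20";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // The response body is JSON containing the most recent bulletins (level, source, message).
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```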
12-24-2020
07:15 AM
@adhishankarit This only works if the intent is to replace or append the FlowFile attributes to the content of the FlowFile. But you will still not end up with a single FlowFile with all 15 attributes. ReplaceText does not merge anything and only has access to the FlowFile attributes of the FlowFile being executed upon. You would still need to merge the content of those 2 FlowFiles to have a single FlowFile with all 15 attributes, which would then exist in the content of the new FlowFile.
12-24-2020
07:10 AM
@wasimakram054 First we need to answer some questions here:
1. What about the content of these two FlowFiles? Do both have the same content? How do you want to handle that when you merge these FlowFiles?
2. Are all 15 of these attributes unique? Meaning that the same attribute name does not exist on both FlowFiles.
Scenario 1: Let's assume both FlowFiles have the same content.
1. Then you could use the ModifyBytes processor to remove all content from only one of those FlowFiles.
2. Then use the MergeContent processor to merge those two FlowFiles and set the property Keep All Unique Attributes. This will result in one FlowFile with content from only the FlowFile that had content, and all unique attributes from both source FlowFiles set on the new output FlowFile.
Scenario 2: Each FlowFile has unique content.
1. You could still use MergeContent just as in Scenario 1, and the resulting FlowFile will have all the unique attributes and the merged content from both source FlowFiles.
Another option: You could perhaps use the PutDistributedMapCache processor to write the desired attributes to a cache server, then use FetchDistributedMapCache to retrieve the needed attributes and place them on the other FlowFile. This will not perform as well, and under volume you would need to ensure you only process one set of FlowFiles at a time. It can be done, but it is not as performant and adds complexity to the dataflow design. Hope this helps, Matt
12-23-2020
01:39 PM
@Anurag007 You did not share how your logs are getting into NiFi. But once ingested, you could use a PartitionRecord processor with one of the following readers to handle parsing your log files:
- GrokReader
- SyslogReader
- Syslog5424Reader
You can then use your choice of Record Writer to output your individual split log outputs. You would then add one custom property that is used to group like log entries by the log level. This custom property will become a new FlowFile attribute on the output FlowFiles. You can then use a RouteOnAttribute processor to filter out only FlowFiles where the log_level is set to ERROR.
Here is a simple flow I created that tails NiFi's app log, partitions logs by log_level, and then routes log entries for WARN or ERROR. I used the GrokReader with the following Grok Expression:
%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:thread}\] %{DATA:class} %{GREEDYDATA:message}
I then chose to use the JsonRecordSetWriter. The dynamic property I added to PartitionRecord:
Property = log_level
Value = /level
In my RouteOnAttribute processor, I can route based on that new "log_level" attribute that will exist on each partitioned FlowFile, using two dynamic properties which each become a new relationship:
property = ERROR
value = ${log_level:equals('ERROR')}
property = WARN
value = ${log_level:equals('WARN')}
Hope this helps, Matt
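For intuition only, here is a rough Java sketch of the same partition-then-route idea outside NiFi. The regular expression is a simplified stand-in for the Grok expression above, and the sample log lines are made up.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LogLevelPartitionSketch {
    public static void main(String[] args) {
        // Simplified equivalent of the Grok expression: timestamp, level, thread, class, message.
        Pattern logLine = Pattern.compile("(\\S+ \\S+) (\\w+) \\[(.*?)\\] (\\S+) (.*)");
        List<String> lines = List.of(
                "2020-12-23 13:39:00,123 INFO [Timer-Driven Process Thread-1] o.a.n.SomeClass all good",
                "2020-12-23 13:39:01,456 ERROR [Timer-Driven Process Thread-2] o.a.n.OtherClass something failed");

        // "Partition" by the captured level, then "route" only ERROR/WARN entries.
        Map<String, List<String>> byLevel = lines.stream()
                .map(logLine::matcher)
                .filter(Matcher::matches)
                .collect(Collectors.groupingBy(m -> m.group(2),
                        Collectors.mapping(m -> m.group(5), Collectors.toList())));
        byLevel.forEach((level, messages) -> {
            if (level.equals("ERROR") || level.equals("WARN")) {
                System.out.println(level + " -> " + messages);
            }
        });
    }
}
```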
12-23-2020
12:48 PM
@JelenaS You are correct that this sounds like an authorization issue. I recommend tailing the nifi-registry-app.log and then performing the action of trying to version control a Process Group within NiFi's UI.
How are you handling user authorization in your NiFi and NiFi-Registry?
- File-based authorization (users.xml and authorizations.xml)?
What identity.mapping patterns have you configured in your NiFi and NiFi-Registry?
How are you authenticating the user that accesses both NiFi and NiFi-Registry?
The only buckets that would be returned are those buckets to which the authenticated user in NiFi has access in NiFi-Registry. Keep in mind that the user/client strings in NiFi that are passed to NiFi-Registry must match exactly. Nodes will pass their full DN when they proxy the request on behalf of the authorized user; the user string will be passed as-is. That means identity mapping patterns will be applied on the NiFi-Registry side against those NiFi DNs, and the resulting mapped value must match the client string added as a user in NiFi-Registry. The passed user string must match exactly (case sensitive) or it is treated as a different user. Hope this helps, Matt
12-23-2020
08:28 AM
@pjagielski It is always helpful to share the exact NiFi version you are running, as there may be known issues we can point you to. Assuming here that you may be running the latest Apache NiFi 1.12 release, my first thought is that this may be related to this issue: https://issues.apache.org/jira/browse/NIFI-7992 While your content repo is not filling up, I would suggest inspecting your logs to see how often content claims are being moved to archive. A background thread then removes those claims as a result of your archive settings. Hope this helps, Matt
12-23-2020
07:46 AM
@Nyk That is correct, dynamically added properties are all of type non-sensitive. You would need to build a custom processor with static configurable properties that have a PropertyDescriptor with ".sensitive(true)". I am not a developer myself, but you may find this resource useful: https://itnext.io/deep-dive-into-a-custom-apache-nifi-processor-c2191e4f89a0 A minimal sketch of such a property descriptor follows below. If you found my answer addresses your question, please click on "Accept as Solution" below the answer. Hope this helps you, Matt
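To make the ".sensitive(true)" idea concrete, here is a minimal sketch of a static sensitive property in a custom processor. The property name and validator choice are illustrative, and a real processor would also expose this descriptor via getSupportedPropertyDescriptors().

```java
import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.processor.util.StandardValidators;

public class SensitivePropertyExample {
    // A static, configurable property whose value NiFi treats as sensitive (never shown in the UI),
    // unlike dynamically added properties, which are always non-sensitive.
    public static final PropertyDescriptor API_PASSWORD = new PropertyDescriptor.Builder()
            .name("API Password")
            .description("Password used to authenticate to a downstream service (illustrative).")
            .required(true)
            .sensitive(true)
            .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
            .build();
}
```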
12-23-2020
07:27 AM
1 Kudo
@murali2425 I was not able to reproduce the missing attributes in the content of a FlowFile produced by the AttributesToJson processor. What version of NiFi are you using? Did you inspect the attributes on the FlowFile in the immediate connection feeding the AttributesToJson before starting that processor? Your desired output from the AttributesToJson seems to be very specific and does not include all attributes, including the core attributes, anyway. My suggestion would be to use an UpdateAttribute processor just before your AttributesToJson processor to build the specific attributes you want to have in your produced JSON output content.
You would then add two custom dynamic properties where you would use NiFi Expression Language to populate the values from other attributes/metadata on the source FlowFile.
You could then configure your AttributesToJson processor to build the JSON content using only those two new attributes you just constructed.
Keep in mind that the AttributesToJson processor will add attributes to the JSON in lexicographical order. So if you want the uuid before the filepath, you will need to adjust the property names used in the UpdateAttribute processor, for example "auuid" instead of "myuuid", so that it comes before "filepath" in order. Hope this helps,
Matt
12-23-2020
06:34 AM
@Nyk You'll need to provide more context around your use case to get a good answer. It is not clear what you are trying to accomplish here.
- Are you trying to build your own custom NiFi processor where you want to define some configurable properties as type sensitive?
- Are you trying to use the "+" icon to add a new dynamic property to an existing NiFi processor and hide/encrypt the value set in that property? In this case, that is not possible, since all dynamic properties are non-sensitive.
The use of InvokeScriptedProcessor you saw was a recommendation to use it rather than the ExecuteScript processor. That is because, via the custom script you write for InvokeScriptedProcessor, you can define additional processor properties as sensitive, to be used within that processor only. But it looks like there may be some issues around that ability, outlined here: https://issues.apache.org/jira/browse/NIFI-7012 Hope this helps, Matt
12-23-2020
06:15 AM
@te04_0172 It appears you have hit a known issue:
https://issues.apache.org/jira/browse/NIFI-7954
https://issues.apache.org/jira/browse/NIFI-7831
It looks like these will be addressed in Apache NiFi 1.13. These fixes have already been incorporated into the Cloudera HDF 3.5.2 release that is currently available. Hope this helps, Matt
12-01-2020
06:53 AM
1 Kudo
@dzbeda Seeing "localhost" in your shared log output leads to what may be the issue. When you configure the URL in the Remote Process Group (RPG), it tells that NiFi RPG to communicate with that URL to fetch the Site-To-Site (S2S) details. Included in those returned details are things like:
- the number of nodes in the target NiFi cluster (if standalone, only one host is returned)
- the hostnames of those node(s) (in this case it looks like maybe localhost is being returned)
- the configured RAW port, if configured
- whether the HTTP transport protocol is enabled
- etc.
So when your RPG actually tries to send FlowFiles over S2S, it is trying to send to localhost, which resolves to itself rather than the actual target Linux NiFi it fetched the S2S details from. When some properties are left unconfigured, NiFi returns whatever the OS resolves. I am going to guess your Linux server is returning localhost rather than the actual hostname. You will want to verify your S2S configuration setup in the target NiFi (Linux server): http://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#site_to_site_properties Try setting "nifi.remote.input.host" to see if that helps you. Hope this helps, Matt
11-23-2020
05:56 AM
@venkii User/client authentication via a user certificate and authentication via a login provider are handled completely differently. Looking at your login-identity-providers.xml, I see you have configured the following: <property name="Authentication Strategy">START_TLS</property> However, you have not configured any of the TLS properties in the provider. Are you sure "START_TLS" is what you want to be using here? Your LDAP URL looks to be using just ldap:// and the default unsecured port. If that is the case, the "Authentication Strategy" should be set to "SIMPLE" instead of "START_TLS". The exception points to an SSL handshake failure. It implies that the server certificate returned by the LDAP server did not match (in either the DN or a SAN entry) the hostname used in the LDAP URL configured in the login provider. So in this case you either need to switch to "SIMPLE" (if possible) or validate the server certificate being returned by your LDAP server and set up the needed TLS properties in your provider. Hope this helps, Matt
11-23-2020
05:34 AM
@dzbeda Can you share a little more about your use case? NiFi does not expire data that is actively queued within connections between components added to the NiFi canvas, so I am a bit curious about the "I don't want to lose data" statement you made. It is true that during times of "connectivity issues between the sites" FlowFiles may accumulate within the connection queues, resulting in more storage being needed to hold that queued data while you wait for connectivity to be restored, but that is still not a concern for "data loss" unless your ingest is using some unconfirmed transfer protocol like UDP. NiFi's Site-To-Site protocol used by the Remote Process Groups uses a two-phase commit to avoid data loss. Backpressure settings on each connection control how many FlowFiles can queue before the component feeding FlowFiles into the connection is no longer allowed to execute. So in an extended outage or under high volume, backpressure could end up being applied to every connection, from the last component in your dataflow back to the first. Default object thresholds are 10,000 FlowFiles or 1 GB of content size. Keep in mind these are soft limits, and it is not advisable to simply set backpressure to some much larger value. I recommend reading the following article: https://community.cloudera.com/t5/Community-Articles/Dissecting-the-NiFi-quot-connection-quot-Heap-usage-and/ta-p/248166 As far as what happens when the content repo(s) are full (NiFi allows you to configure multiple content repos per NiFi instance): NiFi simply cannot generate any new content. So any component that tries to create new content (at ingest, or via some processor that modifies the content of an existing FlowFile) will simply fail when it tries to do so, with an out-of-disk-space exception. This does not mean data loss (unless, as I mentioned, your ingest or egress uses an unconfirmed protocol). The component will simply try again until it is successful once disk space becomes available (for example, when connectivity returns and data can be pushed out). Using good protocols would result in data remaining at the source once backpressure is applied all the way back to your ingest-based components. NiFi archiving has nothing to do with how long FlowFiles are kept in NiFi's dataflow connections. Archiving holds FlowFiles after they have successfully been removed from the dataflow (reached a point of auto-termination). Archiving allows you to view old FlowFiles that are no longer queued, or replay a FlowFile from any point in your dataflow. However, there is no bulk replay capability, so it is not useful for that. https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418 Hope this helps, Matt
10-22-2020
08:27 AM
@SandeepG01 What the ERROR is telling you is that some component that references the parameter context failed to transition to a stopped state. When a parameter context is referenced by a component (a NiFi processor, for example), any change to the value assigned to that parameter requires a restart of those components so that the new parameter value is read in. When a component like a NiFi processor is asked to stop, it first transitions into a "Stopping" state. Once any active threads on that component complete, the processor transitions to a "Stopped" state; at this point the component could be started again. So there are two possible scenarios here, since NiFi will not interrupt an active thread:
1. A component had a long-running thread and the restart process timed out waiting for it to transition to stopped (for the thread to complete and exit). It is quite possible that this component did finally finish its thread execution after this timeout, and another attempt to apply the parameter context change would then be successful.
2. A component has a hung thread, meaning the active thread on the component is making no progress because it is either waiting indefinitely on some response/action or is being blocked by some other thread. In this case it would never complete its execution and transition to the "Stopped" state.
First you need to identify the processor with the hung thread (processors show a small number in the upper-right corner when a thread is active). Then you can take a series of thread dumps (./nifi.sh dump <dump-file-01>) and compare them to see if you have a thread stuck in the same state in every dump (which shows no thread progress). The only way to kill a hung thread is via a NiFi restart. NiFi processors offer a terminate option, but that does not kill the thread; it simply disassociates that thread from the processor (think of putting it in an isolation box). The FlowFile that was being processed by that thread is rolled back onto the incoming connection. Should that terminated thread ever return anything (let's say it was not hung, but just a very long-running thread), its output is just sent to null since it was terminated. But again, a truly hung thread will not go away until a NiFi restart, even if you selected terminate on the component. You may also inspect the nifi-app.log for the ERROR exception to see if it is followed by a stack trace that may give more insight. Hope this helps, Matt
10-19-2020
08:34 AM
@sarath_rocks55 If you are looking for assistance with an issue, please post a "question" in the community rather than adding a comment to a community article. Comments on community articles should be about the article content only. Thank you
10-13-2020
05:59 AM
@nikolayburiak Have you tried defining the keytab and principal directly in the HBase_2_ClientService configuration rather than using the KeytabCredentialsService, to see if ticket renewal works correctly? This may get you past the issue for now and also help identify whether the issue is potentially with the controller service. Thanks, Matt
09-24-2020
11:34 AM
1 Kudo
@Jarinek I am not completely clear what you mean by "needs to be initialized by data". NiFi processor components transfer FlowFiles between processors via connections. Those connections can consist of one or more relationships, and relationships are defined by each processor component's code. There are many stock processors for ingesting data (for example: ListenTCP, ListenHTTP, QueryDatabase*, SelectHive*, etc.). From an input processor component, the FlowFile would be routed, if successful, to the "success" relationship. That "success" relationship would be routed via a connection as input to your custom processor. Your custom processor code would then need to pull FlowFile(s) from the inbound connection(s) queue, process them, and then place the resulting FlowFile on one or more relationships defined by your processor code based on the outcome of that processing. A minimal skeleton of such a processor follows below. There are numerous blogs online with examples of building custom NiFi components. I suggest starting by reading the Apache NiFi developer's guide: https://nifi.apache.org/developer-guide.html Then look at some sample blogs like:
https://community.cloudera.com/t5/Community-Articles/Build-Custom-Nifi-Processor/ta-p/244734
https://www.nifi.rocks/developing-a-custom-apache-nifi-processor-json/
https://medium.com/hashmapinc/creating-custom-processors-and-controllers-in-apache-nifi-e14148740ea
*** Note: Regardless of what you read in the above blogs, keep in mind the following:
1. Do NOT add your custom nar to the default NiFi lib directory. It is advisable that you define a custom lib directory in the nifi.properties file just for your custom components. Refer to the Apache NiFi Admin Guide for more detail: https://community.cloudera.com/t5/Community-Articles/Build-Custom-Nifi-Processor/ta-p/244734
2. Avoid building more functionality than needed into a single processor component. It makes reuse in different use cases harder.
Hope this helps, Matt
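For orientation, here is a minimal, illustrative Java skeleton of the flow described above (pull a FlowFile from the inbound queue, process it, transfer it to a relationship). The class and relationship names are made up, and a real processor would also declare property descriptors and the usual @Tags/@CapabilityDescription annotations.

```java
import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class MyCustomProcessor extends AbstractProcessor {

    // Relationships are defined in code; each one becomes a routable output of the processor.
    public static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success").description("FlowFiles processed without error").build();
    public static final Relationship REL_FAILURE = new Relationship.Builder()
            .name("failure").description("FlowFiles that could not be processed").build();

    @Override
    public Set<Relationship> getRelationships() {
        return Set.of(REL_SUCCESS, REL_FAILURE);
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        // Pull one FlowFile from the inbound connection queue(s).
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return; // nothing queued yet
        }
        try {
            // ... process the FlowFile content/attributes here ...
            session.transfer(flowFile, REL_SUCCESS);
        } catch (Exception e) {
            getLogger().error("Processing failed", e);
            session.transfer(flowFile, REL_FAILURE);
        }
    }
}
```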