Member since: 07-29-2020
Posts: 574
Kudos Received: 323
Solutions: 176
My Accepted Solutions
| Views | Posted |
|---|---|
| 2155 | 12-20-2024 05:49 AM |
| 2451 | 12-19-2024 08:33 PM |
| 2201 | 12-19-2024 06:48 AM |
| 1463 | 12-17-2024 12:56 PM |
| 2101 | 12-16-2024 04:38 AM |
12-01-2023
12:31 PM
2 Kudos
@ChuckE, Regarding the state being persistent: you can actually clear it by right-clicking the processor, selecting "View state", and then clicking the "Clear state" link. This resets the state to the initial value.

Regarding your second question about initializing more than one variable: you can define as many stateful variables as you need in one processor, but they all share a single initial value. If you need different initial values for different stateful variables, you have to create a separate UpdateAttribute processor for each group of variables that shares a common initial value.

Another option, which I have not tried, lets you use a single processor: use the Advanced option to define rules that set different initial values based on a common condition. You have to be careful, though, about how the first values get set on the first flowfile. For example, say you have two stateful variables Attr1 & Attr2, where the first flowfile should have Attr1 = 0 & Attr2 = 1, and both are incremented afterward. Set the Initial Value to Empty String, since some value is required when using stateful variables. Under Advanced, define two rules: one to initialize Attr1 to 0 and another to initialize Attr2 to 1 when each is still the empty string (see the sketch below for both rules).

Make sure to set the FlowFile Policy (top right) to "use original"; otherwise, with "clone", flowfiles will be duplicated for each matched rule. When the same attribute is set in both Advanced mode and basic mode, the Advanced rules take precedence whenever they match, so the increment won't run the first time: the first flowfile gets the initial values because the rules are satisfied, and on the second flowfile the rules no longer match, so the increment happens.

Depending on what you want to see on the first flowfile, you can adjust the initial values accordingly. I'm not sure this will work for all scenarios, but you can try; otherwise use separate processors as I described above. Also, if anyone thinks this goes against best practices or might cause problems, please advise. If you find this helpful please accept the solution. Thanks
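To make this concrete, here is a rough sketch of how that configuration could look. The getStateValue() function is the expression UpdateAttribute exposes when state is enabled; the rule names and layout below are my own reconstruction, not a screenshot from the actual flow:

UpdateAttribute (Store State = "Store state locally", Stateful Variables Initial Value = empty string)

Basic tab (dynamic properties, the increments):
Attr1 = ${getStateValue('Attr1'):plus(1)}
Attr2 = ${getStateValue('Attr2'):plus(1)}

Advanced tab (FlowFile Policy = "use original"):
Rule "init Attr1" - Condition: ${getStateValue('Attr1'):equals('')} - Action: set Attr1 = 0
Rule "init Attr2" - Condition: ${getStateValue('Attr2'):equals('')} - Action: set Attr2 = 1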
12-01-2023
07:30 AM
1 Kudo
Hi @Fayza , I assume you set the credentials in the Request Username & Request Password properties. Can you set "Response Generation Required" to true? This makes sure the processor captures all kinds of responses regardless of the HTTP status. Another thing: do you have any special header attributes that need to be part of the request? For example Content-Type, which corresponds to the Request Content-Type property, or any custom header?
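For reference, this is roughly the InvokeHTTP configuration I have in mind; every value below is a placeholder, and the last line shows how custom headers are attached (InvokeHTTP sends dynamic properties as request headers):

HTTP Method: POST
HTTP URL: https://example.com/api
Request Username: <your username>
Request Password: <your password>
Response Generation Required: true
Request Content-Type: application/json
X-Custom-Header (dynamic property): <value>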
12-01-2023
05:37 AM
3 Kudos
@ChuckE, This worked for me. In the UpdateAttribute configuration, notice the "Stateful Variables Initial Value" is set to 1. The flowfile on the UpdateAttribute success relationship comes out with the stateful attribute set, and if I run the flow again, the new flowfile carries the incremented value. It's strange that it did not work for you when you set the initial value to 1; I don't see anything else wrong. If it still doesn't work, can you try upgrading NiFi, just to rule out a bug in 1.19.1? I use 1.20 for testing, but you can upgrade to the latest. If you find this helpful please accept the solution. Thanks
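For anyone trying to reproduce this, a minimal sketch of the setup I'm describing; the attribute name "counter" is just my example, not taken from the original flow:

UpdateAttribute:
Store State: Store state locally
Stateful Variables Initial Value: 1
counter (dynamic property): ${getStateValue('counter'):plus(1)}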
11-30-2023
03:21 PM
@yan439, I'm not sure I understand. I thought you already had the schema defined in the registry with the correct column names and data types. Can you elaborate on how the Avro schema came about, and whether it is the same one you are using in the registry?
11-30-2023
07:40 AM
Thanks @MattWho , As far as the Managed-Authorizer goes, I usually configure my access using the LDAP provider, but without granting my AD account any access first, I won't be able to log in to NiFi. So I use the Single-User-Provider with the auto-generated username and password to grant myself access in NiFi, and only then switch to the ldap-provider and log in. Not sure if this is the right way to do it; let me know what you think. Thanks
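For context, the switch I'm describing amounts to flipping one property in nifi.properties once my AD account has the right policies; the provider identifiers below assume the default names from login-identity-providers.xml:

# bootstrap phase, auto-generated credentials
nifi.security.user.login.identity.provider=single-user-provider
# after granting my AD account access, switch to LDAP
nifi.security.user.login.identity.provider=ldap-provider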
11-30-2023
07:32 AM
Hi @scoutjohn , Your spec can be written as follows: [
{
"operation": "shift",
"spec": {
"*": "&",
"serviceOrderItem": {
"*": {
"*": "serviceOrderItem.[&1].&",
"service": {
"*": "serviceOrderItem.[&2].service.&",
"supportingService": {
"$": "serviceOrderItem.[&3].service.serviceCharacteristic[#].name",
"@": "serviceOrderItem.[&3].service.serviceCharacteristic[#].value"
}
}
}
}
}
}] I did not add the "modifyPath" part because I did not see anything related to this object in the provided JSON input. Notice how I used # for serviceCharacteristic[#].name & serviceCharacteristic[#].value, which tells the spec to group everything into one object under the serviceCharacteristic array. If you find this helpful please accept the solution. Thanks
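To illustrate what the $ / @ / # lines do, here is a hypothetical, heavily trimmed input and the output the spec is intended to produce for it; the field names and values are made up, not taken from your post:

Input:
{
  "serviceOrderItem": [
    {
      "id": "item-1",
      "service": {
        "state": "active",
        "supportingService": "gold"
      }
    }
  ]
}

Output:
{
  "serviceOrderItem": [
    {
      "id": "item-1",
      "service": {
        "state": "active",
        "serviceCharacteristic": [
          {
            "name": "supportingService",
            "value": "gold"
          }
        ]
      }
    }
  ]
}

The $ grabs the matched key name ("supportingService") into name, the @ grabs its value into value, and the # keeps the pair together in a single entry of the array.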
11-29-2023
06:16 AM
1 Kudo
Since you said that my proposed solution worked for the first part, can you accept the solution and then open a new ticket for the latest question, since it is a little different from your first one? Also, regarding the requirement in your latest post, I'm having a hard time understanding what you are trying to do, and I have the following questions that I hope you can answer, or clarify, if you decide to open a new ticket:

1- What do you mean by a max queue size of 250? How do you set that? Is it a batch process where each batch processes a total of 250, or should you never have more than 250 flowfiles at a given time?

2- You say you want them to retire in 24 hours. Is this for all 250 flowfiles? If so, how is that going to work when you also say you want to retire one flowfile every hour for 24 hours? This is confusing to me.

3- Are you saying that after you retire all the files in the queue (completing the 24 hours) you want to log a message? Do you mean only if all 250 flowfiles fail and retire? Then how is this going to work when some files succeed and others fail?

It would also help if you post a screenshot of the complete flow and highlight what you want to do in each step, and which part of the flow you are having a problem with, detailing clearly how the problem relates to the target processor (PublishKafka in your case) and what the expectation is. Thanks
11-28-2023
03:13 PM
Hi, I have managed to download the latest NiFi 2.0.0 M1 and I'm trying to run it on my Windows 10 machine. Doing some preliminary testing, I ran into the following issues:

1- The system requirements page (https://nifi.apache.org/project-documentation.html ) indicates that at minimum I need Java 17, but when I try to start NiFi using run.bat I get the following error:

Error: LinkageError occurred while loading main class org.apache.nifi.bootstrap.RunNiFi
java.lang.UnsupportedClassVersionError: org/apache/nifi/bootstrap/RunNiFi has been compiled by a more recent version of the Java Runtime (class file version 65.0), this version of the Java Runtime only recognizes class file versions up to 61.0

It turns out it needs Java 21. Not sure if the documentation has not been updated or if I'm missing something.

2- After upgrading to Java 21, I'm able to start NiFi using the default configuration. The log file doesn't show any errors and the default username and password are generated, but when I try to browse to https://127.0.0.1:8443/nifi I get an error in the browser. Not sure if this is something local to my machine, but after some internet searching I replaced 127.0.0.1 with localhost in the URL and it worked: I now get the login screen.

3- This is not related to 2.0, but I want to mention it in case someone else runs into the same issue. By default, the generated user doesn't have access to the security settings for Users & Policies. To enable this you need to set:

nifi.security.user.authorizer=managed-authorizer

and add the generated username to authorizers.xml, as mentioned here: https://community.cloudera.com/t5/Support-Questions/No-show-Users-and-Policies-in-Global-Menu/td-p/339127

4- The ExecuteScript processor doesn't have the Python (Jython) script engine. It could be deprecated, but that is not mentioned on the deprecated components page (https://cwiki.apache.org/confluence/display/NIFI/Deprecated+Components+and+Features ). It only talks about removing support for Ruby and ECMAScript, not Python. If it is deprecated, what is the alternative? Is it the Python API?

5- A minor glitch I noticed when browsing NiFi with Chrome: for some reason the "Import from Registry" icon is not showing. It shows up in Edge, and it shows up if I open Chrome in private mode. Not sure if it's a caching issue or what.

Please advise. Thanks
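For reference, the authorizers.xml entries I ended up with look roughly like the trimmed sketch below. "generated-username-here" is a placeholder for the auto-generated username, and the exact file shipped with 2.0.0 M1 may differ from this 1.x-style layout:

<userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
    <property name="Users File">./conf/users.xml</property>
    <property name="Initial User Identity 1">generated-username-here</property>
</userGroupProvider>
<accessPolicyProvider>
    <identifier>file-access-policy-provider</identifier>
    <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
    <property name="User Group Provider">file-user-group-provider</property>
    <property name="Authorizations File">./conf/authorizations.xml</property>
    <property name="Initial Admin Identity">generated-username-here</property>
</accessPolicyProvider>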
Labels:
- Apache NiFi
11-28-2023
09:17 AM
2 Kudos
Hi @Rohit1997jio , I'm not sure this can be done using FlowFile Expiration. However, if you are using NiFi 1.16 or higher, you can take advantage of another approach: the "retry" option on the target processor's failure relationship.

The idea is to use the settings "Number of Retry Attempts", "Retry Back Off Policy" & "Retry Maximum Back Off Period" to configure how often and for how long the flowfile is retried before it gets pushed to the failure relationship queue, where you can then log the needed message. On every failed retry, the flowfile is pushed back to the upstream queue and waits the designated time before it is tried again. The challenge is setting those values so that the flowfile is only kept for a certain period of time (1 hour in your case), especially since the file waits in the queue before being tried again, depending on whether you set the policy to Penalize or Yield. That wait is actually a good thing, because you want some delay before the flowfile is retried to avoid a lot of overhead.

For example, if you want the file to expire in an hour, retrying it 60 times with a 1-minute wait before each retry, you can set the values as follows:

Number of Retry Attempts: 60
Retry Back Off Policy: Penalize (set the Penalty Duration under the Settings tab to 1 min)
Retry Maximum Back Off Period: 1 min (this ensures the wait time in the queue doesn't exceed the initial penalty duration, because otherwise the penalty duration is doubled on every subsequent retry; I'm not sure why)

In this case the flowfile will be retried 60 times upon failure, each time being pushed back to the upstream queue and waiting at most 1 min before the next retry, which makes the total time the flowfile is retried = 60 * 1 min = 60 mins = 1 hour. Depending on how often you want to retry and how long you want to wait between retries, you can adjust those numbers accordingly. Once all the retries are done, the flowfile is moved to the failure relationship, where you can log the final message. If that helps please accept the solution. Thanks
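As a rule of thumb (my own arithmetic, not from the NiFi docs): with the Retry Maximum Back Off Period capped at the Penalty Duration, the total retention time works out to roughly

total retention ≈ Number of Retry Attempts × per-retry wait

so 60 × 1 min = 60 min here; for, say, a 2-hour window with 5-minute waits you would use 24 attempts with a 5 min penalty duration and a 5 min maximum back off period.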
11-27-2023
03:08 PM
The issue you are having is that the ParquetReader fails on the invalid column names containing the illegal character "-". I don't know of a way to address this in NiFi itself; you probably have to fix it before you consume the file through NiFi. You can use a pandas DataFrame in Python to remove illegal characters from the column names, for example:

import pandas as pd

# read the parquet file that has the problematic column names
df = pd.read_parquet('source.parquet', engine='fastparquet')
# replace hyphen with underscore in column names
df.columns = df.columns.str.replace("-", "_")
# write the cleaned copy back out
df.to_parquet("target.parquet", engine='fastparquet')

It's possible to do this through NiFi as well, using ExecuteStreamCommand: https://community.cloudera.com/t5/Support-Questions/Can-anyone-provide-an-example-of-a-python-script-executed/td-p/192487

The steps would be like this:
1- Fetch the parquet file from S3
2- Save it to a staging area with a known filename using PutFile
3- Run ExecuteStreamCommand and pass the filename and path to the Python script; the script renames the columns as shown above and saves the final copy to a target folder (see the sketch below)
4- Use FetchFile to get the final parquet file from the target folder using the same filename
5- ConvertRecord ...

If that helps please accept the solution. Thanks
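For step 3, a minimal sketch of what that Python script could look like, assuming ExecuteStreamCommand passes the staged file path and the target folder as two command arguments; the script name and argument layout are my own example, not a fixed convention:

import os
import sys

import pandas as pd

# Usage: python rename_columns.py <staged parquet path> <target folder>
source_path = sys.argv[1]   # e.g. the PutFile location plus ${filename}
target_dir = sys.argv[2]    # the folder FetchFile will read from

# read the staged parquet, sanitize the column names, write the final copy
df = pd.read_parquet(source_path, engine='fastparquet')
df.columns = df.columns.str.replace("-", "_")
df.to_parquet(os.path.join(target_dir, os.path.basename(source_path)), engine='fastparquet')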