Member since: 01-27-2023
Posts: 126
Kudos Received: 31
Solutions: 23

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 64 | 05-31-2023 03:01 AM |
| | 153 | 05-22-2023 06:55 AM |
| | 154 | 05-15-2023 05:33 AM |
| | 302 | 05-10-2023 01:57 AM |
| | 100 | 05-09-2023 11:40 PM |
03-10-2023
03:04 AM
Maybe you can try something like the following? I have no access to any MongoDB instance right now to test it myself 😞 { "TransactionHistory" : { "$gt" : "<your_value_here>" }} Note that the operator must be written as "$gt", without any surrounding spaces.
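For illustration only, a minimal sketch of how such a filter could look with a concrete value filled in (the value and its date format are assumptions here, adjust them to whatever your field actually stores):

    {
      "TransactionHistory": {
        "$gt": "2023-03-01 00:00"
      }
    }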
03-10-2023
02:50 AM
I see that you are using something that is neither a default component nor part of NiFi itself. I would suggest you have a look at the JAR files on your PROD instance and see if you can find anything that points to something like batchiq. Most likely that JAR file is missing from your dev environment.
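A quick way to check, sketched under the assumption that NiFi is installed under /opt/nifi (adjust the path to your installation) and that the custom component ships as a NAR/JAR in the lib directory:

    # list any bundle whose file name mentions batchiq
    ls /opt/nifi/lib | grep -i batchiq

    # or search file names recursively across the whole installation
    find /opt/nifi -iname "*batchiq*"

If it shows up on PROD but not on dev, copying the missing bundle over and restarting NiFi should resolve it.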
03-09-2023
11:48 PM
1 Kudo
@anoop89 I have never used the file-identity-provider, so I am not really experienced with it 😞 Would it be possible to provide a short snippet from your conf/login-credentials.xml? You can remove all the PII and replace it with dummy values, but I would really like to see how the file is structured and try to reproduce the behavior on my local machine. Was this file generated automatically, or did you create it manually and keep using it on your prod/dev instances? PS: are you using the Cloudera version of NiFi?
03-09-2023
11:21 PM
Hi @deepak123, what do you mean by NiFi performing slowly? Based on your question, it is not very clear at which point you are encountering the performance issue: while waiting for an answer from your InvokeHTTP endpoint, or while performing actions on the result of the API call?
03-09-2023
11:15 PM
1 Kudo
Hi @anoop89, I can confirm that versions 1.19.1 and 1.20.1 work very well without LDAP or Kerberos. I have installed two clusters: one with no security active (unsecured cluster) and one in which I only activated the login with a single username and password. But here I think it mostly depends on the version (open-source, the Cloudera version, etc.) you are using. What I can tell from your logs is that you might have defined a wrong class for your login identity provider. By default, after unzipping the NiFi ZIP file, the nifi.properties file contains the following lines:

    nifi.login.identity.provider.configuration.file=./conf/login-identity-providers.xml
    nifi.security.user.login.identity.provider=single-user-provider

The login-identity-providers.xml file defines the provider as seen below; it also contains two other options which are commented out: LDAP (<identifier>ldap-provider</identifier>) and Kerberos (<identifier>kerberos-provider</identifier>).

    <provider>
        <identifier>single-user-provider</identifier>
        <class>org.apache.nifi.authentication.single.user.SingleUserLoginIdentityProvider</class>
        <property name="Username"/>
        <property name="Password"/>
    </provider>

Maybe you are trying to use the file-provider option from within the authorizers.xml file, which comes commented out by default and is therefore not recognized when starting NiFi? I think your best option here is to compare the configuration files from your dev environment with the configuration files from your PROD environment. By doing that you will identify where you defined the wrong property and can correct it straight away.
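For reference, a sketch of what the file-based user group provider in the default authorizers.xml typically looks like (quoted from memory, so double-check it against your own copy of the file):

    <userGroupProvider>
        <identifier>file-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
        <property name="Users File">./conf/users.xml</property>
        <property name="Initial User Identity 1"></property>
    </userGroupProvider>

A plain recursive diff of the two conf directories is usually the fastest way to spot a mismatch, e.g. diff -r prod/conf dev/conf.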
03-09-2023
10:45 PM
Hi @moahmedhassaan, it would really help if you provided more details about your flow, including how you query your data. Testing against a MongoDB is not easy for everybody, since not many people have one available. Nevertheless, there might be a solution (not very efficient, but it might do the trick): add a GenerateFlowFile processor where you configure a property like ${now():format("yyyy-MM-dd HH:mm")}:${now():format("ss"):minus(1)}. Set this processor to run on the primary node only so you won't have too many generated files. Send the success queue to your GetMongo processor. Within the GetMongo processor, in the query property, write your query with the condition transactiondate > the_property_defined_in_GenerateFlowFile, as sketched below. Again, this is not a very elegant solution, but it could suit your needs until somebody with more experience provides a better one 😊
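A minimal sketch of what the GetMongo query property could look like, assuming the property on GenerateFlowFile was named last_run_ts (the attribute name, field name, and date format are all assumptions here):

    { "transactiondate": { "$gt": "${last_run_ts}" } }

If I remember correctly, dynamic properties on GenerateFlowFile become attributes on the generated flow file, and the GetMongo query property supports expression language, so the timestamp is substituted into the query before it is sent to MongoDB.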
03-07-2023
01:04 AM
2 Kudos
I would try something like: ${value:divide(3600)}:${value:divide(60):mod(60)} where value = your attribute, value:divide(3600) = computes the hours, and value:divide(60):mod(60) = computes the minutes. Give it a try and let me know if it works fine for you 😀

Later edit: if you also want the leading 0 when the hour value is lower than 10, try something like: ${value:divide(3600):lt(10):ifElse(${value:divide(3600):prepend(0)},${value:divide(3600)})}:${value:divide(60):mod(60)} It is basically the same thing as before, but you are using an IF-ELSE to check whether the value for the hours is lower than 10; if so, you prepend a leading 0 so it displays as 0X:MM. If the value is 10 or greater, you stick with the original value, without adding any extra zeros.
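A quick worked example, assuming the attribute holds 33125 (seconds):

    33125 / 3600 = 9                      (hours, integer division)
    33125 / 60 = 552, 552 mod 60 = 12     (minutes)
    9 is lower than 10, so 0 is prepended -> result: 09:12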
03-03-2023
05:39 AM
4 Kudos
Try using the NiFi REST API to fetch the desired information 😁 You can get a list of all your reporting tasks by calling: https://<hostname>:<port>/nifi-api/flow/reporting-tasks This will return a JSON list of all your reporting tasks. If you want to inspect a specific reporting task, make a call to: https://<hostname>:<port>/nifi-api/reporting-tasks/<id-of-reporting-task> You can go ahead and play with the API calls until you fetch exactly the data you need. All the available calls are documented here: https://nifi.apache.org/docs/nifi-docs/rest-api/index.html
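A sketch of how such a call could look with curl on a secured instance (assuming you have already obtained an access token; on an unsecured instance you can drop the header):

    # fetch all reporting tasks as JSON; -k skips certificate verification
    curl -k -H "Authorization: Bearer $TOKEN" \
         "https://<hostname>:<port>/nifi-api/flow/reporting-tasks"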
Re: Troubleshooting Steps for Launching Apache NiF...
02-27-2023
03:40 AM
First of all, it would really help to know which OS you are using to run NiFi. Based on the fact that you are using a laptop, I will go ahead and assume a Windows device. If so, the first thing I would check is whether the CMD window stays open when starting NiFi. If it does not, there is a problem and you have to look further into it. Nevertheless, whatever happens, something should get written to the log files (whether we are talking about nifi-app.log, nifi-user.log, or nifi-bootstrap.log). A failed NiFi start does not necessarily produce an explicit error, so it would be useful to add the output of the log files here. PS1: if you are using a newer version of NiFi, check in nifi.properties whether you have set a value for the following property: nifi.sensitive.props.key PS2: make sure that you have Java in your environment variables.
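A couple of quick checks from a Windows command prompt, sketched under the assumption that NiFi was extracted to C:\nifi (adjust the path to your setup):

    :: verify that Java is resolvable from the environment
    java -version
    echo %JAVA_HOME%

    :: check whether a sensitive properties key has been set
    findstr "nifi.sensitive.props.key" C:\nifi\conf\nifi.properties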
02-27-2023
03:21 AM
So, I am using NiFi in cluster mode with 5 machines. My flow starts with a GenerateTableFetch, executed on the primary node only, with the Partition Size property set to 4000000. The connection is the standard one for an Oracle database, and my table is a generic table with more than 50M rows. The downstream connection (success) from GenerateTableFetch is linked to an ExecuteSQLRecord. The connection is set with Load Balance Strategy = Round Robin and Selected Prioritizers = FIFO. Within the ExecuteSQLRecord, I have configured the following properties: Use Avro Logical Types = true, Max Rows Per Flow File = 4000000, Output Batch Size = 5, Set Auto Commit = true. When it comes to scheduling, I have 7 Concurrent Tasks and All Nodes as Execution. Once the data is converted into Avro/Parquet (tried both), I load it into a GCP bucket / AWS S3 bucket. Afterwards, I load the data into the respective data warehouse service. The problem is that when I check my data, even though I have the same number of rows as in the source table, the data is not the same: I have plenty of duplicate rows, and I managed to pinpoint the processor causing the duplication: ExecuteSQLRecord. But I cannot figure out why this is happening or how to solve it. I set the ExecuteSQLRecord logging to DEBUG to see exactly what happens. Based on the logs, even though I am using 7 tasks x 5 nodes, I am only seeing 8-10 log lines:

ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 40000000 ROWS FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 72000000 ROWS FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 56000000 ROWS FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 16000000 ROWS FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 40000000 ROWS FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 56000000 ROWS FETCH NEXT 4000000 ROWS ONLY
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] Executing query SELECT * FROM ORCL.MY_TABLE_NAME WHERE 1=1 OFFSET 40000000 ROWS FETCH NEXT 4000000 ROWS ONLY

Notice that some offsets in the query logs above (e.g. 40000000 and 56000000) appear more than once. However, once the ExecuteSQLRecord finishes the job, I see that all the files have been processed. Unfortunately, when creating an external table (GCP, AWS, no matter where), I get duplicate rows (and no, I do not have duplicate rows in my source table):

ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_76000000_a1748d5f-f2dd-490c-b37c-daf54cc52391] contains 1133992 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_0_9a513a8f-ae5a-449d-b00c-6cf8a2ee7322] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_4000000_9c0fe0b3-7c12-4756-88e7-d16b099802d3] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_8000000_8e6909e5-3f13-41d4-b8cf-0429d49bf77b] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_12000000_222954b6-ca01-4c28-aa7b-3e02ef35892b] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_16000000_a1f51ef6-d592-4eb8-bbb9-d8adf5b37434] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_20000000_2334ca8b-91a6-4dcd-8adc-80dcc7163484] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_24000000_60e6f45b-41dd-4d0d-9853-5d471180d2e5] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_28000000_f6a23cf4-043b-4f6b-9025-7016bb5b2f28] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_32000000_33b53024-126f-402c-be88-86dcdf85dfe1] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_36000000_1572e725-8eff-4890-bdca-9c3c9f6b7215] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_40000000_5100fbb5-718b-447a-b279-304ac65a3249] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_44000000_056f47bf-1cbf-4b93-a2f6-a5300711df5e] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_52000000_e7f70f72-772a-4e69-b0d2-d037da1c4127] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_48000000_4d86844c-e56a-44d8-8b8c-cc025a4e9c83] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_56000000_75eb376e-1083-44d7-a781-db1b14340f64] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_60000000_b5b8f685-6af1-4f4a-8ecb-d47ec13099c8] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_68000000_968cad60-e64b-4ace-8382-32d10b5d96d4] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_64000000_cee6c80e-1c5c-4eba-a3b0-b3d40dc3d348] contains 4000000 records; transferring to 'success'
ExecuteSQLRecord[id=2b6f35be-d65c-19cd-9b43-96f87b96fad1] FlowFile[filename=4000000_72000000_7aeda32c-9dac-428d-bcc2-4dcd86f7c428] contains 4000000 records; transferring to 'success'

Has anybody else struggled with a similar problem? How did you manage to solve it? Thanks 🙂
Labels:
- Apache NiFi