Member since
07-30-2019
3470
Posts
1642
Kudos Received
1018
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 270 | 05-06-2026 09:16 AM |
| | 451 | 05-04-2026 05:20 AM |
| | 331 | 05-01-2026 10:15 AM |
| | 513 | 03-23-2026 05:44 AM |
| | 388 | 02-18-2026 09:59 AM |
09-14-2023
09:55 AM
@manishg Copying a file should have resulted in a new timestamp, making it newer than the last file you previously listed. However, moving a file typically does not change the timestamps on the file. My guess here is that you moved the file into the directory instead of copying it? Matt
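The copy-versus-move difference can be checked outside NiFi. The following is a minimal sketch, assuming a local filesystem and only Python's standard library; the file names and the backdated timestamp are made up for illustration, and it does not reproduce how any NiFi list processor tracks timestamps.

```python
import os
import shutil
import tempfile

OLD_MTIME = 1_000_000_000  # arbitrary old timestamp (September 2001), illustration only


def compare_copy_and_move():
    """Return (source_mtime, copied_mtime, moved_mtime) for a sample file."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.txt")
        with open(source, "w") as handle:
            handle.write("sample data")
        # Backdate the file so it looks like it was written long ago.
        os.utime(source, (OLD_MTIME, OLD_MTIME))

        # shutil.copy writes a new file, so the copy gets a fresh modification time.
        copied = shutil.copy(source, os.path.join(tmp, "copied.txt"))
        # shutil.move keeps the existing modification time of the file.
        moved = shutil.move(source, os.path.join(tmp, "moved.txt"))

        return OLD_MTIME, os.path.getmtime(copied), os.path.getmtime(moved)


source_mtime, copied_mtime, moved_mtime = compare_copy_and_move()
print("copy is newer than source:", copied_mtime > source_mtime)
print("move kept source timestamp:", moved_mtime == source_mtime)
```

A moved file that keeps an older timestamp can therefore look "already seen" to anything that only picks up files newer than the last one it listed.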
09-12-2023
01:30 PM
1 Kudo
@MmSs NiFi is data agnostic. To NiFi, the content of a FlowFile is just bits. To remain data agnostic, NiFi uses what it calls a "FlowFile". A FlowFile consists of two parts: FlowFile attributes/metadata (persisted in the FlowFile repository and held in JVM heap memory) and FlowFile content (stored in content claims within the content repository). This way the NiFi core does not need to know or care about the format of the data/content. It becomes the responsibility of any individual processor component that needs to read or manipulate the content to understand the bits of content. The NiFi FlowFile metadata simply records which content claim holds the bits, the offset within the claim at which the content starts, and the number of bits that follow.

As far as directory paths go, these become just additional attributes on a FlowFile and have no bearing on NiFi's persistent storage of the FlowFile's content in the content repository.

As far as UnpackContent goes, the processor will process zip1 and zip2 separately. Unpacked content from zip1 is written to new FlowFiles, and the same holds true for zip2. So if you stop the processor immediately after your UnpackContent processor and send your zip1 and zip2 FlowFiles through, you can list the content on the outbound relationship to inspect it before further processing. You'll be able to view the content and the metadata for each output FlowFile. NiFi does not care if there are multiple FlowFiles with the same filename, since NiFi tracks each of them by a unique UUID.

What you describe (zip1 content that is already queued in the inbound connection to PutS3Object being corrupted when zip2 is then extracted) is not possible. Run both zip1 and zip2 through your dataflow with PutS3Object stopped and inspect the queued FlowFiles as they exist before PutS3Object is started. Are the queued files on the same node in your NiFi cluster? Is your PutS3Object using "${filename}" as the object key? What happens if you use "${filename}-${uuid}" instead? My guess is that the issue is in your PutS3Object configuration, leading to corruption on write to S3.

So your issue seems more likely to be a flow design issue than a processor or NiFi FlowFile handling issue. Sharing all the processors you are using in your dataflow and their configuration may help in pinpointing your design issue. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
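To illustrate why the object key matters, here is a minimal sketch that assumes two unpacked FlowFiles share a filename and that writing to an existing key replaces the earlier object. The `FlowFile` class and `put_objects` function are simplified stand-ins invented for this example, with a plain dict playing the role of the bucket; none of this is NiFi or AWS code.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class FlowFile:
    """Simplified stand-in for a NiFi FlowFile: a few attributes plus content."""
    filename: str
    content: bytes
    uuid: str = field(default_factory=lambda: str(uuid4()))


def put_objects(flowfiles, key_for):
    """Simulate writes to a bucket: a dict is used, so equal keys overwrite."""
    bucket = {}
    for flowfile in flowfiles:
        bucket[key_for(flowfile)] = flowfile.content
    return bucket


# Two FlowFiles with the same filename, one unpacked from each zip.
zip1_file = FlowFile("data.csv", b"from zip1")
zip2_file = FlowFile("data.csv", b"from zip2")

# Key built like "${filename}": both FlowFiles map to the same key.
by_filename = put_objects([zip1_file, zip2_file], lambda ff: ff.filename)

# Key built like "${filename}-${uuid}": each FlowFile gets its own key.
by_filename_and_uuid = put_objects(
    [zip1_file, zip2_file], lambda ff: f"{ff.filename}-{ff.uuid}"
)

print(len(by_filename))           # 1
print(len(by_filename_and_uuid))  # 2
```

With the filename-only key, only the last write survives under "data.csv", which from the outside can look like the first file was corrupted.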
09-12-2023
12:58 PM
@manishg NiFi provides High Availability (HA) at the control plane and not at the data level. For HA, NiFi utilizes a zero-master structure at the NiFi controller level through the use of ZooKeeper (ZK). An external ZK cluster is used for electing a NiFi node as the cluster coordinator. If the currently elected cluster coordinator goes down, ZK will elect a new cluster coordinator from the remaining available nodes still communicating with ZK. This eliminates a single point of failure for accessing your NiFi cluster. Any node in the cluster can be used for access.

The individual nodes in a NiFi cluster load and execute their own local copy of the flow (flow.xml.gz in older versions and flow.json.gz in newer releases). Each NiFi node also maintains its own set of repositories (database, flowfile, content, and provenance). The flowfile and content repositories only contain metadata/attributes and content for FlowFiles that traverse the dataflows on that specific node. So node 1 has no information about the data being processed on node 2, 3, etc. When a node is down, the FlowFiles currently queued on that node remain in its content and flowfile repositories until that node is brought back online or a new node is built to which these repositories can be moved (you cannot merge existing repositories from different nodes). So it is always best to protect the data stored in these repositories (especially content and flowfile) via RAID to prevent data loss.

As far as your last question about aggregation of processing on different nodes, your question is not clear to me. Each node operates independently, with the exception of perhaps some cluster-wide state which may be stored in ZK. Cluster-wide state is primarily used by processors to prevent consumption of the same data by different nodes. For example, a ListSFTP processor is running on the primary node only, and then a change in election results in a different node being elected as primary node. The new primary node would start its primary-node-only processors, which retrieve the last recorded cluster state and pick up where the old primary node's processors left off. It is the responsibility of the dataflow design engineer to construct dataflow(s) on the NiFi canvas that distribute data across the NiFi cluster for proper processing. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
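The primary node handoff described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ClusterState` is a hypothetical in-memory stand-in for cluster-wide state, the state key name is invented, and the listing logic is far simpler than what a real list processor does.

```python
class ClusterState:
    """Hypothetical stand-in for cluster-wide state shared by all nodes."""

    def __init__(self):
        self._values = {}

    def get(self, key, default=None):
        return self._values.get(key, default)

    def set(self, key, value):
        self._values[key] = value


def list_new_files(remote_files, state):
    """List files newer than the last recorded timestamp, then record the newest."""
    last_seen = state.get("last.listed.timestamp", 0)  # invented key name
    newer = sorted(
        (timestamp, name)
        for name, timestamp in remote_files.items()
        if timestamp > last_seen
    )
    if newer:
        state.set("last.listed.timestamp", newer[-1][0])
    return [name for _, name in newer]


shared_state = ClusterState()
remote_files = {"a.txt": 100, "b.txt": 200}

# Node 1 is primary and runs the listing.
listed_by_node1 = list_new_files(remote_files, shared_state)

# A new file arrives, and node 2 is elected primary.
remote_files["c.txt"] = 300
listed_by_node2 = list_new_files(remote_files, shared_state)

print(listed_by_node1)  # ['a.txt', 'b.txt']
print(listed_by_node2)  # ['c.txt']
```

Because both nodes read and write the same shared state, the newly elected node does not list a.txt and b.txt a second time.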
09-08-2023
12:23 PM
@OpenText-Orion SubjectAlternativeNames (SANs) would not be full distinguished names (DNs). SANs are used within the TLS exchange to verify that the client is connected to the correct intended target server. This helps avoid things like man-in-the-middle attacks. So essentially what you have are certificates you created for your 3 NiFi nodes:
node1.server.name
node2.server.name
node3.server.name
However, when you are trying to connect to a NiFi node, you are entering https://my.elb.name:<port>/nifi in your browser, which is directed to a NiFi node. Without a SAN entry present that matches the target hostname, the TLS exchange assumes the request was not intended for this target host, resulting in the exception you see. Recreate your node certificates using only hostnames as the SAN entries instead of a full DN. I executed the following tls-toolkit.sh command you shared and it provided the correct expected output:
./tls-toolkit.sh standalone -n 'node1.server.name,node2.server.name,node3.server.name' --subjectAlternativeNames 'my.elb.name'
Alias name: nifi-key
Creation date: Sep 8, 2023
Entry type: PrivateKeyEntry
Certificate chain length: 2
Certificate[1]:
Owner: CN=node1.server.name, OU=NIFI
Issuer: CN=localhost, OU=NIFI
Serial number: 18a76360ce500000000
Valid from: Fri Sep 08 19:12:48 UTC 2023 until: Thu Dec 11 19:12:48 UTC 2025
Certificate fingerprints:
MD5: 75:70:0C:4F:41:D8:EA:9D:35:46:9E:C1:3B:9C:B0:E9
SHA1: 5C:0C:CC:B3:C8:29:62:05:5D:5B:C5:BB:71:39:20:40:48:CE:38:A5
SHA256: 17:79:FF:87:31:07:CB:9A:01:A5:82:03:A4:1B:3F:3D:F0:C3:79:21:C6:90:06:82:3D:FC:A1:0A:5F:64:DB:DE
Signature algorithm name: SHA256withRSA
Subject Public Key Algorithm: 2048-bit RSA key
Version: 3
Extensions:
#1: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: DA A8 38 36 C2 61 E3 CB DF 66 72 B5 FF D6 B7 F8 ..86.a...fr.....
0010: 92 2B 50 81 .+P.
]
]
#2: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
CA:false
PathLen: undefined
]
#3: ObjectId: 2.5.29.37 Criticality=false
ExtendedKeyUsages [
clientAuth
serverAuth
]
#4: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
DigitalSignature
Non_repudiation
Key_Encipherment
Data_Encipherment
Key_Agreement
]
#5: ObjectId: 2.5.29.17 Criticality=false
SubjectAlternativeName [
DNSName: node1.server.name
DNSName: my.elb.name
]
#6: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: 05 52 D3 51 9B 56 27 EB D2 C1 62 42 A9 43 39 EF .R.Q.V'...bB.C9.
0010: 3A 8E 0D 42 :..B
]
]
Make sure you are looking at Certificate[1] in the PrivateKeyEntry. Certificate[2] in the PrivateKeyEntry is the signing certificate. If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to log in and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
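As a rough illustration of the SAN check, the sketch below compares the hostname the client asked for against a certificate's DNS SAN entries. It is deliberately simplified (exact, case-insensitive matching only), and `hostname_matches_san` is a made-up helper, not part of any TLS library; real hostname verification also handles wildcards and IP address entries.

```python
def hostname_matches_san(hostname, san_dns_names):
    """Return True if the requested hostname equals any DNS SAN entry."""
    requested = hostname.lower().rstrip(".")
    return any(requested == entry.lower().rstrip(".") for entry in san_dns_names)


# SAN entries without the load balancer name, and with it (as in the output above).
sans_without_elb = ["node1.server.name"]
sans_with_elb = ["node1.server.name", "my.elb.name"]

print(hostname_matches_san("my.elb.name", sans_without_elb))  # False
print(hostname_matches_san("my.elb.name", sans_with_elb))     # True
```

The browser only knows the name it requested (my.elb.name), so that name has to appear among the SAN entries of whichever node certificate ends up being presented.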
09-06-2023
02:47 PM
@manishg You should only be copying over component nars and not any of the core or framework nars. You still want your NiFi running with the newer 1.22 core and framework. None of the component nars should be dependent on a specific core or framework nar version. So start by only including the nars for the components (processors, controller services, and reporting tasks) you use and don't copy any other lib.

Of course, the best approach is to test your flows in the new version and make adjustments where needed so that they work in 1.22. Copying over older component nars will simply provide multiple versions of the same components in the list of available components in NiFi. You lose out on component enhancements, bug fixes, and security improvements. Additionally, you are likely to run into more issues down the road as you upgrade to an even newer version. Is your plan to keep adding more nars with each upgrade (1.10, 1.22, 1.23, etc.)?

On startup, NiFi loads the flow.json.gz and looks for each component class defined in it, with the version defined in it. If it does not find that version AND finds only one other version of the same component class, NiFi will automatically switch to using that available version of the component. As soon as you have multiple versions available and none match what is in flow.json.gz at startup, NiFi will NOT pick a new one and will instead instantiate a "ghost" processor on the canvas. So you run more risks by not fixing/updating your dataflows to work in the newer version. If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped. Thank you, Matt
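The startup behavior described above can be summarized as a small decision function. This is a sketch of the rules as stated in this post, not NiFi's actual loading code; `resolve_component` and its return labels are invented for illustration, and treating "no version available at all" as a ghost is an assumption added here.

```python
def resolve_component(flow_class, flow_version, available_versions):
    """Decide which version to load for a component referenced in the flow.

    available_versions maps a component class name to the set of versions
    found in the loaded nars.
    """
    versions = available_versions.get(flow_class, set())
    if flow_version in versions:
        return ("exact", flow_version)
    if len(versions) == 1:
        # Only one other version exists, so switch to it automatically.
        return ("auto-switched", next(iter(versions)))
    # Several candidates with no match (or, by assumption, none at all).
    return ("ghost", None)


get_file = "org.apache.nifi.processors.standard.GetFile"

print(resolve_component(get_file, "1.10.0", {get_file: {"1.22.0"}}))
# ('auto-switched', '1.22.0')
print(resolve_component(get_file, "1.9.0", {get_file: {"1.10.0", "1.22.0"}}))
# ('ghost', None)
```

The second call shows the risk of keeping old nars around: once two versions are present and neither matches the flow, the automatic switch no longer applies.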
09-06-2023
12:12 PM
1 Kudo
@manishg I strongly recommend testing out and updating your templates with the new release rather than adding nars from older releases to the lib folder of a newer release. By adding old nars you lose any fixes, improvements, or security-related changes made since those old nars were released. Doing this is not really addressing your issues with your templates, but rather "kicking the can down the road". You will eventually need to take action. Also, as a heads up since you mentioned "templates": NiFi "templates" have been deprecated in favor of the newer "flow definitions" that can be created/downloaded. The "templates" functionality is going away completely with the upcoming Apache NiFi 2.0 release. If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped. Thank you, Matt
09-06-2023
12:05 PM
@MukaAddA Sorry, writing such scripts is not a strong area for me. I just happened to notice you were doing a session.create instead of a session.get. You may get better help by raising a new question on how to create a script to be executed by the ExecuteScript processor to accomplish your use case, and providing details on that use case. I am sure there are others in the community who are good at writing such scripts. Matt
08-30-2023
09:03 AM
@MukaAddA Your issue is 100% in your script. The upstream queued FlowFile is being used as the trigger for execution of the ExecuteScript processor. Instead of reading the upstream FlowFile in your script, you are creating a new FlowFile. So I think what is happening here is that you have both the new FlowFile generated by your script and the original FlowFile being passed to the downstream Success relationship. You may find the following article helpful: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-2/ta-p/249018 If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped. Thank you, Matt
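The general shape of a script that reads the upstream FlowFile instead of creating a new one is sketched below. In a real ExecuteScript (Jython) script, `session` and `REL_SUCCESS` are provided by the processor; here `FakeSession`, the `"success"` string, the dict-based FlowFile, and the `processed.by` attribute are all hypothetical stand-ins so that the pattern can run on its own. It is not a fix for your specific script.

```python
class FakeSession:
    """Hypothetical stand-in for the `session` object ExecuteScript provides."""

    def __init__(self, queued_flowfiles):
        self.queue = list(queued_flowfiles)
        self.transferred = []

    def get(self):
        # Pull the next queued upstream FlowFile, or None if the queue is empty.
        return self.queue.pop(0) if self.queue else None

    def putAttribute(self, flowfile, key, value):
        # Return an updated copy, mirroring how the real call returns a FlowFile.
        updated = dict(flowfile)
        updated["attributes"] = {**flowfile["attributes"], key: value}
        return updated

    def transfer(self, flowfile, relationship):
        self.transferred.append((relationship, flowfile))


REL_SUCCESS = "success"  # placeholder; ExecuteScript supplies the real relationship


def run_script(session):
    """Script body: work on the incoming FlowFile rather than a newly created one."""
    flowfile = session.get()  # session.get() here, not session.create()
    if flowfile is None:
        return
    flowfile = session.putAttribute(flowfile, "processed.by", "script")
    session.transfer(flowfile, REL_SUCCESS)


session = FakeSession([{"uuid": "original-1", "attributes": {"filename": "a.txt"}}])
run_script(session)
print(len(session.queue), len(session.transferred))  # 0 1
```

The single FlowFile that reaches the success relationship is the original one, with its attributes updated, rather than an extra FlowFile alongside it.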
08-24-2023
12:45 PM
@mslnrd This is likely because LDAP on port 636 uses referrals, so your initial query can be referred across the entire domain tree to multiple LDAP servers. So somewhere within those referrals your issue arises in the hostname verification. When you switch to the global catalog port 3269, there are no referrals. I can't speak to the issue within your LDAPS servers that causes the problem within the referrals, but it makes sense that switching to the secure global catalog port resolved your issue. Hope this clarifies why the change in port resolved your issue. If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped. Thank you, Matt
08-24-2023
12:31 PM
@kothari It is not Ranger's job to inform the client applications using Ranger which users belong to which groups. Each client application is responsible for determining which groups the user authenticated into that service belongs to. The policies generated by Ranger are downloaded by the client applications. Within that downloaded policy JSON will be the resource identifier(s), a list of user identities authorized (read, write, and/or delete), and a list of group identities authorized (read, write, and/or delete) against each resource identifier. So when the client checks the policies downloaded from Ranger, it is looking for the user identity being authorized and, if the client is aware of the group(s) that user belongs to, it will also check authorization for those group identities. So in your case, it is most likely that your client service/application has not been configured with the same user and group associations set up in your Ranger service. If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped. Thank you, Matt
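A rough sketch of that client-side check is shown below. The policy layout (`resource`, `accesses`, `users`, `groups`) and the `is_authorized` helper are invented for illustration and are not Ranger's actual policy schema or API; the point is only that group-based authorization depends on the groups the client itself resolves for the user.

```python
def is_authorized(policies, resource, access, user, user_groups=()):
    """Check downloaded policies for the user, then for the groups the client knows."""
    for policy in policies:
        if policy["resource"] != resource or access not in policy["accesses"]:
            continue
        if user in policy["users"]:
            return True
        if set(user_groups) & set(policy["groups"]):
            return True
    return False


# Hypothetical downloaded policy: read access granted to one user and one group.
downloaded_policies = [
    {
        "resource": "/data/sales",
        "accesses": ["read"],
        "users": ["alice"],
        "groups": ["analysts"],
    }
]

# The client resolved bob's groups, so the group policy applies.
print(is_authorized(downloaded_policies, "/data/sales", "read", "bob", ["analysts"]))  # True

# The client has no group mapping for bob, so the same policy does not match.
print(is_authorized(downloaded_policies, "/data/sales", "read", "bob"))  # False
```

In the second call the policy is unchanged; only the client's knowledge of bob's groups differs, which is the situation described above.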