Member since: 07-30-2019
Posts: 3406
Kudos Received: 1622
Solutions: 1008
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 118 | 12-17-2025 05:55 AM |
| | 179 | 12-15-2025 01:29 PM |
| | 119 | 12-15-2025 06:50 AM |
| | 244 | 12-05-2025 08:25 AM |
| | 405 | 12-03-2025 10:21 AM |
11-12-2018
06:08 PM
3 Kudos
NiFi restricted components are those processors, controller services, or reporting tasks that have the ability to run user-defined code or to access and alter data on the local filesystem. The NiFi User Guide explains this as follows:

"Restricted components will be marked with a restricted icon next to their name. These are components that can be used to execute arbitrary unsanitized code provided by the operator through the NiFi REST API/UI or can be used to obtain or alter data on the NiFi host system using the NiFi OS credentials. These components could be used by an otherwise authorized NiFi user to go beyond the intended use of the application, escalate privilege, or could expose data about the internals of the NiFi process or the host system. All of these capabilities should be considered privileged, and admins should be aware of these capabilities and explicitly enable them for a subset of trusted users. Before a user is allowed to create and modify restricted components they must be granted access."

Users can only be restricted from adding such components if NiFi is secured. Users of an unsecured NiFi will always have access to all components.

Prior to HDF 3.2 or Apache NiFi 1.6, all restricted components were covered by a single authorization policy:

| Ranger Policy (Base policies): | NiFi Policies (Hamburger menu): | Ranger permissions description: |
|---|---|---|
| /restricted-components | Access restricted components | Read/View: N/A. Write/Modify: Gives granted users the ability to add components to the canvas that are tagged as "restricted". |

Lumping all components into one policy was not ideal, so NIFI-4885 was created to address this: a user's access to restricted components is now based on the level of restricted access they are being granted. The sub-policies are:

- read-filesystem
- read-distributed-filesystem
- write-filesystem
- write-distributed-filesystem
- execute-code
- access-keytab
- export-nifi-details

In order to avoid backward-compatibility issues when users upgrade to HDF 3.2+ or Apache NiFi 1.6.0+, the "Access restricted components" base policy still exists and defaults to "regardless of restrictions". In the NiFi global "Access Policies" UI, this is the default policy. In Ranger, it is still associated with just the "/restricted-components" resource. The new sub-policies are depicted as follows in the Ranger and NiFi UIs:

| Ranger Policy (Base policies): | NiFi Policies (Hamburger menu): | Ranger permissions description: |
|---|---|---|
| /restricted-components/read-filesystem | Access restricted components. Sub-policy: Requiring 'read filesystem' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring read filesystem. |
| /restricted-components/read-distributed-filesystem | Access restricted components. Sub-policy: Requiring 'read distributed filesystem' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring read distributed filesystem. |
| /restricted-components/write-filesystem | Access restricted components. Sub-policy: Requiring 'write filesystem' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring write filesystem. |
| /restricted-components/write-distributed-filesystem | Access restricted components. Sub-policy: Requiring 'write distributed filesystem' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring write distributed filesystem. |
| /restricted-components/execute-code | Access restricted components. Sub-policy: Requiring 'execute code' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring execute code. |
| /restricted-components/access-keytab | Access restricted components. Sub-policy: Requiring 'access keytab' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring access keytab. |
| /restricted-components/export-nifi-details | Access restricted components. Sub-policy: Requiring 'export nifi details' | Read/View: N/A. Write/Modify: Allows users to create/modify restricted components requiring export nifi details. |

Below is a list of restricted components for each of the above sub-policies (current as of CFM 2.1.1 and Apache NiFi 1.13):

Read-filesystem:

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| FetchFile | Processor | Provides operator the ability to read from any file that NiFi has access to. |
| TailFile | Processor | Provides operator the ability to read from any file that NiFi has access to. |
| GetFile | Processor | Provides operator the ability to read from any file that NiFi has access to. |

Read-distributed-filesystem (added in NiFi 1.13):

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| FetchHDFS | Processor | Provides operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem. |
| FetchParquet | Processor | Provides operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem. |
| GetHDFS | Processor | Provides operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem. |
| GetHDFSSequenceFile | Processor | Provides operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem. |
| MoveHDFS | Processor | Provides operator the ability to retrieve any file that NiFi has access to in HDFS or the local filesystem. |

Write-filesystem:

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| FetchFile | Processor | Provides operator the ability to delete any file that NiFi has access to. |
| GetFile | Processor | Provides operator the ability to delete any file that NiFi has access to. |
| PutFile | Processor | Provides operator the ability to write to any file that NiFi has access to. |

Write-distributed-filesystem (added in NiFi 1.13):

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| DeleteHDFS | Processor | Provides operator the ability to delete any file that NiFi has access to in HDFS or the local filesystem. |
| GetHDFS | Processor | Provides operator the ability to delete any file that NiFi has access to in HDFS or the local filesystem. |
| GetHDFSSequenceFile | Processor | Provides operator the ability to delete any file that NiFi has access to in HDFS or the local filesystem. |
| MoveHDFS | Processor | Provides operator the ability to delete any file that NiFi has access to in HDFS or the local filesystem. |
| PutHDFS | Processor | Provides operator the ability to delete any file that NiFi has access to in HDFS or the local filesystem. |
| PutParquet | Processor | Provides operator the ability to write any file that NiFi has access to in HDFS or the local filesystem. |

Execute-code:

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| ScriptedReportingTask | Reporting Task | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ScriptedLookupService | Controller Service | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ScriptedReader | Controller Service | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ScriptedRecordSetWriter | Controller Service | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ExecuteFlumeSink | Processor | Provides operator the ability to execute arbitrary Flume configurations assuming all permissions that NiFi has. |
| ExecuteFlumeSource | Processor | Provides operator the ability to execute arbitrary Flume configurations assuming all permissions that NiFi has. |
| ExecuteGroovyScript | Processor | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ExecuteProcess | Processor | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ExecuteScript | Processor | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| ExecuteStreamCommand | Processor | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |
| InvokeScriptedProcessor | Processor | Provides operator the ability to execute arbitrary code assuming all permissions that NiFi has. |

Access-keytab:

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| KeytabCredentialsService | Controller Service | Allows user to define a keytab and principal that can then be used by other components. |

Export-nifi-details:

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| SiteToSiteBulletinReportingTask | Reporting Task | Provides operator the ability to send sensitive details contained in bulletin events to any external system. |
| SiteToSiteProvenanceReportingTask | Reporting Task | Provides operator the ability to send sensitive details contained in provenance events to any external system. |

***Note: Some components may be found under multiple sub-policies above. In order for a user to utilize such a component, they must be granted access to every sub-policy required by that component.

Exceptions in HDF 3.2 and Apache NiFi 1.7 and 1.8: in order to use the following component, users must have full access to all restricted components policies:

| NiFi component: | Component type: | Access provisions: |
|---|---|---|
| PutORC | Processor | This component requires access to restricted components regardless of restriction. Apache Jira: NIFI-5815 |

A full breakdown of all other NiFi policies can be found here: NiFi Ranger based policy descriptions - Cloudera Community
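If you prefer to inspect these policies programmatically rather than through the UI, NiFi's REST API exposes access policies at /nifi-api/policies/{action}/{resource}. Below is a minimal sketch, assuming a secured NiFi at nifi-host:8443 and a bearer token in $TOKEN (both placeholders); it retrieves whatever policy currently governs the execute-code sub-policy:

# Fetch the Write/Modify policy for the "execute code" sub-policy
# (hostname, port, and $TOKEN are placeholders for your environment).
curl -k -H "Authorization: Bearer $TOKEN" \
  "https://nifi-host:8443/nifi-api/policies/write/restricted-components/execute-code"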
11-12-2018
02:14 PM
@Louis Allen I believe you have found your problem in your NiFi node certificates:

ExtendedKeyUsages [ serverAuth ]

The NiFi nodes establish the connection to the NiFi Registry in order to retrieve bucket information. That means NiFi is acting as a client, not a server, in the two-way TLS handshake. Since the NiFi nodes have no client certificate to offer NiFi Registry for the purpose of authentication and then authorization, the nodes are failing to retrieve the bucket listing.

NiFi certs must support both clientAuth and serverAuth. NiFi Registry is just one example of why this is important. Using NiFi's Site-To-Site capability via Remote Process Groups (RPGs) is another example of when clientAuth is required to send data between two secured NiFi instances.

You are going to need to generate new certificates for all your NiFi nodes that support both clientAuth and serverAuth.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
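To confirm what a node's certificate supports, do a verbose keystore listing and check the ExtendedKeyUsages section; a minimal sketch, assuming a JKS keystore (path and password are placeholders):

# Inspect the node certificate's extended key usage; it should list
# both clientAuth and serverAuth (path and password are placeholders).
keytool -list -v -keystore /path/to/keystore.jks -storepass changeit | grep -A 3 "ExtendedKeyUsages"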
11-12-2018
12:21 PM
1 Kudo
@sally sally The ERROR shows that the endpoint you are trying to connect to is a secured https connection on port 443. This means that some form of SSL handshake needs to take place, which means you will need to configure your InvokeHTTP processor to use an SSLContextService.

Does this endpoint require user authentication? If yes, your SSLContextService will require both a keystore and a truststore. If no, your SSLContextService will only require a truststore.

The keystore would contain your client certificate (PrivateKeyEntry), which the endpoint must be able to trust in order to authenticate the client user who is connecting. If a client certificate is not provided, most endpoints will simply close the connection.

The truststore is used by your client (NiFi is your client in this scenario) to verify trust of the server certificate presented by your endpoint. This truststore must contain the complete certificate trust chain for your target endpoint. You can use the following openssl client command to get output that shows the complete certificate trust chain for your target:

openssl s_client -connect 104.28.29.206:443

You will see a lot of output, but look for the certificate chain section. You may see one or more certificates in the chain. Each certificate will have an owner (o) and issuer (i). Your truststore must contain a "trustedCertEntry" for every one of those issuers.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
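Once you have saved each issuer certificate from that openssl output to its own PEM file, importing it into the truststore is a one-liner per issuer; a minimal sketch, with file name, alias, and password as placeholders:

# Import one issuer certificate from the chain as a trustedCertEntry;
# repeat for each issuer (file, alias, and password are placeholders).
keytool -importcert -trustcacerts -alias issuer-ca-1 -file issuer-ca-1.pem -keystore truststore.jks -storepass changeit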
11-08-2018
01:40 PM
@Louis Allen The error "Failed to connect node to cluster because local flow controller partially updated." indicates you are having an authentication issue and not an authorization issue (yet). When communicating with a secured Registry, two-way TLS authentication is required between NiFi and NiFi Registry.

Use the following commands to get the full certificate chain from NiFi and NiFi Registry:

openssl s_client -connect <nifi-node>:<nifi-port> | grep -A 10 chain
openssl s_client -connect <nifi-registry>:<registry-port> | grep -A 10 chain

10 lines should be more than enough to capture the full chain, but if not, you may need to increase this number.

Things to verify on the NiFi side (see the keytool sketch after this answer):
1. Perform a verbose listing on the keystore used by each NiFi node and verify that each contains only a single "PrivateKeyEntry" and that the "PrivateKeyEntry" has an ExtendedKeyUsage that supports both clientAuth and serverAuth.
2. Perform a verbose listing on the truststore used by each NiFi node and verify that every CA returned by the openssl command run against the registry is found as a "TrustedCertEntry". (This may be one or more intermediate CAs and a root CA.)

Things to verify on the NiFi Registry side:
1. Perform a verbose listing on the truststore used by the NiFi Registry and verify that every CA returned by the openssl command run against each NiFi node is found as a "TrustedCertEntry". (This may be one or more intermediate CAs and a root CA.)

Each NiFi node should appear in your list of users and be granted both "Can proxy user requests" and "READ" access on the "Can manage buckets" policies. If you have not set up any identity mapping patterns in NiFi Registry, the NiFi nodes' user names will be the complete node DN.

Thank you, Matt
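For the verbose listings mentioned above, a minimal keytool sketch (store path and password are placeholders):

# Verbose listing of a keystore or truststore (path and password are placeholders).
# Node keystore: expect exactly one "Entry type: PrivateKeyEntry" whose
# ExtendedKeyUsages cover both clientAuth and serverAuth.
# Truststores: expect a "trustedCertEntry" for every CA in the chain.
keytool -list -v -keystore /path/to/store.jks -storepass changeit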
11-07-2018
05:27 PM
@Emma Ixiato *** Community Forum Tip: Try to avoid starting a new answer in response to an existing answer. Instead, use comments to respond to existing answers. There is no guaranteed order to different answers, which can make a discussion hard to follow.

For performance reasons, the method used to determine which source files are listed is very simplistic in nature and relies on the accuracy of the timestamps on the source files.

Let's assume that within a source directory you have a 1 kb file written with a timestamp accurate down to seconds (2018-11-06 16:49:22), which NiFi lists. While that listing is occurring, another file is being written to the same directory but has not completed being written, so it is still a "." (hidden) file, which the list processor ignores by default. The ListSFTP processor records the timestamp of the newest listed source file in state. On the next execution of ListSFTP, only files with a timestamp newer than "2018-11-06 16:49:22" will be listed. It is possible that the file still being written was completed and renamed (to remove the ".") within the same second as the last listing; NiFi would then exclude it from the next listing. Another possibility is that the system writing the files to the source SFTP server directories is not updating the LastModified timestamps on the "new" files, resulting in some "new" source files having older timestamps.

If any of this is the case, perhaps https://jira.apache.org/jira/browse/NIFI-5406, which is included in Apache NiFi 1.8.x, will help.

Thank you, Matt
11-07-2018
05:06 PM
@narasimha chembolu The ListAzureBlobStorage processor is designed to produce a FlowFile for each blob listed from the target storage. A set of attributes, including "azure.blobname", is written to each FlowFile produced.

The FetchAzureBlobStorage processor is triggered to execute by each incoming FlowFile it receives. By default it uses the value assigned to the FlowFile's "azure.blobname" attribute by ListAzureBlobStorage to determine which blob to fetch.

In your case you are only looking to fetch one very specific blob, so you configured the "Blob" property in the FetchAzureBlobStorage processor to always get that specific blob. This means that every incoming FlowFile is going to fetch the content of that same blob and insert it into the content of every listed FlowFile.

So your flow is working as designed, but not as you intended.

You have two options:
1. Reconfigure your FetchAzureBlobStorage processor to use "${azure.blobname}" in the Blob property. Then add a RouteOnAttribute processor between the ListAzureBlobStorage and FetchAzureBlobStorage processors to filter on the specific blob name you are looking for, so that only that listed FlowFile makes it to the FetchAzureBlobStorage processor (a property sketch follows below).
2. Don't use the ListAzureBlobStorage processor at all. Instead, use a GenerateFlowFile processor to generate a single 0-byte FlowFile on the primary node and use it to trigger the FetchAzureBlobStorage processor to fetch the specific blob you want.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
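For option 1, the RouteOnAttribute configuration is a single dynamic property; a minimal sketch, where the property name and blob name are placeholders for illustration:

Routing Strategy: Route to Property name
Property: targetBlob
Value: ${azure.blobname:equals('my-target-blob.csv')}

FlowFiles that match are routed to the new "targetBlob" relationship, which you connect to FetchAzureBlobStorage; the "unmatched" relationship can be auto-terminated.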
11-06-2018
11:21 AM
@nisrine elloumi Since the target NiFi configured in the Remote Process Group has been secured, your MiNiFi agent will need a keystore and truststore as well. The keystore must contain a single PrivateKeyEntry that will be trusted by a TrustedCertEntry in the truststore of your NiFi. The truststore on your MiNiFi agent must contain a TrustedCertEntry that is capable of establishing trust of the certificate coming from the target NiFi.

If your target NiFi has been secured using the NiFi CA, you can use the TLS toolkit to generate a new certificate for your MiNiFi agent. The following can be found in the NiFi Admin Guide:

Client: The client can be used to request new certificates from the CA. The client utility generates a keypair and Certificate Signing Request (CSR) and sends the CSR to the Certificate Authority. The client is invoked by running ./bin/tls-toolkit.sh client -h, which prints the usage information along with descriptions of the options that can be specified. You can use the following command line options with the tls-toolkit in client mode:
| Option | Description |
|---|---|
| -a, --keyAlgorithm <arg> | Algorithm to use for generated keys (default: RSA) |
| -c, --certificateAuthorityHostname <arg> | Hostname of NiFi Certificate Authority (default: localhost) |
| -C, --certificateDirectory <arg> | The directory to write the CA certificate (default: .) |
| --configJsonIn <arg> | The place to read configuration info from, implies useConfigJson if set (default: configJson value) |
| -D, --dn <arg> | The DN to use for the client certificate (default: CN=<localhost name>,OU=NIFI) (auto-populated by the tool) |
| -f, --configJson <arg> | The place to write configuration info (default: config.json) |
| -F, --useConfigJson | Flag specifying that all configuration is read from configJson to facilitate automated use (otherwise configJson will only be written to) |
| -g, --differentKeyAndKeystorePasswords | Use different generated password for the key and the keystore |
| -h, --help | Print help and exit |
| -k, --keySize <arg> | Number of bits for generated keys (default: 2048) |
| -p, --PORT <arg> | The port to use to communicate with the Certificate Authority (default: 8443) |
| --subjectAlternativeNames <arg> | Comma-separated list of domains to use as Subject Alternative Names in the certificate |
| -T, --keyStoreType <arg> | The type of keystores to generate (default: jks) |
| -t, --token <arg> | The token to use to prevent MITM (required and must be same as one used by CA) |

After running the client you will have the CA's certificate, a keystore, a truststore, and a config.json with information about them as well as their passwords.

You will need to collect the needed information about your CA. You could run this toolkit right on the server where the NiFi CA was installed (for example, from the /tmp directory). Then you just need to move the produced keystore and truststore files over to your MiNiFi agent and configure the yaml file to use them.

If you are using keystores and truststores generated by some other CA, you will need to follow whatever procedure that authority requires to generate the new keystore and truststore needed for your MiNiFi agent. Just make sure the criteria I outlined at the beginning are met so that the two-way TLS handshake will be successful.

Thank you, Matt

If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.
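Putting the options above together, an illustrative client invocation might look like the sketch below; the CA hostname, token, and Subject Alternative Name are placeholders, and the token must match the one configured on the NiFi CA:

# Request a new certificate from the NiFi CA for this MiNiFi agent
# (CA hostname, token, and SAN below are placeholders).
./bin/tls-toolkit.sh client -c nifi-ca.example.com -t mySharedCAToken --subjectAlternativeNames minifi1.example.com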
11-05-2018
04:18 PM
3 Kudos
@Diego A Labrador In addition to making sure that your user has been granted "view the data", you will want to grant the same policy to all your NiFi nodes. When a user logged in to node 1, for example, requests to list a queue, that request is replicated to all nodes. The other nodes return their listing results to the node where the request originated. If the originating node has not been granted permission to see data on the other nodes, the listing will not get displayed.

Thank you, Matt
10-31-2018
01:07 PM
@sri chaturvedi That specific property controls how long NiFi will potentially hold on to content claims that have been successfully moved to archive. It works in conjunction with the "nifi.content.repository.archive.max.usage.percentage=50%" property.

Neither of these properties will result in the clean-up/removal of any content claims that have not been moved to archive. If the disk usage of the content repository disk has exceeded 50%, the "archive" sub-directories in the content repo should all be empty.

This does not mean that the active content claims will not continue to increase in number until the content repo disk is 100% full.

Nothing about how the content repository works will have any impact on NiFi UI performance. A full content repository will, however, impact the overall performance of your NiFi dataflows.

Hope this answers your question.
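For reference, the two properties discussed above appear in nifi.properties as follows (the values shown are the common defaults; tune them for your environment):

# Archived claims are purged once they exceed this age...
nifi.content.repository.archive.max.retention.period=12 hours
# ...or once content repository disk usage crosses this threshold.
nifi.content.repository.archive.max.usage.percentage=50%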
10-31-2018
12:55 PM
7 Kudos
NiFi's content repository will hold on to a claim until no active FlowFile anywhere on the canvas references that claim. This can result in very old or very large claims being left in the content repo using up space. Something as small as a single 0-byte FlowFile sitting in some queue may end up preventing a multi-gigabyte content claim from being removed. The intent of this article is to help users understand how to find those active FlowFiles and clear them from the dataflow.

There is no simple UI feature or NiFi rest-api endpoint users can use to return information linking FlowFiles to existing content claims; however, all is not lost. Users can add an additional indexed field to their provenance configuration so that it starts indexing the "ContentClaimIdentifier" associated with each FlowFile event generated. This gives you the ability to use provenance to identify all the FlowFiles associated with a specific claim in the content repo.

To add this, simply add "ContentClaimIdentifier" to the list of existing indexed fields via the nifi.properties file. Here is the specific property you will be editing:

nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship, ContentClaimIdentifier

NOTE: a restart of NiFi will need to occur before indexing of this new field will begin. NiFi will not go back and re-index existing events. It will only start adding this new indexed field to events created from this point forward.

You can then search your content repo for any content claim files that are of concern (for example, any very old or very large claims); a helper sketch for this search follows at the end of this article. Simply copy the claim number (the filename of the claim) and search for it via a provenance query.

Once you have your list of FlowFile events all tied to the same claim (an example provenance query might return three FlowFiles), you will need to look at the lineage of each of those FlowFiles to see if any of them made it to a DROP event. (A DROP event means the FlowFile is no longer anywhere in your dataflows.) Lineage can be displayed by clicking the "show lineage" icon to the right of any event in the list.

Once you have found one or more FlowFiles with no DROP event, you will want to get the details of the last event by right-clicking on it. From those details you can collect the Component ID of the processor that produced that event.

Go back to your main canvas and search on that component ID. Your FlowFile will be located in one of the outbound relationships of that component.

In my example, the FlowFile was found in the "success" relationship of an UpdateAttribute processor. You will also notice that this connection contains many FlowFiles. If I only want to purge this one FlowFile, I am going to need to insert a RouteOnAttribute processor into this flow, configured to route only the FlowFile with a specific FlowFile UUID, which I could also get from my provenance event details above.

The property added to the RouteOnAttribute processor would be as simple as:

Property: purge
Value: ${uuid:equals('8297d10c-f6ca-4843-9593-320e5b265dd6')}

"purge" becomes the new relationship, which you can then just auto-terminate. The "unmatched" relationship gets routed on in your flow and will contain every other FlowFile from this connection.

*** Please feel free to post comments on this article if you have questions or see anything that needs to be added/clarified.

Thank you, Matt
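As a hedged helper for the "search your content repo" step mentioned above, something like the following can surface candidate claims on a Linux host; the repository path and the size/age thresholds are placeholders to adjust for your install:

# List non-archived content claim files larger than 1 GB or older than 30 days,
# biggest first (repository path and thresholds are placeholders).
find /opt/nifi/content_repository -type f \( -size +1G -o -mtime +30 \) -not -path "*/archive/*" -printf "%s %p\n" | sort -rn | head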