Member since
06-16-2020
36
Posts
2
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1684 | 06-16-2023 07:23 AM
10-12-2024
11:11 AM
@SAMSAL - Can you help out with this?
10-07-2024
11:02 AM
All, I created a new processor using Python 3.12 with the NiFi 2.0.0 M4 release. My directory structure follows the documentation; I don't include anything under bundled dependencies, and my MANIFEST.MF looks like this:

Manifest-Version: 1.0
Build-Timestamp: 2024-10-07T16:22:20Z
Nar-Id: processors-nar
Nar-Group: processors
Nar-Version: 0.0.2

I zipped everything into a .nar called TransformOpenskyStates.nar and added it to my NiFi_home/extensions directory. In the logs I see this:

2024-10-07 13:44:34,481 INFO [NAR Auto-Loader] org.apache.nifi.nar.NarAutoLoaderTask Found ./extensions/TransformOpenskyStates.nar in auto-load directory
2024-10-07 13:44:39,487 INFO [NAR Auto-Loader] org.apache.nifi.nar.StandardNarLoader Starting load process for 1 NARs...
2024-10-07 13:44:39,494 INFO [NAR Auto-Loader] org.apache.nifi.nar.StandardNarLoader Creating class loaders for 1 NARs...
2024-10-07 13:44:39,496 INFO [NAR Auto-Loader] org.apache.nifi.nar.NarClassLoaders Loaded NAR file: /Users/drewnicolette/Downloads/nifi-2.0.0-M4/./work/nar/extensions/TransformOpenskyStates.nar-unpacked as class loader org.apache.nifi.nar.NarClassLoader[./work/nar/extensions/TransformOpenskyStates.nar-unpacked]
2024-10-07 13:44:39,496 INFO [NAR Auto-Loader] org.apache.nifi.nar.StandardNarLoader Successfully created class loaders for 1 NARs, 0 were skipped
2024-10-07 13:44:39,498 INFO [NAR Auto-Loader] o.a.n.n.StandardExtensionDiscoveringManager Loaded extensions for processors:processors-nar:0.0.2 in 2 millis
2024-10-07 13:44:39,499 INFO [NAR Auto-Loader] org.apache.nifi.nar.StandardNarLoader Finished NAR loading process!

But I can't find my processor in the UI/canvas at all. Does anyone know the issue?
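Since the logs show the NAR loading cleanly, a structural problem inside the archive is a likely suspect. This is a minimal sketch of a script that sanity-checks a NAR's layout before dropping it into extensions/; the expected entries (manifest keys, a .py source at some path in the archive) are assumptions based on the manifest shown above, not an official validator, so adjust them to your actual layout.

```python
# Sketch: sanity-check a NAR archive (a plain zip) before auto-loading it.
# The required manifest keys mirror the MANIFEST.MF from the post; the
# expectation that processor .py sources appear somewhere in the archive
# is an assumption for illustration.
import io
import zipfile

def check_nar(path_or_buf):
    """Return a list of problems found in the NAR archive (empty = looks OK)."""
    problems = []
    with zipfile.ZipFile(path_or_buf) as nar:
        names = nar.namelist()
        if "META-INF/MANIFEST.MF" not in names:
            problems.append("missing META-INF/MANIFEST.MF")
        else:
            manifest = nar.read("META-INF/MANIFEST.MF").decode("utf-8")
            for key in ("Nar-Id", "Nar-Group", "Nar-Version"):
                if key not in manifest:
                    problems.append(f"manifest missing {key}")
        if not any(n.endswith(".py") for n in names):
            problems.append("no .py processor sources found in archive")
    return problems

# Build a tiny in-memory NAR to demonstrate the check.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("META-INF/MANIFEST.MF",
               "Manifest-Version: 1.0\nNar-Id: processors-nar\n"
               "Nar-Group: processors\nNar-Version: 0.0.2\n")
    z.writestr("TransformOpenskyStates.py", "# processor source")
buf.seek(0)
print(check_nar(buf))
```

A common pitfall this catches is zipping the parent folder instead of its contents, which shifts META-INF/ one directory deeper than NiFi expects.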
Labels:
- Apache NiFi
05-23-2024
05:26 AM
@RAGHUY Thank you for your response. I tried manually resetting the policies and also clearing the HBase cache, but I still get the same behavior. I am not changing any HBase policies, since authorization there uses a group; I am only changing group membership in LDAP. After I rerun Usersync, I can see the user has been removed from that group in the Ranger Admin UI. However, I can still access the table on the first call; only on the second call is access denied. The opposite happens too: when I add a user to a group, the first call fails and the second succeeds. In the HBase REST server logs I see "unauthorized" errors and a Kerberos thread unexpectedly exiting. Do you think that has something to do with it?
05-22-2024
04:58 PM
I have a Cloudera cluster up and running. Knox forwards requests to WebHBase, and HBase uses Ranger for authorization. Ranger is connected to FreeIPA LDAP, and we use Kerberos internally for authentication. In Ranger I have a policy that grants read access to a table, and a group from my FreeIPA instance is included in that policy. Here is the issue: when I remove a member from that group in FreeIPA and rerun Ranger Usersync, a curl call to get data from the table still works; only when I run it a second time do I get the expected denial. This has been a consistent pattern for all changes to user membership in LDAP groups. The Ranger HBase policy sync and Usersync are both working as expected: after Usersync runs, I can confirm the user has been removed from the group in Ranger, yet access is still allowed. Does anyone know why? There are similar Kafka and HDFS policies where I access those resources using the same group, and they work on the first call, but for HBase it takes two calls to behave correctly. Any help would be greatly appreciated!
10-11-2023
10:52 AM
Maybe you could reset the state via the NiFi REST API, either at the beginning of your flow or separately on a cron schedule every morning. It could be: POST "https://[ip:port]/nifi-api/processors/${processor-id}/state/clear-requests" This is the request NiFi itself uses when you go to the processor in the UI and choose the menu option "View state" -> "Clear state".
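The endpoint above is easy to drive from a script. Here is a minimal sketch that builds (but does not send) that POST; the host, processor id, and bearer token are placeholders, and you would still need to handle TLS and authentication for your instance.

```python
# Sketch: construct the clear-state POST for the endpoint named above.
# Host, processor id, and token are placeholders for illustration.
from urllib import request

def clear_state_request(base_url, processor_id, token):
    """Build (not send) the POST that clears a processor's stored state."""
    url = f"{base_url}/nifi-api/processors/{processor_id}/state/clear-requests"
    req = request.Request(url, method="POST")
    req.add_header("Authorization", f"Bearer {token}")
    return req

req = clear_state_request("https://nifi.example.com:8443",
                          "0184a1b2-0000-1000-abcd-000000000000", "TOKEN")
print(req.method, req.full_url)
# Sending it would be: request.urlopen(req)  (with appropriate TLS context)
```

Scheduled from cron, this gives you the "reset every morning" behavior without touching the flow itself.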
09-26-2023
11:24 AM
@Abhiram-4455 What's the input look like?
06-28-2023
06:19 AM
I am looking at the Kafka policies in my current Ranger instance. There is a policy called "service_all - cluster", which shows two allow conditions in the UI. However, when I run the API call to get all the Kafka policies and search for "service_all - cluster", this is the result: <policies>
<id>11</id>
<guid>dbbd8ed1-2bc6-452d-991e-28082727e3cf</guid>
<isEnabled>true</isEnabled>
<version>1</version>
<service>cm_kafka</service>
<name>service_all - cluster</name>
<policyType>0</policyType>
<policyPriority>0</policyPriority>
<description>Service Policy for all - cluster</description>
<isAuditEnabled>true</isAuditEnabled>
<resources>
<entry>
<key>cluster</key>
<value>
<values>*</values>
<isExcludes>false</isExcludes>
<isRecursive>false</isRecursive>
</value>
</entry>
</resources>
<policyItems>
<accesses>
<type>configure</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>describe</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>kafka_admin</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>create</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>idempotent_write</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>describe_configs</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>alter_configs</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>cluster_action</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>alter</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>publish</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>consume</type>
<isAllowed>true</isAllowed>
</accesses>
<accesses>
<type>delete</type>
<isAllowed>true</isAllowed>
</accesses>
<users>cruisecontrol</users>
<users>streamsmsgmgr</users>
<users>kafka</users>
<delegateAdmin>true</delegateAdmin>
</policyItems>
<policyItems>
<accesses>
<type>describe</type>
<isAllowed>true</isAllowed>
</accesses>
<users>rangerlookup</users>
<delegateAdmin>false</delegateAdmin>
</policyItems>
<serviceType>kafka</serviceType>
<options/>
<zoneName/>
<isDenyAllElse>false</isDenyAllElse>
</policies>

Here you can see there are three extra accesses (publish, consume, delete) that aren't showing up in the user interface. Yesterday I did a full reimport of all the Kafka policies and it fixed the issue, but after a restart of Ranger it happened again. I checked the underlying database and it's consistent with the user interface, but again the API call includes those three extra accesses. Does anyone know what happens after a restart that causes the API call to differ from the user interface?
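To make the UI-versus-API comparison mechanical rather than eyeball-driven, the access types can be pulled out of the policy XML and diffed against what the UI shows. This sketch embeds a trimmed version of the policy above; the `ui_accesses` set is a placeholder you would fill in by hand from the UI.

```python
# Sketch: extract access types from a Ranger policy XML response and
# diff them against a hand-copied list from the UI. The embedded XML is
# a trimmed-down version of the policy in the post.
import xml.etree.ElementTree as ET

POLICY_XML = """
<policies>
  <name>service_all - cluster</name>
  <policyItems>
    <accesses><type>configure</type><isAllowed>true</isAllowed></accesses>
    <accesses><type>publish</type><isAllowed>true</isAllowed></accesses>
    <accesses><type>consume</type><isAllowed>true</isAllowed></accesses>
    <accesses><type>delete</type><isAllowed>true</isAllowed></accesses>
    <users>kafka</users>
  </policyItems>
</policies>
"""

def access_types(xml_text):
    """Sorted list of every <type> under an <accesses> element."""
    root = ET.fromstring(xml_text)
    return sorted(a.findtext("type") for a in root.iter("accesses"))

api_accesses = set(access_types(POLICY_XML))
ui_accesses = {"configure"}                    # placeholder: copy from the UI
print(sorted(api_accesses - ui_accesses))      # the "extra" accesses
```

Running this against the API output before and after a Ranger restart would pin down exactly when the extra access types appear.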
Labels:
- Apache Ranger
06-21-2023
07:14 AM
@cotopaul - It's taking in JSON, writing to Parquet, and only doing literal value replacements (i.e. adding 5 fields to each record). Three of those fields just add attribute values and literal values to each record, and the other two do minor date manipulation (converting dates to epoch).
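For context on the "dates to epoch" piece, this is what that conversion looks like outside NiFi. The input format below is a guess for illustration, not something stated in the post.

```python
# Sketch: convert a date string to epoch milliseconds, mirroring the
# "minor date manipulation" described above. The input format and UTC
# assumption are illustrative guesses.
from datetime import datetime, timezone

def to_epoch_millis(date_str, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse date_str with fmt (assumed UTC) and return epoch milliseconds."""
    dt = datetime.strptime(date_str, fmt).replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

print(to_epoch_millis("2023-06-21 00:00:00"))
```

Per-record work this small is rarely the bottleneck by itself; the cost usually sits in record (de)serialization and the JSON-to-Parquet conversion around it.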
06-21-2023
05:19 AM
@steven-matison - Thanks for the response. If we scope it to just the UpdateRecord processor, are there any things from an infrastructure or configuration standpoint that would make it more efficient, assuming I can't scale up or tune processor concurrency?
06-20-2023
05:54 AM
I have been using multiple record-oriented processors (ConvertRecord, UpdateRecord, etc.) in various parts of my flow. For example, my UpdateRecord processor takes about 16 seconds to read a 30 MB flowfile, add some fields to each record, and convert the data to Parquet. I want to reduce this time significantly. My current infrastructure is a 2-node e2-standard-4 cluster in GCP running CentOS 7; each instance has 4 vCPUs and 16 GB RAM, and each repository (content, provenance, flowfile) is on its own SSD persistent disk. Most of the NiFi configs are the defaults. What would you recommend, either from an infrastructure or a NiFi standpoint, to improve performance on these processors?
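A quick back-of-envelope from the numbers in the post helps frame the tuning discussion: 30 MB in roughly 16 s is under 2 MB/s per task. The sketch below computes that baseline and an idealized linear scale-up with concurrent tasks (real flows rarely scale perfectly linearly, since CPU, disk, and repository contention intervene).

```python
# Back-of-envelope throughput from the figures in the post: 30 MB per
# ~16 s UpdateRecord task, plus an idealized linear-concurrency estimate.
def throughput_mb_s(size_mb, seconds, concurrent_tasks=1):
    """MB/s for one task, naively multiplied by concurrent task count."""
    return size_mb / seconds * concurrent_tasks

base = throughput_mb_s(30, 16)
print(f"{base:.3f} MB/s with 1 task")
print(f"{throughput_mb_s(30, 16, 4):.1f} MB/s with 4 tasks (ideal scaling)")
```

If the measured single-task number is far below what the SSDs and CPUs can sustain, the bottleneck is likely in record (de)serialization or schema handling rather than raw I/O.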
Labels:
- Apache NiFi