Member since: 03-01-2016
Posts: 45
Kudos Received: 78
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1569 | 01-19-2018 10:46 PM
 | 4137 | 01-18-2017 08:09 PM
 | 5156 | 11-21-2016 10:15 PM
 | 3050 | 11-07-2016 02:09 AM
 | 3173 | 11-04-2016 09:31 PM
07-13-2018
02:31 AM
@sunile.manjee I was looking at the code in Ambari below, specific to metric alerts, and it seems that the default_port value is not being used as it is in other alert types: https://github.com/apache/ambari/blob/branch-2.6/ambari-agent/src/main/python/ambari_agent/alerts/metric_alert.py#L87-L110 The function used to extract the uri appears to only use the default port if the uri isn't defined: https://github.com/apache/ambari/blob/branch-2.6/ambari-agent/src/main/python/ambari_agent/alerts/base_alert.py#L331-L389 Also, in other examples I've seen, the uri is parameterized with a value from Ambari configuration, and when looking at the Ambari config the port value is actually included with the uri rather than separated out. See the example in the "Create Custom Alert JSON file" section here: https://community.hortonworks.com/articles/143762/how-to-create-a-custom-ambari-alert-and-use-it-for.html It may be worthwhile to try including the port value with the uri.
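For reference, a rough sketch of what that could look like in a custom metric alert definition, where the parameterized property already carries host:port so no default_port is needed (the hdfs-site properties here are illustrative placeholders, not taken from the thread):

"source": {
  "type": "METRIC",
  "uri": {
    "http": "{{hdfs-site/dfs.namenode.http-address}}",
    "https": "{{hdfs-site/dfs.namenode.https-address}}",
    "https_property": "{{hdfs-site/dfs.http.policy}}",
    "https_property_value": "HTTPS_ONLY"
  }
}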
04-23-2018
01:18 PM
1 Kudo
Hi @Leo Gallucci, I know in the Apache NiFi community there have been conversations around Kubernetes, but within HDF engineering we are definitely focused on it. I don't have a timeline, but we are looking to support NiFi on Kubernetes in a release very soon.
04-20-2018
08:20 PM
1 Kudo
Hi @Raj ji, You can try uninstalling the mpack with the following command:

ambari-server uninstall-mpack --mpack-name=hdf-ambari-mpack --verbose

And then installing the appropriate management pack:

ambari-server install-mpack --mpack=hdf-ambari-mpack-3.0.2.0-76.tar.gz --verbose

Make sure to stop Ambari first, and then start it after you've installed the correct mpack.
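Putting the whole sequence together, a minimal sketch (assuming the mpack tarball is in the current working directory):

ambari-server stop
ambari-server uninstall-mpack --mpack-name=hdf-ambari-mpack --verbose
ambari-server install-mpack --mpack=hdf-ambari-mpack-3.0.2.0-76.tar.gz --verbose
ambari-server start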
01-20-2018
07:12 PM
Thanks @bsaini, I've made the correction. Glad this worked for you!
01-19-2018
10:46 PM
1 Kudo
Hi @Changwon Choi, As @Jay Kumar SenSharma mentioned, the important part was updating the base url first. The problem is that without that update Ambari downloaded the older version of the registry (and other components). You can check this by doing the following on each host that has registry components installed:
1) Confirm that the correct version of registry was installed:

yum list installed | grep registry

This command should show the installed registry packages, for example:

registry_3_0_2_0_76

If that version is not in the list then perform the following on each host missing the correct version (if it is installed, skip to step 2):
a) Verify that /etc/yum.repos.d/ambari-hdp-*.repo has the updated HDF base url (* denotes 1 or 2 depending on your setup). If it does not, you will need to manually edit this file to set the correct base url for HDF.
b) Install the newest library:

yum install -y registry_3_0_2_0_76*

c) Ensure the stack recognizes the latest version:

hdf-select set registry 3.0.2.0-76

d) Continue to step 2 to check that the stack has the appropriate version selected.
2) Check the version of the component recognized by the stack:

hdf-select status | grep registry

This should show registry as the following:

registry - 3.0.2.0-76

If the version is not accurate (e.g. showing 3.0.0.0-453) then perform step 1c above and check the version again. Once the above is performed on all affected hosts, return to Ambari and restart Registry. One thing to note is that the above can be performed to manually correct other HDF components as well (e.g. NiFi, SAM). Please let me know if this helps!
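For convenience, the per-host check and fix could be condensed into a short script; a rough sketch (the version string 3.0.2.0-76 comes from the steps above, and it assumes the repo file already points at the correct HDF base url):

# Run on each host that has registry components installed
if ! yum list installed | grep -q registry_3_0_2_0_76; then
  yum install -y registry_3_0_2_0_76*
  hdf-select set registry 3.0.2.0-76
fi
hdf-select status | grep registry   # expect: registry - 3.0.2.0-76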
11-07-2017
10:26 PM
Yes that is correct, I apologize for not being clear. The Ambari code in the HDF management pack that manages the NiFi service has a reliance on an Ambari configuration that is no longer available in Ambari 2.6.0.
11-07-2017
06:56 PM
Yes the NiFi service has a reliance on an Ambari configuration that is no longer available in Ambari 2.6.0. The upcoming release removes that dependency.
11-06-2017
04:25 PM
The problem you experienced is because the current version of HDF 3.0.1.1 is only compatible with Ambari 2.5.1 (see the Component Availability Matrix here https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1.1/bk_release-notes/content/ch_hdf_relnotes.html#repo-location). There is an upcoming release of HDF that will address this problem for 2.6.0.
11-06-2017
03:53 PM
Hi Raymond, what version of Ambari is your cluster running?
06-15-2017
10:49 PM
@Johny Travolta Hello! How many nodes do you have in your cluster? If more than one (and you didn't set up NiFi using Ambari), I would check to make sure that the login-identity-providers.xml file is the same throughout (and has the correct settings). With Ambari (HDF), nodes should be configured with the same settings. My hunch is that if you have a clustered environment, one node is configured properly and one or more others may not be. So when a user logs in, the rest call behind the scenes may be routed to an improperly configured node, and that could explain why some logins work and some don't. If it's just one node, I would check the user search base settings that you have and whether some users may fall outside the scope of the LDAP search. I hope this helps!
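As a quick way to spot a mismatched node, you could compare checksums of the provider file across the cluster; a minimal sketch (the hostnames and conf path are assumptions, adjust for your install):

# A differing checksum points at a node with divergent login provider settings
for host in nifi-node1 nifi-node2 nifi-node3; do
  ssh "$host" md5sum /usr/hdf/current/nifi/conf/login-identity-providers.xml
done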
06-02-2017
03:12 PM
4 Kudos
With the latest release of Apache NiFi 1.2.0, the JoltTransformJson processor became a bit more powerful with an upgrade to the Jolt library (to version 0.1.0) and the introduction of expression language (EL) support. This now gives users the ability to create dynamic specifications for JSON transformation and to perform some data manipulation tasks all within the context of the processor. Internal caching has also been added to improve overall performance.
Let's take an example of transforming the Twitter json payload seen below:
{"created_at":"Wed Mar 29 02:53:48 +0000 2017","id":846918283102081024,"id_str":"846918283102081024","text":"CSUB falls to Georgia Tech 76-61 in NIT semifinal game. @Bakersfieldcali @BVarsityLive @CSUBAthletics @CSUB_MBB\u2026 https:\/\/t.co\/9e5dQesIbg","display_text_range":[0,140],"source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":2918922812,"id_str":"2918922812","name":"Felix Adamo","screen_name":"tbcpix","location":"Bakersfield Californian","url":null,"description":"Newspaper Photographer","protected":false,"verified":false,"followers_count":677,"friends_count":247,"listed_count":12,"favourites_count":1366,"statuses_count":3576,"created_at":"Thu Dec 04 18:46:27 +0000 2014","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/570251877397180416\/jL2kuB4f_normal.png","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/570251877397180416\/jL2kuB4f_normal.png","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/2918922812\/1483041284","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":"CSUB falls to Georgia Tech 76-61 in NIT semifinal game. 
@Bakersfieldcali @BVarsityLive @CSUBAthletics @CSUB_MBB @csubnews https:\/\/t.co\/yV2AHFdVLc","display_text_range":[0,121],"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"Bakersfieldcali","name":"The Bakersfield Cali","id":33055408,"id_str":"33055408","indices":[56,72]},{"screen_name":"BVarsityLive","name":"BVarsityLive","id":762418351,"id_str":"762418351","indices":[73,86]},{"screen_name":"CSUBAthletics","name":"CSUB Athletics","id":51115996,"id_str":"51115996","indices":[87,101]},{"screen_name":"CSUB_MBB","name":"\ud83c\udfc0CSUB Men's Hoops\ud83c\udfc0","id":2897931481,"id_str":"2897931481","indices":[102,111]},{"screen_name":"csubnews","name":"CSU Bakersfield","id":209666415,"id_str":"209666415","indices":[112,121]}],"symbols":[],"media":[{"id":846918121248047104,"id_str":"846918121248047104","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"medium":{"w":1200,"h":608,"resize":"fit"},"large":{"w":2048,"h":1038,"resize":"fit"},"small":{"w":680,"h":345,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}},{"id":846918179397906433,"id_str":"846918179397906433","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"large":{"w":2048,"h":1213,"resize":"fit"},"medium":{"w":1200,"h":711,"resize":"fit"},"small":{"w":680,"h":403,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}}]},"extended_entities":{"media":[{"id":846918121248047104,"id_str":"846918121248047104","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"medium":{"w":1200,"h":608,"resize":"fit"},"large":{"w":2048,"h":1038,"resize":"fit"},"small":{"w":680,"h":345,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}},{"id":846918179397906433,"id_str":"846918179397906433","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"large":{"w":2048,"h":1213,"resize":"fit"},"medium":{"w":1200,"h":711,"resize":"fit"},"small":{"w":680,"h":403,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}}]}},"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/9e5dQesIbg","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/846918283102081024","display_url":"twitter.com\/i\/web\/status\/8\u2026","indices":[113,136]}],"user_mentions":[{"screen_name":"Bakersfieldcali","name":"The Bakersfield 
Cali","id":33055408,"id_str":"33055408","indices":[56,72]},{"screen_name":"BVarsityLive","name":"BVarsityLive","id":762418351,"id_str":"762418351","indices":[73,86]},{"screen_name":"CSUBAthletics","name":"CSUB Athletics","id":51115996,"id_str":"51115996","indices":[87,101]},{"screen_name":"CSUB_MBB","name":"\ud83c\udfc0CSUB Men's Hoops\ud83c\udfc0","id":2897931481,"id_str":"2897931481","indices":[102,111]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1490756028329"}
In our case we want to accomplish several things when transforming this data in JoltTransformJson:
- Create a subset of the json data that contains id, tweet text, the in reply to fields, and a new flow_file_id field
- Match the "id" variable in the twitter payload based on a flow file variable and convert that to a new label (tweet_id)
- Set the tweet text to all lower case
- Set some default values for the in reply to fields that are null
- Add the flow file's unique id to the json data
Once the data has been transformed it will land on the file system as well as in a MongoDB repository.
Basic Flow of Twitter Data Transformation and Storage
Here's a close up of the specification in use:
[{
"operation": "shift",
"spec": {
"${id.var}": "tweet_id",
"text": "tweet_text",
"in_reply_to_*": "&"
}
},{
"operation": "modify-overwrite-beta",
"spec": {
"tweet_text": "=toLower"
}
},{
"operation": "modify-default-beta",
"spec": {
"~in_reply_to_status_id": 0,
"~in_reply_to_status_id_str": "",
"~in_reply_to_user_id": "",
"~in_reply_to_user_id_str": 0,
"~in_reply_to_screen_name": ""
}
},{
"operation": "default",
"spec":{
"flow_file_id" : "${uuid}"
}
}]
In the above you'll see we've accomplished this with a chain specification containing four operations (shift, modify-overwrite-beta, modify-default-beta, and default). The shift helps to define the fields needed for the final schema and translates those fields into new labels. Note the shift's specification uses expression language on the left side (${id.var}) that will evaluate to a value populated by the UpdateAttribute processor (this value could also be populated from the Variable Registry). The Jolt library will then attempt to match that value to the corresponding label in the incoming json data and change it to the new label (in this case "tweet_id") on the right.
The next operation uses modify-overwrite-beta to ensure that for all the tweet text coming in we apply the Jolt lower case function to that data. We then use a modify-default-beta operation that applies default values to the in_reply_to fields if those values are null. Finally, we use a basic default operation to create the new flow_file_id field, applying expression language on the right of the field name to dynamically create the flow file id entry.
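As a rough sketch of the end result, the transformed payload for the tweet above would look something like the following (the tweet text is abbreviated here, and the flow_file_id value is illustrative; an actual run substitutes the flow file's uuid):

{
  "tweet_id": 846918283102081024,
  "tweet_text": "csub falls to georgia tech 76-61 in nit semifinal game. ...",
  "in_reply_to_status_id": 0,
  "in_reply_to_status_id_str": "",
  "in_reply_to_user_id": "",
  "in_reply_to_user_id_str": 0,
  "in_reply_to_screen_name": "",
  "flow_file_id": "1d2ef3a4-5b6c-7d8e-9f01-234567890abc"
}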
JoltTransformJson Advanced UI with Chain Specification
New Test Attributes Modal for testing Expression Language used in Specifications
The Advanced UI (shown above) has also been enhanced to allow testing of specifications that use expression language (specifically, by letting users supply test attributes to be resolved during testing). This gives users greater insight into how a flow will behave without relying on any external dependencies such as flow file attributes or variable registry entries.
Example of Transformed JSON (shown in Provenance)
Looking to give this a try? Feel free to download the example template on GitHub Gist here and import it into NiFi. The template includes the specification described above, which you can tweak to test out various scenarios. Also, if you have any questions about transforming JSON in Apache NiFi with Jolt, please comment below or reach out to the community on the Apache NiFi mailing list.
04-28-2017
06:24 PM
@John Daniels I haven't seen a problem like this personally, but one thing I'd suggest is to ensure that the permissions on that file allow the ambari user to read it. The other thing I'd try is actually reading that file (which is text/xml based) to make sure it's not corrupted in any way.
01-23-2017
04:51 PM
Glad that worked! Concerning group permissions, that is definitely a known issue; I don't believe there's a public work ticket that you can follow.
01-20-2017
07:33 PM
Hi @Oliver Fletcher! Great work making it this far. Ok, here's the challenge: unfortunately, right now the Ranger-NiFi plugin doesn't support groups in Ranger. This is a known issue and I believe there is work pending to address it. I see you do have a user entry of oliver; however, is the username set to oliver@NIFI.LOCAL? Based on your logs that is what NiFi is expecting to find.
01-19-2017
02:47 PM
Lastly, concerning the policies defined: if you could post a screenshot of what you have defined, that would be helpful for me to troubleshoot as well.
01-19-2017
02:45 PM
Another thought on Solr: it actually lives behind the scenes of Ambari Infra. If you enabled auditing for the Ranger-NiFi plugin, it should have populated the configuration to use the Solr behind Ambari Infra for logging (I believe it populates those values by default). If you could post what you have configured for the ranger-nifi-audit properties, that would make it easier for me to determine for sure.
01-19-2017
02:26 PM
1 Kudo
Ok, good progress so far! One thing that stands out is the Owner for Certificate (DN) used by Ranger. The NiFi log posted appears to show that "CN=ranger-1, OU=Nifi, O=GR, L=London, ST=Unknown, C=Unknown" doesn't have access. I'm assuming that is the actual DN of the certificate used by Ranger. However, in the ranger-nifi-plugin-properties section the Owner for Certificate value appears as "CN=ranger-1, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown". NiFi is expecting to identify and authorize Ranger by that value, but it doesn't appear that this is the actual owner info. You should be able to update to the correct value using Ambari. So I suggest changing owner.for.certificate in ranger-nifi-plugin-properties to match the actual value "CN=ranger-1, OU=Nifi, O=GR, L=London, ST=Unknown, C=Unknown" as described in Part 2, Step 3 i) of the community document. Just update that one field, save the configuration, and restart NiFi. Behind the scenes the authorizers.xml configuration file for NiFi should be updated with the values for Ranger Admin Identity, and that's what NiFi will use to identify Ranger when it attempts communication.
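If you want to double-check the certificate's actual owner DN before editing the property, keytool can print it; a quick sketch (the keystore path is an assumption; use the path of the keystore you created for Ranger):

keytool -list -v -keystore /etc/security/ranger-certs/keystore.jks | grep 'Owner:'
# The printed Owner line is the DN that owner.for.certificate must match exactly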
01-18-2017
10:21 PM
Going back to one of your earlier responses, I think you said you saw two entries in your NiFi truststore? I don't think you needed to clear those out; having two entries that aren't duplicates shouldn't be a problem. The first entry may have been for that specific node or the root CA if you used the toolkit (I will need to research to check). The second would be your Ranger cert.
01-18-2017
09:14 PM
The ranger-nifi-plugin-properties section is actually used to configure the NiFi service repository in Ranger (the snapshot shown in Part 3, Step 2). Those settings help Ranger reach a secured NiFi in order to look up the available rest endpoints that can be secured. When users initially enable the plugin in Ambari, update those values, and choose to restart NiFi, Ambari will actually create the service repo populated with those values. The current challenge is when the Ranger plugin is enabled first without SSL settings: if a user goes back to add settings for SSL via Ambari, unfortunately the api in Ranger doesn't support updating those fields from Ambari (which is why I suggested checking those settings directly in Ranger). I believe this is a known issue that has been logged (I'll confirm though).
The ranger-nifi-policymgr-ssl section contains the settings that live on the NiFi host (in a java credential file) which NiFi uses to talk to Ranger in order to retrieve the policies that were configured and store them in its local cache. Usually any issues with NiFi attempting to communicate with Ranger appear in the nifi-app.log. Also, in Ranger you'll be able to see whether the particular node connected or not from the Audit/Plugin tab. I hope this makes sense. I'll review the document as well to see if I can make this a bit clearer.
01-18-2017
08:46 PM
I think you did this for the keystores created for NiFi, but I just wanted to check: are both the keystore password and the key passwords the same value for the key/truststores created for Ranger?
01-18-2017
08:25 PM
Another thing I'd suggest is to confirm that both the keystore and truststore that you've created for Ranger to use are accessible. I would manually run a keytool -list command, e.g.:

keytool -list -v -keystore /etc/security/ranger-certs/keystore.jks

using the password you used to create the files. I'd run it on both the truststore and the keystore to confirm they are configured properly.
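For completeness, a minimal sketch covering both stores (the truststore filename is an assumption; use whatever you created for Ranger):

keytool -list -v -keystore /etc/security/ranger-certs/keystore.jks
keytool -list -v -keystore /etc/security/ranger-certs/truststore.jks
# Each command prompts for that store's password; a successful listing confirms
# the file is readable and the password is correct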
01-18-2017
08:16 PM
To add, I'm concerned that the settings you need for Ranger to communicate securely with NiFi are not in place. Referring to https://community.hortonworks.com/articles/60001/hdf-20-integrating-secured-nifi-with-secured-range.html, if you go to section 3, please confirm that you see the entries described in steps 1 & 2. If not, you can enter the information directly. Unfortunately, Ranger doesn't currently allow us to update that setting through Ambari after it's initially created using Ambari.
01-18-2017
08:09 PM
1 Kudo
Hi @Oliver Fletcher, What configuration do you have for the ranger-nifi-plugin-properties? Also, in which logs did you see this error (Ranger or NiFi)?
11-22-2016
07:34 PM
Awesome @Mark Nguyen, glad that worked out!
11-22-2016
02:28 AM
Ok, looking at that exception I also see the "InvalidLoginCredential" exception, which is related to NiFi determining that the credentials you provided are invalid. I'm guessing you've confirmed your credentials, but just in case, please confirm that they are valid against the AD you are pointing to in the login-identity-providers.xml. Also, I'd recommend checking that the User Search Base and User Search Filter you are using are appropriate for your AD setup. Here is an article providing details on ldap setup just in case: https://community.hortonworks.com/articles/7341/nifi-user-authentication-with-ldap.html
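One quick way to test the bind credentials outside of NiFi is with ldapsearch; a minimal sketch (the host, bind DN, search base, and username are placeholders for your AD values):

ldapsearch -H ldap://ad.example.com:389 \
  -D "binduser@example.com" -W \
  -b "OU=Users,DC=example,DC=com" "(sAMAccountName=testuser)"
# A successful bind and non-empty result confirms the manager credentials
# and that the search base/filter actually cover the user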
11-21-2016
10:15 PM
Hi @Mark Nguyen, At the top of the exception stack it reads: 2016-11-21 21:13:46,548 INFO [NiFi Web Server-20] o.a.n.w.a.c.IllegalArgumentExceptionMapper java.lang.IllegalArgumentException: The supplied username and password are not valid. Did you validate that the credentials you set for the ldap provider in the login-identity-providers.xml file are accurate?
11-07-2016
02:17 AM
@srini one thing you could try: instead of using that one attribute to bucket on, create another attribute that denormalizes all of the attributes into one string, and use that column to bucket on (still leaving the other attributes in place). When you have duplicate columns, this would lead to those dupes being bucketed under that one column in the subsequent operation. Then the rest of the operations would pick the unique one and then remove the denormalized column. It's a bit of a dance, but I think it could work; see the sketch below. Does that make sense? One thing I think would be good to get on the radar is upgrading Jolt in NiFi, perhaps once the modify feature graduates from beta. I think that will help to simplify some of the hoops needed for this type of work.
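A rough sketch of the denormalization step using modify-overwrite-beta's concat function (the field names field_a/field_b are hypothetical, and I haven't tested this exact spec):

[
  { "operation": "modify-overwrite-beta", "spec": { "*": { "denorm": "=concat(@(1,field_a),'|',@(1,field_b))" } } }
]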
11-07-2016
02:09 AM
1 Kudo
Hi @srini, NiFi does support the addition of custom transformations, which can be referenced with a "drop in" jar. You could create your own transformation class, which extends the Jolt library, that provides the functionality you need, and make it available to NiFi on the file system. Given that this is a new function in a later version of Jolt, I would be careful using it as a custom jar; that may cause a conflict, but I haven't tested this myself to be sure. Below are some use cases on how to apply custom functions in NiFi.
Case 1 – Custom Transform Selected
In this case, if the Custom option of Transform is selected, then 1) a Custom Transform Class Name should be entered and 2) one or more module paths should be provided. The Custom Transform Class Name should be a fully qualified classname (e.g. eu.zacheusz.jolt.date.Dater). The Module Path can take a comma delimited list of directory locations or one or more jar files. Once these fields are populated, the Advanced view will support validation and saving of the specification. A user can switch between transformation types in the UI, but not custom class names & module paths.
Case 2 – Chain Transformation Selected with Custom Transformations embedded
In this case, if you want to use one or more transforms (including custom transformations) in your specification, then the Chainr spec option can be used with one or more module paths provided. In this case the Custom Transform Class Name property is not required and would be ignored (since one or more custom transformations could be invoked in the spec; see the sketch below). As in Case 1, the Advanced view will support validation and saving of the specification if the required fields are populated. I hope this helps! Please let me know if you have any questions.
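To illustrate Case 2, a chain spec can invoke a custom transform by using its fully qualified class name as the operation; a minimal sketch (the Dater class is the example mentioned above, and its spec contents here are purely hypothetical):

[
  { "operation": "shift", "spec": { "created_at": "raw_date" } },
  { "operation": "eu.zacheusz.jolt.date.Dater", "spec": { "raw_date": "formatted_date" } }
]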
11-04-2016
09:31 PM
3 Kudos
Hi @srini, Try the following with the Chain specification type selected in the processor:

[
  { "operation": "shift", "spec": { "*": { "id": "[&1].id" } } },
  { "operation": "shift", "spec": { "*": "@id[]" } },
  { "operation": "cardinality", "spec": { "*": { "@": "ONE" } } },
  { "operation": "shift", "spec": { "*": { "id": "[].id" } } }
]

What I did was add a bucket shift operation which sorts identical json entries into "buckets", used cardinality to select just one entry, and then another shift for the output you need. Please let me know if this works for you.
10-11-2016
12:44 PM
15 Kudos
Credits to @mgilman for contributing to this article:
Since the introduction of NiFi 1.0.0, administrators have a greater ability to manage policies through the addition of Ranger integration and a more granular authorization model. This article provides a guide for those looking to define and manage NiFi policies in Ranger. To learn more about configuring NiFi to use Ranger via Ambari, please review the parent article HDF 2.0 - Integrating Secured NiFi with Secured Ranger for Authorization Management.
Behind the scenes NiFi uses a REST based API for all user interaction; therefore resource-based policies are used in Ranger to define users' level of permission when executing calls against these REST endpoints via NiFi's UI. This allows administrators to define policies by selecting a NiFi resource/endpoint, choosing whether users have Read or Write (Modify) permissions to that resource, and selecting the users who will be granted the configured permission. For example, the image below shows a policy in Ranger where a user is granted the ability to view flows in NiFi's interface. This was configured by selecting /flow as the authorized resource and granting the selected user the Read permission to that resource.
Example of Global Level Policy Definition with Kerberos User Principal
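For reference, the policy shown above could equivalently be created through Ranger's public REST API; a rough sketch of the request (the service name, user principal, and Ranger host are placeholders, and the exact payload shape should be checked against your Ranger version):

POST http://ranger-host:6080/service/public/v2/api/policy
{
  "service": "nifi",
  "name": "view-the-user-interface",
  "resources": {
    "nifi-resource": { "values": ["/flow"], "isExcludes": false, "isRecursive": false }
  },
  "policyItems": [
    { "users": ["nifi-admin@EXAMPLE.COM"], "accesses": [ { "type": "READ", "isAllowed": true } ] }
  ]
}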
Policies can be created that will apply authorizations to features at a global level or on a specific component level in NiFi. The following describes the policies that can be defined in Ranger using a combination of the indicated NiFi Resource and Permission (Read or Write).
Global Policies:
Policy | Privilege | NiFi Resource | Permission(s)
---|---|---|---
View the user interface | Allows users to view the user interface | /flow | Read
Access the controller | Allows users to view/modify the controller, including Reporting Tasks, Controller Services, and clustering endpoints. Explicit access to reporting tasks and controller services can be overridden | /controller | Read (for View), Write (for Modify)
Query Provenance | Allows users to submit a Provenance Search and request Event Lineage. Access to actual provenance events or lineage will be based on the data policies of the components that generated the events. This simply allows the user to submit the query. | /provenance | Read
Access Users/User Groups | Allows users to view/modify users and user groups | /tenants | Read (View Users/Groups), Write (Modify Users/Groups)
Retrieve Site-To-Site Details | This policy should be granted to other NiFi systems (or Site-To-Site clients) in order to retrieve the listing of available ports (and peers when using HTTP as the transport protocol). Explicit access to individual ports is still required to see and initiate Site-To-Site data transfer. | /site-to-site | Read (Allow retrieval of data)
View System Diagnostics | This policy should be granted in order to retrieve system diagnostic details including JVM memory usage, garbage collection, system load, and disk usage. | /system | Read
Proxy User Requests | This policy should be granted to any proxy sitting in front of NiFi or any node in the cluster that will be issuing requests on a user's behalf. | /proxy | Write (granted to Node Users defined in Ranger)
Access Counters | This policy should be granted to users to retrieve and reset counters. This policy is separate from each individual component, as the counters can also be rolled up according to type. | /counters | Read (Read counter information), Write (Reset Counters)
Note: Setting authorizations on the /policy resource is not applicable when using Ranger, since NiFi's policy UI is disabled when Ranger authorization is enabled.
Component Policies
Component Level policies can be set in Ranger on individual components of a flow within a process group, or on an entire process group (with the root process group being the top level for all flows). Most component types (except for connections) can have a policy applied directly to them. For example, the image below demonstrates a policy defined for a specific processor instance (noted by the unique identifier included in the resource url) which grants Read and Modify permissions to the selected user.
Example of Component Level Policy for Kerberos User Principal
If no policy is available on the specific component then it will look to the parent process group for policy information. Below are the available resources for components where a specific policy can be applied to an instance of that component. Detailed information on component descriptions can be found in the NiFi Documentation.
Component Type | Resource (Rest API) | Description (from NiFi Documentation)
---|---|---
Controller Service | /controller-services | Extension point that provides centralized access to data/configuration information for other components in a flow
Funnel | /funnels | Combines data from several connections into one connection
Input Port | /input-ports | Used to receive data from other data flow components
Label | /labels | Documentation for a flow
Output Port | /output-ports | Used to send data to other data flow components
Processor | /processors | NiFi component that pulls data from or publishes data to external sources, or routes, transforms, or extracts information from flow files
Process Group | /process-groups | An abstract context used to group multiple components (processors) to create a sub flow. Paired with input and output ports, process groups can be used to simplify complex flows into logical flows
Reporting Task | /reporting-tasks | Runs in the background and provides reporting data on the NiFi instance
Template | /templates | Represents a predefined dataflow available for reuse within NiFi. Can be imported and exported
The following table describes the types of policies that can be applied to the previously mentioned components. Note: UUID is the unique identifier of an individual component within the flow.
Policy | Description | REST API
---|---|---
Read or Update Component | This policy should be granted to users for retrieving component configuration details and modifying the component. | Read/Write on: /{resource}/{uuid} (e.g. /processors/{uuid})
View Component Data or Allow Emptying of Queues and Replaying | This policy should be granted to users for retrieving or modifying data from a component. Retrieving data encompasses listing of downstream queues and provenance events. Modifying data encompasses emptying of downstream queues and replay of provenance events. Additionally, data specific endpoints require that every link in the request chain is authorized with this policy; since they will be traversing each link, we need to ensure that each proxy is authorized to have the data. | Read/Write on: /data/{resource}/{uuid}
Write Receive Data, Write Send Data | These policies should be granted to other NiFi instances and Site-To-Site clients that will be sending/receiving data from the specified port. Once a client has been added to a port specific Site-To-Site policy, that client will be able to retrieve details about this port and initiate a data transfer. Additionally, data specific endpoints require that every link in the request chain is authorized with this policy; since they will be traversing each link, we need to ensure that each proxy is authorized to have the data. | Write on: /data-transfer/input-ports/{uuid}, /data-transfer/output-ports/{uuid}
For more information on authorization configuration with Ranger and NiFi, please review:
- http://bryanbende.com/development/2016/08/22/apache-nifi-1.0.0-using-the-apache-ranger-authorizer
- https://community.hortonworks.com/articles/57980/hdf-20-apache-nifi-integration-with-apache-ambarir.html