Member since
03-01-2016
45
Posts
77
Kudos Received
9
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 740 | 01-19-2018 10:46 PM |
| 1914 | 01-18-2017 08:09 PM |
| 1696 | 11-21-2016 10:15 PM |
| 1199 | 11-07-2016 02:09 AM |
| 1160 | 11-04-2016 09:31 PM |
07-13-2018
02:31 AM
@sunile.manjee I was looking at the Ambari code below, specific to metric alerts, and it seems that the default_port value is not used the way it is in other alert types: https://github.com/apache/ambari/blob/branch-2.6/ambari-agent/src/main/python/ambari_agent/alerts/metric_alert.py#L87-L110. The function used to extract the uri appears to use the default port only if the uri isn't defined: https://github.com/apache/ambari/blob/branch-2.6/ambari-agent/src/main/python/ambari_agent/alerts/base_alert.py#L331-L389. Also, in other examples I've seen, the uri is parameterized with a value from the Ambari configuration, and when looking at that configuration the port value is actually included with the uri rather than separated out. See the example in the "Create Custom Alert JSON file" section here: https://community.hortonworks.com/articles/143762/how-to-create-a-custom-ambari-alert-and-use-it-for.html. It may be worthwhile to try including the port value with the uri.
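To illustrate the suggestion, a metric alert definition whose parameterized uri resolves to a host:port value (so default_port is never needed) might look roughly like the fragment below. The service and property names here are purely hypothetical placeholders, not taken from a real alert definition:

```json
{
  "name": "hypothetical_metric_alert",
  "label": "Hypothetical Metric Alert",
  "source": {
    "type": "METRIC",
    "uri": {
      "http": "{{my-service-site/my.service.http.address}}",
      "https": "{{my-service-site/my.service.https.address}}"
    }
  }
}
```

Here the configuration value behind my.service.http.address would already contain host:port (e.g. myhost.example.com:8745), sidestepping the default_port handling entirely.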
04-23-2018
01:18 PM
1 Kudo
Hi @Leo Gallucci, I know there have been conversations in the Apache NiFi community around Kubernetes, and within HDF engineering we are definitely focused on it. I don't have a timeline, but we are looking to support NiFi on Kubernetes in a release very soon.
04-20-2018
08:20 PM
1 Kudo
Hi @Raj ji, You can try uninstalling the mpack with the following command:
ambari-server uninstall-mpack --mpack-name=hdf-ambari-mpack --verbose
And then installing the appropriate management pack:
ambari-server install-mpack --mpack=hdf-ambari-mpack-3.0.2.0-76.tar.gz --verbose
Make sure to stop Ambari first and start it again after you've installed the correct mpack.
01-20-2018
07:12 PM
Thanks @bsaini, I've made the correction. Glad this worked for you!
01-19-2018
10:46 PM
1 Kudo
Hi @Changwon Choi, As @Jay Kumar SenSharma mentioned, the important part was updating the base URL first. Without that update, Ambari downloaded the older version of the registry (and other components). You can check this by doing the following on each host that has registry components installed:
1) Confirm that the correct version of registry was installed:
yum list installed | grep registry
This command should show the installed packages for registry, such as:
registry_3_0_2_0_76
If that version is not in the list, perform the following on each host missing the correct version (if it is installed, skip to step 2):
a) Verify that /etc/yum.repos.d/ambari-hdp-*.repo has the updated HDF base URL (* denotes 1 or 2 depending on your setup). If it does not, you will need to manually edit this file to set the correct base URL for HDF.
b) Install the newest package:
yum install -y registry_3_0_2_0_76*
c) Ensure the stack recognizes the latest version:
hdf-select set registry 3.0.2.0-76
d) Continue to step 2 to check that the stack has the appropriate version selected.
2) Check the version of the component recognized by the stack:
hdf-select status | grep registry
This should show registry as:
registry - 3.0.2.0-76
If the version is not accurate (e.g. showing 3.0.0.0-453), perform step 1c above and check the version again.
Once the above is performed on all affected hosts, return to Ambari and restart Registry. Note that the same steps can be used to manually correct other HDF components (e.g. NiFi and SAM) as well. Please let me know if this helps!
11-07-2017
10:26 PM
Yes that is correct, I apologize for not being clear. The Ambari code in the HDF management pack that manages the NiFi service has a reliance on an Ambari configuration that is no longer available in Ambari 2.6.0.
11-07-2017
06:56 PM
Yes the NiFi service has a reliance on an Ambari configuration that is no longer available in Ambari 2.6.0. The upcoming release removes that dependency.
11-06-2017
04:25 PM
The problem you experienced is because the current version of HDF 3.0.1.1 is only compatible with Ambari 2.5.1 (see the Component Availability Matrix here https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.0.1.1/bk_release-notes/content/ch_hdf_relnotes.html#repo-location). There is an upcoming release of HDF that will address this problem for 2.6.0.
11-06-2017
03:53 PM
Hi Raymond, what version of Ambari is your cluster running?
06-15-2017
10:49 PM
@Johny Travolta Hello! How many nodes do you have in your cluster? If you have more than one (and you didn't set up NiFi using Ambari), I would check to make sure that the login-identity-providers.xml file is the same throughout (and has the correct settings). With Ambari (HDF), nodes should be configured with the same settings. My hunch is that if you have a clustered environment, one node is configured properly and one or more others may not be, so when a user logs in, the REST call behind the scenes may be routed to an improperly configured node. That could explain why some logins work and some don't. If you have just one node, I would check your user search base settings and whether some users may fall outside the LDAP search. I hope this helps!
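For reference, the LDAP provider entry in login-identity-providers.xml looks roughly like the sketch below; the URL, DNs, and search settings are illustrative placeholders, so substitute the values for your own directory:

```xml
<provider>
    <identifier>ldap-provider</identifier>
    <class>org.apache.nifi.ldap.LdapProvider</class>
    <property name="Authentication Strategy">SIMPLE</property>
    <property name="Manager DN">cn=admin,dc=example,dc=com</property>
    <property name="Manager Password">changeit</property>
    <property name="Url">ldap://ldap.example.com:389</property>
    <property name="User Search Base">ou=people,dc=example,dc=com</property>
    <property name="User Search Filter">uid={0}</property>
</provider>
```

In a cluster, every node's copy of this file should contain the same provider settings.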
06-02-2017
03:12 PM
4 Kudos
With the latest release of Apache NiFi 1.2.0 the JoltTransformJson Processor
became a bit more powerful with an upgrade to the Jolt library (to version 0.1.0)
and the introduction of expression language (EL) support. This now provides users the ability to create
dynamic specifications for JSON transformation and to perform some data
manipulation tasks all within the context of the processor. Internal caching
has also been added to improve overall performance.
Let’s take the example of transforming the Twitter JSON payload
seen below:
{"created_at":"Wed Mar 29 02:53:48 +0000 2017","id":846918283102081024,"id_str":"846918283102081024","text":"CSUB falls to Georgia Tech 76-61 in NIT semifinal game. @Bakersfieldcali @BVarsityLive @CSUBAthletics @CSUB_MBB\u2026 https:\/\/t.co\/9e5dQesIbg","display_text_range":[0,140],"source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":true,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":2918922812,"id_str":"2918922812","name":"Felix Adamo","screen_name":"tbcpix","location":"Bakersfield Californian","url":null,"description":"Newspaper Photographer","protected":false,"verified":false,"followers_count":677,"friends_count":247,"listed_count":12,"favourites_count":1366,"statuses_count":3576,"created_at":"Thu Dec 04 18:46:27 +0000 2014","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/570251877397180416\/jL2kuB4f_normal.png","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/570251877397180416\/jL2kuB4f_normal.png","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/2918922812\/1483041284","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"extended_tweet":{"full_text":"
CSUB falls to Georgia Tech 76-61 in NIT semifinal game. @Bakersfieldcali @BVarsityLive @CSUBAthletics @CSUB_MBB @csubnews https:\/\/t.co\/yV2AHFdVLc","display_text_range":[0,121],"entities":{"hashtags":[],"urls":[],"user_mentions":[{"screen_name":"Bakersfieldcali","name":"The Bakersfield Cali","id":33055408,"id_str":"33055408","indices":[56,72]},{"screen_name":"BVarsityLive","name":"BVarsityLive","id":762418351,"id_str":"762418351","indices":[73,86]},{"screen_name":"CSUBAthletics","name":"CSUB Athletics","id":51115996,"id_str":"51115996","indices":[87,101]},{"screen_name":"CSUB_MBB","name":"\ud83c\udfc0CSUB Men's Hoops\ud83c\udfc0","id":2897931481,"id_str":"2897931481","indices":[102,111]},{"screen_name":"csubnews","name":"CSU Bakersfield","id":209666415,"id_str":"209666415","indices":[112,121]}],"symbols":[],"media":[{"id":846918121248047104,"id_str":"846918121248047104","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"medium":{"w":1200,"h":608,"resize":"fit"},"large":{"w":2048,"h":1038,"resize":"fit"},"small":{"w":680,"h":345,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}},{"id":846918179397906433,"id_str":"846918179397906433","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"large":{"w":2048,"h":1213,"resize":"fit"},"medium":{"w":1200,"h":711,"resize":"fit"},"small":{"w":680,"h":403,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}}]},"extended_entities":{"medi
a":[{"id":846918121248047104,"id_str":"846918121248047104","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8Dbi0rUwAAiffu.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"medium":{"w":1200,"h":608,"resize":"fit"},"large":{"w":2048,"h":1038,"resize":"fit"},"small":{"w":680,"h":345,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}},{"id":846918179397906433,"id_str":"846918179397906433","indices":[122,145],"media_url":"http:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","media_url_https":"https:\/\/pbs.twimg.com\/media\/C8DbmNTVMAEvpd3.jpg","url":"https:\/\/t.co\/yV2AHFdVLc","display_url":"pic.twitter.com\/yV2AHFdVLc","expanded_url":"https:\/\/twitter.com\/tbcpix\/status\/846918283102081024\/photo\/1","type":"photo","sizes":{"large":{"w":2048,"h":1213,"resize":"fit"},"medium":{"w":1200,"h":711,"resize":"fit"},"small":{"w":680,"h":403,"resize":"fit"},"thumb":{"w":150,"h":150,"resize":"crop"}}}]}},"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/9e5dQesIbg","expanded_url":"https:\/\/twitter.com\/i\/web\/status\/846918283102081024","display_url":"twitter.com\/i\/web\/status\/8\u2026","indices":[113,136]}],"user_mentions":[{"screen_name":"Bakersfieldcali","name":"The Bakersfield Cali","id":33055408,"id_str":"33055408","indices":[56,72]},{"screen_name":"BVarsityLive","name":"BVarsityLive","id":762418351,"id_str":"762418351","indices":[73,86]},{"screen_name":"CSUBAthletics","name":"CSUB Athletics","id":51115996,"id_str":"51115996","indices":[87,101]},{"screen_name":"CSUB_MBB","name":"\ud83c\udfc0CSUB Men's 
Hoops\ud83c\udfc0","id":2897931481,"id_str":"2897931481","indices":[102,111]}],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1490756028329"}
In our case we want to accomplish several things when transforming this data in JoltTransformJson:
- Create a subset of the JSON data that contains id, tweet text, the in_reply_to fields, and a new flow_file_id field
- Match the “id” field in the Twitter payload based on a flow file variable and convert it to a new label (tweet_id)
- Set the tweet text to all lower case
- Set default values for in_reply_to fields that are null
- Add the flow file's unique id to the JSON data
Once the data has been transformed it will land on the file system as well as within a Mongo db repository.
Basic Flow of Twitter Data Transformation and Storage
Here's a close up of the specification in use:
[{
  "operation": "shift",
  "spec": {
    "${id.var}": "tweet_id",
    "text": "tweet_text",
    "in_reply_to_*": "&"
  }
}, {
  "operation": "modify-overwrite-beta",
  "spec": {
    "tweet_text": "=toLower"
  }
}, {
  "operation": "modify-default-beta",
  "spec": {
    "~in_reply_to_status_id": 0,
    "~in_reply_to_status_id_str": "",
    "~in_reply_to_user_id": "",
    "~in_reply_to_user_id_str": 0,
    "~in_reply_to_screen_name": ""
  }
}, {
  "operation": "default",
  "spec": {
    "flow_file_id": "${uuid}"
  }
}]
In the above you’ll see we’ve accomplished this with a chain specification containing four operations (shift, modify-overwrite, modify-default, and default). The shift helps define the fields needed for the final schema and translates those fields into new labels. Note that the shift specification uses expression language on the left side (${id.var}) that will evaluate to a value populated by the UpdateAttribute processor (this value could also be populated from the Variable Registry). The Jolt library will then attempt to match that value to the corresponding label in the incoming JSON data and change it to the new label (in this case “tweet_id”) on the right.
The next operation uses modify-overwrite to ensure that the Jolt lower-case function is applied to all incoming tweet text. We then use a modify-default operation that applies default values to the in_reply_to fields if those values are null. Finally, we use a basic default operation to create the new flow_file_id field by applying expression language on the right of the field name to dynamically create the flow file id entry.
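To make the intent of the four-step chain concrete, here is a minimal Python sketch that approximates what the spec does to a tweet. This is not the Jolt library itself (Jolt is a Java library and its null/missing-key semantics differ in the details), just an illustration of the shift, lower-case, default-if-null, and add-uuid steps:

```python
import uuid

def transform_tweet(tweet, id_label="id", flow_file_id=None):
    """Approximate the four-operation Jolt chain spec in plain Python."""
    # "shift": pick the fields we want and rename them
    out = {
        "tweet_id": tweet.get(id_label),
        "tweet_text": tweet.get("text"),
    }
    for key, value in tweet.items():
        if key.startswith("in_reply_to_"):
            out[key] = value
    # "modify-overwrite-beta": lower-case the tweet text
    if isinstance(out["tweet_text"], str):
        out["tweet_text"] = out["tweet_text"].lower()
    # "modify-default-beta": replace null reply fields with defaults
    defaults = {
        "in_reply_to_status_id": 0,
        "in_reply_to_status_id_str": "",
        "in_reply_to_user_id": "",
        "in_reply_to_user_id_str": 0,
        "in_reply_to_screen_name": "",
    }
    for key, default in defaults.items():
        if out.get(key) is None:
            out[key] = default
    # "default": add the flow file id (NiFi would supply ${uuid})
    out["flow_file_id"] = flow_file_id or str(uuid.uuid4())
    return out

tweet = {
    "id": 846918283102081024,
    "text": "CSUB falls to Georgia Tech 76-61 in NIT semifinal game.",
    "in_reply_to_status_id": None,
    "in_reply_to_screen_name": None,
}
result = transform_tweet(tweet, flow_file_id="abc-123")
```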
JoltTransformJson Advanced UI with Chain Specification
New Test Attributes Modal for testing Expression Language used in Specifications
The Advanced UI (shown above) has also been enhanced to allow testing of
specifications with expression language (specifically to provide test
attributes that need to be resolved during testing). This gives users greater insight into how a
flow will behave without relying on any external dependencies such as flow file
attributes or variable registry entries.
Example of Transformed JSON (shown in Provenance)
Looking to give this a try? Feel free to download the example template on GitHub Gist here and import it into NiFi. The template includes the specification described above, which you can tweak to test out various scenarios. Also, if you have any questions about transforming JSON in Apache NiFi with Jolt, please comment below or reach out to the community on the Apache NiFi mailing list.
- Find more articles tagged with:
- Data Ingestion & Streaming
- How-To/Tutorial
- jolt
- jolttransformjson
- json
- NiFi
04-28-2017
06:24 PM
@John Daniels I haven't seen a problem like this personally but one thing I'd suggest is to ensure that the permissions on that file allow the ambari user to read the file. The other thing I'd try is actually reading that file (which is text/xml based) to make sure it's not corrupted in any way.
01-23-2017
04:51 PM
Glad that worked! Concerning group permissions, that's definitely a known issue; I don't believe there's a public work ticket that you can follow.
01-20-2017
07:33 PM
Hi @Oliver Fletcher! Great work making it this far. Ok, here's the challenge: unfortunately, right now the Ranger-NiFi plugin doesn't support groups in Ranger. This is a known issue, and I believe there is work pending to address it. I see you do have a user entry of oliver; however, is the username set to oliver@NIFI.LOCAL? Based on your logs, that is what NiFi is expecting to find.
01-19-2017
02:47 PM
Lastly concerning the policies defined. If you could post a screen shot of what you have defined that would be helpful for me to troubleshoot as well.
01-19-2017
02:45 PM
Another thought on Solr: it actually lives behind the scenes of Ambari Infra. If you enabled auditing for the Ranger-NiFi plugin, it should have populated the configuration to use the Solr behind Ambari Infra for logging (I believe it populates those values by default). If you could post what you have configured for the ranger-nifi-audit properties, that would make it easier for me to determine for sure.
01-19-2017
02:26 PM
1 Kudo
Ok, good progress so far! One thing that stands out is the Owner for Certificate (DN) used by Ranger. The NiFi log posted appears to show that "CN=ranger-1, OU=Nifi, O=GR, L=London, ST=Unknown, C=Unknown" doesn't have access. I'm assuming that is the actual DN of the certificate used by Ranger. However, in the ranger-nifi-plugin-properties section, the Owner for Certificate value appears as "CN=ranger-1, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown". NiFi is expecting to identify and authorize Ranger by that value, but it doesn't appear to be the actual owner info. You should be able to update it to the correct value using Ambari. So I suggest changing owner.for.certificate in ranger-nifi-plugin-properties to match the actual value "CN=ranger-1, OU=Nifi, O=GR, L=London, ST=Unknown, C=Unknown", as described in Part 2, Step 3 i) of the community document. Just update that one field, save the configuration, and restart NiFi. Behind the scenes, the authorizers.xml configuration file for NiFi should be updated with the value for Ranger Admin Identity, and that's what NiFi will use to identify Ranger when it attempts communication.
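For reference, once the restart completes, the Ranger authorizer entry in NiFi's authorizers.xml should contain the DN roughly as shown in the sketch below (the surrounding properties of the authorizer are omitted here, and the exact set varies by HDF version):

```xml
<authorizer>
    <identifier>ranger-provider</identifier>
    <class>org.apache.nifi.ranger.authorization.RangerNiFiAuthorizer</class>
    <property name="Ranger Admin Identity">CN=ranger-1, OU=Nifi, O=GR, L=London, ST=Unknown, C=Unknown</property>
</authorizer>
```

If the Ranger Admin Identity value here doesn't exactly match the DN of the certificate Ranger presents, NiFi will reject the connection.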
01-18-2017
10:21 PM
Going back to one of your earlier responses, I think you said you saw two entries in your NiFi truststore? I don't think you needed to clear those out; having two entries that aren't duplicates shouldn't be a problem. The first entry may have been for that specific node or the root CA if you used the toolkit (I will need to research to check). The second would be your Ranger cert.
01-18-2017
09:14 PM
The ranger-nifi-plugin-properties section is actually used to configure the NiFi service repository in Ranger (the snapshot shown in Part 3, Step 2). Those settings help Ranger reach a secured NiFi in order to look up available REST endpoints that can be secured. When users initially enable the plugin in Ambari, update those values, and choose to restart NiFi, Ambari will create the service repo populated with those values. The current challenge is when the Ranger plugin is enabled first without SSL settings: if a user goes back to add SSL settings via Ambari, unfortunately the API in Ranger doesn't support updating those fields from Ambari (which is why I suggested checking those settings directly in Ranger). I believe this is a known issue that has been logged (I'll confirm, though). The ranger-nifi-policymgr-ssl section contains the settings that live on the NiFi host (in a Java credential file), which NiFi uses to talk to Ranger in order to retrieve the policies that were configured and store them in its local cache. Usually any issues with NiFi attempting to communicate with Ranger appear in nifi-app.log. Also, in Ranger you'll be able to see whether a particular node connected or not from the Audit/Plugin tab. I hope this makes sense. I'll review the document as well to see if I can make this a bit clearer.
01-18-2017
08:46 PM
I think you did this for the keystores created for NiFi, but I just wanted to check: are the keystore password and the key password the same value for the key/truststores created for Ranger?
01-18-2017
08:25 PM
Another thing I'd suggest is to confirm that both the keystore/truststore that you've created for Ranger to use are accessible. I would manually run a keytool -list command: e.g. keytool -list -v -keystore /etc/security/ranger-certs/keystore.jks using the password you used to create the files. I'd run it on both the truststore and the keystore to confirm they are configured properly.
01-18-2017
08:16 PM
To add I'm concerned that the settings you need for Ranger to communicate securely with NiFi are not in place. Referring to https://community.hortonworks.com/articles/60001/hdf-20-integrating-secured-nifi-with-secured-range.html if you go to section 3, please confirm that you see the entries described in step 1 & 2. If not you can enter the information directly. Unfortunately Ranger doesn't currently allow us to update that setting through Ambari after it's initially created using Ambari.
01-18-2017
08:09 PM
1 Kudo
Hi @Oliver Fletcher, What configuration do you have for the ranger_nifi_plugin_properties? Also, in which logs did you see this error (Ranger or NiFi)?
12-30-2016
03:23 PM
2 Kudos
One thing I'd suggest is using ExtractText or EvaluateJsonPath (if you have Json data) after the split to search for and extract the filename field for each fragment and place it in a new flow file attribute. You could then use MergeContent to merge (group) data based on the matching filename attribute (using the correlation attribute name property). Or you could route data based on filename attribute using RouteOnAttribute to some grouped location (e.g a directory for all truck A data). Or perhaps a combination of merge and routing.
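To illustrate the grouping idea, here is a small Python sketch of what MergeContent's Correlation Attribute Name property effectively does: fragments carrying the same attribute value end up in the same bin. The flow file structure and filenames are made up for the example; this is not NiFi code:

```python
from collections import defaultdict

def group_by_attribute(flow_files, attribute="filename"):
    """Bin flow file fragments by a shared attribute value, the way
    MergeContent correlates fragments on one attribute."""
    bins = defaultdict(list)
    for ff in flow_files:
        bins[ff["attributes"].get(attribute)].append(ff["content"])
    return dict(bins)

fragments = [
    {"attributes": {"filename": "truckA.csv"}, "content": "row1"},
    {"attributes": {"filename": "truckB.csv"}, "content": "row2"},
    {"attributes": {"filename": "truckA.csv"}, "content": "row3"},
]
grouped = group_by_attribute(fragments)
```

RouteOnAttribute works the same way conceptually, except each bin would be routed to its own relationship instead of merged.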
12-30-2016
12:26 PM
1 Kudo
Hi @Sanaz Janbakhsh, What are you attempting to do with the fragments created from the split? Using SplitContent should produce individual flow files on your flow that contain each fragment, which can be processed in the next step (without having to loop through fragments). If you require a loop to process content within a fragment you can try using ExecuteScript to create a custom processor for your data. ExecuteScript supports multiple languages (including javascript and groovy) where you can process flow file content as needed.
11-22-2016
07:34 PM
Awesome @Mark Nguyen glad that worked out!
11-22-2016
02:28 AM
Ok looking at that exception I also see the "InvalidLoginCredential" exception that is related to NiFi determining that the credentials you provided are invalid. I'm guessing you've confirmed your credentials but just in case please confirm that your credentials are valid against the AD you are pointing to in the login-identity-providers.xml. Also I'd recommend checking that the User Search Base and User Search Filter you are using are appropriate for your AD setup. Here is an article providing details on ldap setup just in case: https://community.hortonworks.com/articles/7341/nifi-user-authentication-with-ldap.html
11-21-2016
10:15 PM
Hi @Mark Nguyen, At the top of the exception stack it reads: 2016-11-21 21:13:46,548 INFO [NiFi Web Server-20] o.a.n.w.a.c.IllegalArgumentExceptionMapper java.lang.IllegalArgumentException: The supplied username and password are not valid. Did you validate that the credentials you set for the LDAP provider in the login-identity-providers.xml file are accurate?
11-07-2016
02:17 AM
@srini one thing you could try: instead of using that one attribute to bucket on, create another attribute that denormalizes all of the attributes into one string, and use that column to bucket on (still leaving the other attributes in place). When you have duplicate columns, the dupes would be bucketed under that one column in the subsequent operation. The rest of the operations would then pick the unique one and remove the denormalized column. It's a bit of a dance, but I think it could work. Does that make sense? One thing I think would be good to get on the radar is upgrading Jolt in NiFi, perhaps once the modify feature graduates from beta. I think that will help simplify some of the hoops needed to do this type of work.
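The denormalize-and-bucket idea above can be sketched in a few lines of Python. This is not NiFi or Jolt code, just an illustration of the mechanism: build one composite string from all attribute values, bucket on it, and keep one record per bucket (the record names are invented for the example):

```python
def dedupe_by_composite_key(records):
    """Denormalize each record's attributes into one composite string,
    bucket on that string, and keep the first record per bucket."""
    buckets = {}
    for rec in records:
        composite = "|".join(f"{k}={rec[k]}" for k in sorted(rec))
        buckets.setdefault(composite, rec)  # first record per bucket wins
    return list(buckets.values())

records = [
    {"truck": "A", "driver": "sam", "route": 1},
    {"truck": "A", "driver": "sam", "route": 1},  # exact duplicate
    {"truck": "B", "driver": "ana", "route": 2},
]
unique = dedupe_by_composite_key(records)
```

The final step of dropping the composite key is implicit here since it was never stored on the records.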
11-07-2016
02:09 AM
1 Kudo
Hi @srini, NiFi does support the addition of custom transformations, which can be referenced with a "drop-in" jar. You could create your own transformation class, which extends the Jolt library, that provides the functionality you need and is made available to NiFi on the file system. Given that this is a new function in a later version of Jolt, I would be careful using it as a custom jar; that may cause a conflict, but I haven't tested this myself to be sure. Below are some use cases on how to apply custom functions in NiFi.
Case 1 – Custom Transform Selected
In this case if the Custom option of Transform is selected then 1) a Custom Transform Class Name should be entered and 2) one or more module paths should be provided. The Custom Transform Class Name should be a fully qualified classname (e.g. eu.zacheusz.jolt.date.Dater). The Module Path can take a comma delimited list of directory locations or one or more jar files. Once these fields are populated the Advanced view will support validation and saving of the specification. A user can switch between transformation types in the UI but not custom class names & module paths.
Case 2 – Chain Transformation Selected with Custom Transformations embedded
In this case, if you want to use one or more transforms (including custom transformations) in your specification, then the Chainr spec option can be used with one or more module paths provided. In this case the Custom Transform Class Name property is not required and would be ignored (since one or more custom transformations could be invoked in the spec). As in Case 1, the Advanced view will support validation and saving of the specification if the required fields are populated for this case. I hope this helps! Please let me know if you have any questions.