Member since 09-28-2015
24 Posts
19 Kudos Received
6 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3226 | 06-13-2016 04:15 PM |
| | 1245 | 06-13-2016 10:54 AM |
| | 963 | 06-12-2016 01:26 PM |
| | 1445 | 06-12-2016 03:44 AM |
| | 3791 | 02-03-2016 08:55 PM |
09-16-2016
03:30 AM
Great article!
09-11-2016
07:34 PM
Thanks @Bryan Bende, I think this is a great solution. I'm currently using two attributes to create directory structures dynamically in HDFS: /tmp/data_staging/${SourceAttribute}/${data.date}
I was planning to work around the lack of support for multiple correlation attributes by using a RouteOnAttribute processor to send files to separate MergeContent processors, then routing everything back to a single PutHDFS processor that uses the original attributes. Your suggestion gives me a much cleaner and more flexible solution. After combining the two attributes with an UpdateAttribute processor, I'll send the data to a MergeContent processor, where I'll bin the files on the new combined attribute, and then on to my PutHDFS processor. The combined attribute (e.g., dataSource1_20160911) will persist through the merge, so I can use something like the following to keep creating directories dynamically: /tmp/data_staging/${convergedSourceDateAttribute:substringBefore('_')}/${convergedSourceDateAttribute:substringAfter('_')}
Does that seem reasonable?
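For illustration, here is a minimal Python sketch of the plan above. It is not NiFi code; it only mimics what the UpdateAttribute step, the MergeContent binning, and the two Expression Language calls would do. The attribute names and the sample value come from the post, while the function names are hypothetical. It also assumes that substringBefore and substringAfter return the whole value when the delimiter is absent, which is my reading of the Expression Language guide.

```python
from collections import defaultdict


def combine(attrs):
    """Mimic UpdateAttribute: join the two attributes into one correlation value."""
    return f"{attrs['SourceAttribute']}_{attrs['data.date']}"


def substring_before(value, sep):
    """Text before the first occurrence of sep (whole value if sep is absent)."""
    head, found, _ = value.partition(sep)
    return head if found else value


def substring_after(value, sep):
    """Text after the first occurrence of sep (whole value if sep is absent)."""
    _, found, tail = value.partition(sep)
    return tail if found else value


def bin_by_key(flowfiles):
    """Mimic MergeContent binning on a single correlation attribute."""
    bins = defaultdict(list)
    for attrs in flowfiles:
        bins[combine(attrs)].append(attrs)
    return dict(bins)


def hdfs_directory(key):
    """Rebuild the target directory from the combined attribute."""
    return (
        "/tmp/data_staging/"
        f"{substring_before(key, '_')}/{substring_after(key, '_')}"
    )
```

One thing the sketch makes visible: both calls split on the first underscore, so a source name that itself contains an underscore would produce the wrong directory. Choosing a delimiter that cannot appear in either attribute avoids that.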
09-11-2016
12:54 PM
1 Kudo
I'm using the MergeContent processor and have successfully used the "Correlation Attribute Name" property to bin like files together, based on a single flowfile attribute with Expression Language. I would now like to bin and merge files on two attributes. Is this possible? Any help with the proper syntax would be appreciated.
Labels: Apache NiFi
08-10-2016
11:43 PM
7 Kudos
The Apache NiFi community recently released the beta version of Apache NiFi 1.0.0. This version comes with significant updates, including a UI refresh, a transition to zero-master clustering, multi-tenant authorization, and templates that are now deterministically ordered, which allows them to be version controlled. The beta also adds nine new processors, bringing the total to 165. The full list of release notes can be found here.
Below I will do a very basic walkthrough of some of the UI changes from 0.7.0 to the 1.0.0 beta:
Original Apache NiFi 0.7.0 Flow
New Apache NiFi 1.0.0 Beta
Beyond the more modern look and feel, there are some key UI changes:
(1) Apache NiFi 1.0.0 beta now includes a status bar showing statistics for the overall flow, including bytes in and out, the number of started and stopped processors, processors in error, last refresh, etc.
(2) The beta also has new collapsible navigation and operation panes.
(3) There's a new drop-down menu where you can access information such as the flow summary, provenance, and the bulletin board (e.g., error messages).
(4) The search field is much more prominent now and lets users search through complex flows to quickly find and jump to processors and other elements on the flow.
06-27-2016
04:09 AM
Hi @henryon wen, I would follow these instructions for setting up email alerts in Ambari via SMTP: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_Ambari_Users_Guide/content/_configuring_notifications.html And follow Amazon's SES instructions for sending email through their SMTP interface: http://docs.aws.amazon.com/ses/latest/DeveloperGuide/send-email-smtp.html I haven't tried this yet, but it seems straightforward. Please let me know if it works.
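Before entering the SMTP settings in Ambari, it may help to confirm that the SES SMTP credentials work on their own. The following is a rough Python sketch using only the standard library. The host shown is the SES SMTP endpoint for us-east-1 and is an assumption about your region; the addresses, function names, and message text are placeholders, and the username and password must be SES SMTP credentials generated in the AWS console.

```python
import smtplib
from email.message import EmailMessage

# Assumption: SES in us-east-1; substitute the endpoint for your region.
SES_SMTP_HOST = "email-smtp.us-east-1.amazonaws.com"
SES_SMTP_PORT = 587  # STARTTLS port


def build_test_message(sender, recipient):
    """Build a small plain-text message to verify delivery."""
    msg = EmailMessage()
    msg["From"] = sender          # must be an address or domain verified in SES
    msg["To"] = recipient
    msg["Subject"] = "Ambari alert notification test"
    msg.set_content("Test message sent through the Amazon SES SMTP interface.")
    return msg


def send_via_ses(msg, username, password,
                 host=SES_SMTP_HOST, port=SES_SMTP_PORT):
    """Send msg over SMTP with STARTTLS, authenticating with SES SMTP credentials."""
    with smtplib.SMTP(host, port, timeout=30) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)
```

If `send_via_ses(build_test_message(...), user, password)` delivers the message, the same host, port, and credentials should be the values to try in the Ambari notification settings.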
06-13-2016
04:15 PM
1 Kudo
@Timothy Spann I've seen a great demo of this done by @Olivier Renault. Olivier - can you confirm?
06-13-2016
10:54 AM
1 Kudo
@Christopher Frankland here are some resources I would recommend checking out: https://community.hortonworks.com/articles/136/how-to-search-for-text-in-an-image.html by @Saptak Sen. Another fun one, using Apache NiFi, by @Jeremy Dyer: https://community.hortonworks.com/articles/28380/nifi-ocr-using-apache-nifi-to-read-childrens-books.html And here's a great tutorial on its use: http://hortonworks.com/hadoop-tutorial/indexing-and-searching-text-within-images-with-apache-solr/ Here's also an older blog post, so please check the commands, but it has some important lessons about accuracy: the quality and resolution of the PDF will greatly affect your results. http://kiirani.com/2013/03/22/tesseract-pdf.html I'm also posting this because I think it's interesting to examine other effects on the source data that might affect accuracy: http://www.assistivetechnology.vcu.edu/wp-content/uploads/sites/1864/2013/09/pxc3882784.pdf
06-12-2016
01:26 PM
@Micky Woo the article you're pointing to demonstrates the technical preview of tag-based policies in Apache Ranger. While it's not GA in the Apache Ranger 0.5.x line, it is targeted as a development theme for 0.6.x: https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies
06-12-2016
12:56 PM
1 Kudo
This error occurs when you try to truncate an external table. TRUNCATE must target a native (managed) table; otherwise an exception is thrown. Here's a really great reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
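As a rough sketch of what this means in practice, the Python helper below (a hypothetical function, written in Python only to produce the statements as strings) returns the HiveQL you might run depending on the table type reported by DESCRIBE FORMATTED. The external-table branch uses a commonly cited workaround, temporarily flipping the EXTERNAL table property, and that part is an assumption: verify it against your Hive version before relying on it, since it removes the data files in the external location.

```python
def truncate_statements(table, table_type):
    """Return the HiveQL statements to empty `table`.

    table_type is the "Table Type" value shown by DESCRIBE FORMATTED,
    e.g. "MANAGED_TABLE" or "EXTERNAL_TABLE".
    """
    if table_type != "EXTERNAL_TABLE":
        # Managed (native) tables can be truncated directly.
        return [f"TRUNCATE TABLE {table}"]
    # Assumed workaround for external tables: mark the table as managed,
    # truncate it, then mark it as external again.
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES('EXTERNAL'='FALSE')",
        f"TRUNCATE TABLE {table}",
        f"ALTER TABLE {table} SET TBLPROPERTIES('EXTERNAL'='TRUE')",
    ]
```

For example, `truncate_statements("staging.events", "EXTERNAL_TABLE")` returns three statements to run in order, where `staging.events` is a made-up table name.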
06-12-2016
03:44 AM
1 Kudo
@Manoj Gursahani Here are some resources on energy use cases. They don't all apply to solar energy specifically, but you can see how Hadoop could be used similarly: Webinars - http://hortonworks.com/webinar/how-iot-delivering-a-smarter-energy-grid/ http://hortonworks.com/webinar/using-big-data-enable-smarter-energy/ Blog posts - http://hortonworks.com/blog/making-energy-smarter-in-the-united-kingdom/ http://hortonworks.com/blog/apache-hadoop-the-energy-softgrid-and-my-imaginary-tesla/ Here's an interesting one from Stanford, although its views on streaming capabilities are certainly dated: https://energyclub.stanford.edu/big-data-and-the-smart-grid-is-hadoop-the-answer/