Member since: 10-24-2017
Posts: 101
Kudos Received: 14
Solutions: 4

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2136 | 07-26-2017 09:57 PM |
| | 4005 | 12-13-2016 12:08 PM |
| | 1162 | 07-28-2016 08:41 PM |
| | 4226 | 06-15-2016 07:57 AM |
10-31-2016
03:33 PM
You can use the Ranger API to create policies as well, so you could script the appropriate API calls given the right input data from your source OS.
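For example, here is a rough sketch of such a scripted call, assuming Ranger's public REST API v2 (`/service/public/v2/api/policy`); the host, credentials, service name, path, and user below are placeholders you would fill from your source data, and field names can vary slightly between Ranger versions:

```bash
# Placeholder host/credentials/service/path/users -- adjust for your environment.
curl -u admin:admin -X POST -H "Content-Type: application/json" \
  http://ranger-host:6080/service/public/v2/api/policy \
  -d '{
        "service": "cluster_hadoop",
        "name": "scripted_policy_example",
        "resources": { "path": { "values": ["/data/example"], "isRecursive": true } },
        "policyItems": [ {
          "users": ["someuser"],
          "accesses": [ { "type": "read",  "isAllowed": true },
                        { "type": "write", "isAllowed": true } ]
        } ]
      }'
```

Driving a loop of calls like this from your exported OS permissions is usually just a matter of generating the JSON bodies per path/user.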
05-15-2017
09:51 PM
@Ahmad Debbas Hi Ahmad, I have a similar scenario where I need to ingest data from SharePoint into HDFS. How were you able to implement this? Could you please share a snapshot of your NiFi dataflow?
07-23-2016
06:01 AM
Thank you for your answer. These attributes are part of the metadata of the .msg file. Shouldn't the UpdateAttribute processor be able to extract them? I tried it, but it didn't work.
07-22-2016
07:46 AM
1 Kudo
Correct. In this case, you'd want to mount your Windows shared folder on your Linux machine [1]. In short, NiFi will be able to access your folder if you are able to access it from your console on your Linux host. [1] https://help.ubuntu.com/community/MountWindowsSharesPermanently
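For reference, a minimal sketch of what that mount could look like on Ubuntu (the host, share, mount point, and credentials are placeholders; see the linked guide for the permanent /etc/fstab variant):

```bash
# Install CIFS support and mount the Windows share (placeholder host/share/credentials).
sudo apt-get install cifs-utils
sudo mkdir -p /mnt/winshare
sudo mount -t cifs //windows-host/shared-folder /mnt/winshare \
  -o username=winuser,password=secret,uid=nifi

# If the NiFi user can list files here, NiFi's file-based processors can read them too.
ls /mnt/winshare
```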
07-11-2016
02:52 PM
1 Kudo
If you use NiFi, you can use the ListHDFS + FetchHDFS processors to monitor an HDFS directory for new files. From there you have two options to index the documents:

1) As Sunile mentioned, you could write a processor that extracts the information using Tika and then sends it to the PutSolrContentStream processor. There is going to be a new ExtractMediaMetadata processor in the next release, but it doesn't extract the body content, so you would likely need to implement your own processor.

2) You could send the documents (PDFs, emails, Word) straight from FetchHDFS to PutSolrContentStream, and configure PutSolrContentStream to use Solr's extracting request handler, which uses Tika behind the scenes (see the curl sketch below): https://community.hortonworks.com/articles/42210/using-solrs-extracting-request-handler-with-apache.html
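As a quick illustration of what the extracting request handler does, here is a rough sketch of posting a document to it directly with curl; the collection name, document id, and file path are placeholders, and PutSolrContentStream would send the flow file content to the same /update/extract path:

```bash
# Placeholder collection/id/file; assumes a Solr core with the /update/extract handler enabled.
curl "http://localhost:8983/solr/mycollection/update/extract?literal.id=doc1&commit=true" \
  -F "myfile=@/path/to/document.pdf"
```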
06-20-2016
08:28 AM
1 Kudo
Do you really need 3 masters on such a small cluster? How about 2 masters, putting the 3rd ZooKeeper on the Kafka node or on one worker node, and having 5 worker nodes? I always prefer more computing power over the "book-keeping" of master nodes. You can also put Knox and Flume on the edge node, if you use them, of course. Then distribute the other master services across the 2 masters: Ambari on one, AMS collector on the other; NN on one, RM on the other (or on both if you want HA). And yes, you can move the majority of master services later using Ambari.
06-20-2016
01:56 PM
Hi Ahmad, great to know that you managed, but that's strange; it is supposed to work. What command did you use? I use the command below in my Ambari setup and it works perfectly on different OSes: ambari-server setup -j /usr/java/jdk1.8.0_74. Please refer to http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.0.0/bk_ambari_reference_guide/content/ch_changing_the_jdk_version_on_an_existing_cluster.html for more. By the way, which OS are you working with? Also, I don't think $PATH necessarily needs to be updated, since we are already specifying "-j /usr/java/jdk1.8.0_74". Just make sure the Java home is accessible on all machines.
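For completeness, a small sketch of that setup plus a check that the JDK path exists on every host (the host names in the loop are placeholders):

```bash
# Point Ambari at the custom JDK (same path as in the post above).
ambari-server setup -j /usr/java/jdk1.8.0_74

# Hypothetical host list; verify the same JDK path is present on each machine.
for h in master1 master2 worker1; do
  ssh "$h" "ls /usr/java/jdk1.8.0_74/bin/java"
done
```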