Member since: 12-10-2015
Posts: 76
Kudos Received: 30
Solutions: 4
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2133 | 03-10-2021 08:35 AM |
| | 1550 | 07-25-2019 06:34 AM |
| | 3430 | 04-20-2016 10:03 AM |
| | 2617 | 04-11-2016 03:07 PM |
04-21-2016
02:12 PM
Hi, from the shell run `ambari-server status` to see if Ambari (the web UI) is running; if it's not running, run `ambari-server start`.
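The check above can be sketched as a small shell snippet. It assumes `ambari-server` is installed on the host (as on a standard Ambari node) and that `ambari-server status` exits non-zero when the server is down; the PATH guard is only for illustration:

```shell
#!/bin/sh
# Sketch: start Ambari only if it is not already running.
# Assumption: `ambari-server status` exits non-zero when the server is down.
if command -v ambari-server >/dev/null 2>&1; then
  if ambari-server status >/dev/null 2>&1; then
    echo "Ambari server is already running"
  else
    ambari-server start
  fi
else
  echo "ambari-server not found on PATH"
fi
```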
04-21-2016
02:05 PM
https://community.hortonworks.com/questions/28395/how-to-load-the-data-from-sql-server-to-hdfs-using.html#comment-28450
04-21-2016
09:36 AM
It is not totally true: with NiFi you can get "aloha.csv" from different sources (web, local file, HDFS, DB, Twitter, ...), enrich the data (for example, merge it with "byebye.csv") or modify it, and save the data (to Hive in different ways, to HDFS, to a local file, ...). If you want to analyze the personality of a Twitter user, you cannot use NiFi to calculate it; for that you can use Spark.
04-21-2016
07:55 AM
Hi, I saw that the HDF certification is marked as coming soon; when will it be available?
Labels:
- Certification
- Training
04-21-2016
07:36 AM
Hi, NiFi can communicate with Solr via PutSolrContentStream (documentation). If you want, NiFi also has a GetSolr processor. Spark gets/puts information from/to Solr via the API (documentation). I have never tried it (I promised myself I would), but if you want to create a streaming process from NiFi to Spark you can try, perhaps starting with this one.
04-20-2016
10:03 AM
1 Kudo
Hi @Rendiyono Wahyu Saputro, I'll answer point by point:
1. You can use whatever you prefer. I used NiFi in a similar project to get and model the data from Twitter.
2. If you use NiFi, you can process the data (the Twitter API returns a JSON file) to set attributes from the values of the JSON nodes. I suggest you index the data into Solr and use Spark to query Solr. After that you can process the data in Spark, for example to find the user with the most retweeted tweets.
3. On this page we used Storm to query and generate JSON, to minimize the traffic to the SolrCloud (we didn't have spare CPU), but if you want you can query Solr directly and return a JSON file to your app.

My idea is:
a. I register with your app
b. I enter my Twitter account
c. the app sends NiFi my personal information to follow me (adds my userName to the GetTwitter processor filter)
d. NiFi sends my tweets to Solr
e. Spark queries Solr for my username and processes my data
f. Spark sends the result to another Solr collection (coll_B)
g. the application requests the processed data from coll_B of Solr in JSON format

All the components I mentioned run on HDP in cluster mode. The size of the cluster depends on different factors:
- how many people will use your app? (many requests to Solr require a lot of CPU and RAM); for example, to ingest ~800 tweets (avg) we have 3 NiFi workers with 3 GB of Xmx
- how heavy is the data processing?
- must it be in real time?
These and more are all things to assess before deploying a cluster.
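For step g, the app's request to coll_B might look like the following sketch; the host, the field name `username`, and the user value are all assumptions for illustration, not part of the original design:

```shell
#!/bin/sh
# Sketch of the Solr query the app could issue against coll_B (step g above).
# SOLR_HOST, the "username" field, and TW_USER are hypothetical placeholders.
SOLR_HOST="localhost:8983"
TW_USER="some_twitter_user"
QUERY_URL="http://${SOLR_HOST}/solr/coll_B/select?q=username:${TW_USER}&wt=json"
echo "$QUERY_URL"
# The app would then fetch the JSON with, e.g.:  curl -s "$QUERY_URL"
```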
04-20-2016
08:00 AM
1 Kudo
Hi @karthik sai, what do you see in the ambari-metrics log (default path /var/log/ambari-metrics-collector/)? Since it does not seem to have been stopped by a user, there must be another reason.
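As a sketch, that log check could look like the snippet below. The directory is the default path from the post; the grep pattern and log file glob are just illustrative ways to surface errors, since the exact file names can differ per version:

```shell
#!/bin/sh
# Sketch: look for recent errors in the Ambari Metrics Collector logs.
# /var/log/ambari-metrics-collector is the default log directory mentioned above.
LOG_DIR="/var/log/ambari-metrics-collector"
if [ -d "$LOG_DIR" ]; then
  # Show the last few error/exception lines; file names may vary by version.
  grep -iE "error|exception" "$LOG_DIR"/*.log | tail -n 20
else
  echo "log directory not found: $LOG_DIR"
fi
```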
04-15-2016
07:47 AM
I ran this command and stopped and started the problematic NameNode; now everything is OK. Thank you.
04-14-2016
12:13 PM
This state still persists after restarting all HDFS components.
I didn't try to stop and start the DataNode, but it's strange that I have 3 DataNodes and not 4. It also shows me the following state: