Member since: 05-30-2018
Posts: 1322
Kudos Received: 715
Solutions: 148
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4067 | 08-20-2018 08:26 PM |
| | 1963 | 08-15-2018 01:59 PM |
| | 2390 | 08-13-2018 02:20 PM |
| | 4139 | 07-23-2018 04:37 PM |
| | 5046 | 07-19-2018 12:52 PM |
12-12-2016 04:51 PM
I ran into this timeout before. I found I had Ranger enabled on Kafka, and the API would time out. Can you verify that Ranger is disabled for the Kafka topic, or grant the correct permissions to the atlas user?
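If it is the permissions, the policy can also be created through the Ranger public REST API (v2) rather than the Admin UI. A minimal sketch in Python follows; the host, credentials, Ranger service name ("kafka"), and topic name ("ATLAS_HOOK") are all assumptions for illustration, so adjust them to your cluster:

```python
# Hedged sketch: create a Ranger policy granting the atlas user access
# to the Kafka topic Atlas uses. Host, credentials, service name, and
# topic name below are placeholders, not values from your cluster.
import requests

policy = {
    "service": "kafka",  # name of the Kafka repo in Ranger (assumption)
    "name": "atlas-hook-access",
    "resources": {"topic": {"values": ["ATLAS_HOOK"]}},
    "policyItems": [{
        "users": ["atlas"],
        "accesses": [
            {"type": "publish", "isAllowed": True},
            {"type": "consume", "isAllowed": True},
        ],
    }],
}

resp = requests.post(
    "http://ranger-host:6080/service/public/v2/api/policy",  # placeholder host
    json=policy,
    auth=("admin", "admin"),  # placeholder credentials
)
resp.raise_for_status()
print("Created policy id:", resp.json().get("id"))
```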
12-12-2016 04:46 PM
You can use the HBase Lily Indexer, which will automatically index data from HBase into Solr. You can also use Apache NiFi: ingest once and fork the flow to Solr, HBase, HDFS, and Hive. It is a highly flexible implementation model.
12-12-2016 04:29 PM
1 Kudo
@Avijeet Dash Solr competes for resources much like HBase does. It does not run on YARN unless you run it in Slider. Controlling these resources from a CPU and RAM perspective becomes challenging. Even if you could isolate RAM and CPU, as Terry mentioned, you still have I/O contention. Based on my implementation experience, I do not recommend running Solr on HDFS. Have it use fast local disk (i.e., SSD) and let it run.
12-11-2016 02:44 AM
I highly recommend you use Apache NiFi instead of Flume for most, if not all, data movement into and out of Hadoop. For your use case, there is a prebuilt NiFi template to push tweets to Hive and Solr (for searching and trending): https://community.hortonworks.com/articles/1282/sample-hdfnifi-flow-to-push-tweets-into-solrbanana.html If you want further analysis (e.g., most used words), this can be done several ways:
1. Real time: use Spark Streaming with NiFi and micro-batch your counts (see the sketch below).
2. Batch: run Hive SQL from NiFi.
3. Batch: call a Hive script from NiFi to calculate the analysis every x interval.
4. Batch: set up an Oozie job to calculate the analysis every interval (it may be kicked off from NiFi as well).
So you have options. Hope that helps.
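For option 1, here is a minimal Spark Streaming sketch of the micro-batched word count. The socket source, host, and port are placeholders; in practice you would feed this from NiFi, e.g. via Kafka or a site-to-site output port:

```python
# Hedged sketch: micro-batch word counts over a stream of tweet text.
# The socket source is a stand-in for whatever transport NiFi uses.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="TweetWordCount")
ssc = StreamingContext(sc, batchDuration=10)  # 10-second micro-batches

lines = ssc.socketTextStream("localhost", 9999)  # placeholder source
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word.lower(), 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()  # print each batch's counts to the driver log

ssc.start()
ssc.awaitTermination()
```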
12-09-2016 08:13 PM
2 Kudos
Can you show us the policy where you grant the hive user update permission on the table?
12-09-2016 05:52 PM
2 Kudos
@ANSARI FAHEEM AHMED Can you identify which components you need help tuning? Hive, Pig, MapReduce jobs, etc.?
Hive performance tuning:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_performance_tuning/bk_performance_tuning-20150721.pdf
http://hortonworks.com/blog/5-ways-make-hive-queries-run-faster/
Network:
https://community.hortonworks.com/articles/8563/typical-hdp-cluster-network-configuration-best-pra.html
10 ways to get the most out of your cluster:
http://www.slideshare.net/Hadoop_Summit/radia-srinivas-june261120amroom210c
12-08-2016 03:22 PM
1 Kudo
You can install Anaconda, but you will need to set the correct Python path during job execution:
PYSPARK_PYTHON=/opt/anaconda/bin/python spark-submit spark-job.py
12-08-2016 03:15 PM
3 Kudos
@Sampat Budankayala Can you add more details on what you are asking? I assume you might be asking how to move data from SharePoint to HDFS. I would use Apache NiFi and pull data via the SharePoint REST API (https://msdn.microsoft.com/en-us/library/office/jj860569.aspx); a rough sketch of the pull is below. If it is a bulk load, then a Java client may be best. This processor does not exist yet but should be very easy to build: https://github.com/OfficeDev/Office-365-SDK-for-Java
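As a rough illustration of the REST pull (not the NiFi processor itself), here is a Python sketch. The site URL, list name, credentials, and the NTLM auth scheme are all assumptions; your SharePoint deployment may use a different auth mechanism:

```python
# Hedged sketch: pull items from a SharePoint list via the REST API and
# land them locally for ingest into HDFS. All identifiers below are
# placeholders, not real values.
import json
import requests
from requests_ntlm import HttpNtlmAuth  # pip install requests_ntlm

SITE = "https://sharepoint.example.com/sites/mysite"  # hypothetical site
LIST = "Documents"                                    # hypothetical list

resp = requests.get(
    SITE + "/_api/web/lists/getbytitle('" + LIST + "')/items",
    auth=HttpNtlmAuth("DOMAIN\\user", "password"),    # placeholder credentials
    headers={"Accept": "application/json;odata=verbose"},
)
resp.raise_for_status()

with open("sharepoint_items.json", "w") as f:
    json.dump(resp.json()["d"]["results"], f)
# Then land it in HDFS, e.g.: hdfs dfs -put sharepoint_items.json /landing/sharepoint/
```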
12-07-2016 04:52 AM
@Sami Ahmad Can you verify that you have run the Ranger LDAP sync?
12-06-2016 03:25 PM
@hello hadoop I would check here to determine if the option is available: http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-upgrade/content/upgrading_hdp_stack.html From the docs above, the option is available on 2.3. Also, please check the prerequisites prior to upgrading: http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-upgrade/content/upgrading_HDP_prerequisites.html