Member since: 09-18-2015
Posts: 100
Kudos Received: 98
Solutions: 11

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1482 | 03-22-2016 02:05 AM
 | 1023 | 03-17-2016 06:16 AM
 | 1839 | 03-17-2016 06:13 AM
 | 1325 | 03-12-2016 04:48 AM
 | 4725 | 03-10-2016 08:04 PM
06-27-2016 06:00 PM
Querying the Timeline Server is currently only supported via REST API calls. Please see this link for some good information about working with the Timeline Server REST API:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_v1
You may also find some other helpful info here:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Writeable_APIs
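As a quick illustration (a sketch, assuming the Timeline Server is on its default port 8188; adjust host and port for your cluster, and note the entity type below is just an example — use one your applications actually publish):

```bash
# Confirm the Timeline Server is up (returns an "About" JSON document)
curl -s "http://localhost:8188/ws/v1/timeline/"

# List up to 10 timeline entities of a given type; TEZ_DAG_ID is an
# example entity type, not something every cluster will have
curl -s "http://localhost:8188/ws/v1/timeline/TEZ_DAG_ID?limit=10"
```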
01-11-2017 01:39 PM
@vperiasamy It worked for me as well. Thanks.
03-17-2016 06:16 AM
1 Kudo
@Mohammed Ibrahim a) I would go with the Hortonworks Sandbox if you are doing a standalone setup. b) Here is a lab for setting up HDP on a single node: https://github.com/shivajid/HortonworksOperationsWorkshop/blob/master/Lab1.md It uses HDP 2.3; you can adapt it for HDP 2.4.
03-24-2016 11:59 AM
1 Kudo
This is resolved. Possible cause: the main problem was Oozie not finding "/etc/tomcat/conf/ssl/server.xml". The Oozie server ships with its own app server, so it should not refer to, or conflict with, the Tomcat app server we had deployed for our own purposes. At startup it reports:
setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
It did, however, refer to /etc/tomcat. We had configuration settings for CATALINA_BASE and CATALINA_HOME in .bashrc, /etc/profile, and /etc/init.d/tomcat, and oozie-setup.sh references CATALINA_BASE in many places. This is likely why it was resolving the wrong path. Solution: we walked through the shell scripts of Oozie and the other services that did not start, and commented out the references to CATALINA_HOME and CATALINA_BASE in /etc/profile and /etc/init.d/tomcat. Impact: all Hadoop services started. Caution: running a standalone Tomcat app server on the same host as Hadoop can create this conflict if the Tomcat configuration is set in /etc/profile and /etc/init.d/tomcat. Either run the app server on a separate host from Oozie, or scope the Tomcat settings to a specific user via .bashrc only.
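For illustration, the change amounted to something like this (a sketch; the commented paths are examples, not our actual values):

```bash
# In /etc/profile (and /etc/init.d/tomcat): comment out the
# system-wide Tomcat settings that were leaking into Oozie's
# environment. The paths below are illustrative examples.
# export CATALINA_HOME=/opt/tomcat
# export CATALINA_BASE=/opt/tomcat

# Verify nothing global is still exported before restarting services;
# Oozie then falls back to its own bundled app server under
# /usr/hdp/current/oozie-server/oozie-server
env | grep -i catalina
```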
03-31-2016 09:12 PM
2 Kudos
Ambari doesn't support that yet. We have a Jira for it targeting Ambari 3.0.0: https://issues.apache.org/jira/browse/AMBARI-14714 It will allow you to have multiple instances of the same service, potentially at different stack versions, e.g., Spark 1.6.1, 1.7.0, etc.
03-10-2016 11:09 PM
2 Kudos
@Sunile Manjee We ran into a similar issue with a customer. To clean up, you need to do the following: stop all services and run the cleanup script (see https://cwiki.apache.org/confluence/display/AMBARI/Host+Cleanup+for+Ambari+and+Stack): python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py
The script will prompt: "You have elected to remove all users as well. If it is not intended then use option --skip "users". Do you want to continue [y/n] (n)" Run the above on each host. Next you would want to do an Ambari reset and follow the steps that Scott mentioned.
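A sketch of the full sequence (the --skip value comes from the prompt above; run the cleanup on every host, the reset only on the Ambari server host):

```bash
# On each cluster host: stop the agent and run the cleanup,
# keeping OS users (--skip=users avoids deleting them)
ambari-agent stop
python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py --skip=users

# On the Ambari server host only: reset the server database
ambari-server stop
ambari-server reset
```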
03-03-2016 06:42 AM
1 Kudo
Check this video out: http://hortonworks.com/hadoop/cloudbreak/ If you use S3 you should be fine, though you will not get stellar performance; it will be slower than HDFS on local storage. If you like the answer, you should hit "Accept" and give a vote :).
02-25-2016 06:24 PM
2 Kudos
These should help:
https://cwiki.apache.org/confluence/display/RANGER/How+to+configure+Solr+Cloud+with+Kerberos+for+Ranger+0.5
https://community.hortonworks.com/articles/15159/securing-solr-collections-with-ranger-kerberos.html
11-11-2017 11:42 AM
I have a similar question. In my case, I need to connect to Hive using a SAS tool that only provides the following fields: Host(s), Port, and Database. There is also a way to add "server side properties", which creates a list of key/value pairs. Can anyone tell me what server-side properties I can use to force this connection to always use a specific queue? Or a way to associate this connection with a user, and that user with a key/value pair?
02-19-2016 09:29 AM
4 Kudos
So first: ORC indexes come in two forms, the standard indexes, which are created all the time (min/max values for each stride for each column), and bloom filters.

Normal indexes are good for range queries and work amazingly well if the data is sorted. This is normally automatic on any date column or on increasing columns like IDs. Bloom filters are great for equality queries on things like URLs, names, etc. in data that is not sorted (i.e., a given customer name can appear anywhere in the data). However, bloom filters take some time to compute, take some space in the indexes, and do not work well for most columns in a data warehouse (number fields like profit, sales, ...), so they are not created by default and need to be enabled per column via orc.bloom.filter.columns.

The stride size is the block of data that the ORC reader can skip during a read operation based on these indexes. 10000 is normally a good number, and increasing it doesn't help you much. You can play a bit with it, but I doubt you will get big performance improvements by changing it. I would expect more impact from block size (which affects how many mappers are created) and compression (ZLIB is normally the best). But by far the most impact comes from good data modeling: sorting the data during insert, the correct number of ORC files in the folder, the data types used, etc. See the sketch below for what enabling a bloom filter looks like.

Shameless plug that explains it all a bit: http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data
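A minimal sketch (the connection URL, table, column, and staging-table names are made-up examples; the table properties are the standard ORC ones mentioned above):

```bash
# Create an ORC table with a bloom filter on customer_name and the
# default 10,000-row index stride (HiveServer2 assumed on localhost:10000)
beeline -u "jdbc:hive2://localhost:10000" -e "
  CREATE TABLE sales_orc (customer_name STRING, profit DOUBLE)
  STORED AS ORC
  TBLPROPERTIES (
    'orc.compress'             = 'ZLIB',
    'orc.row.index.stride'     = '10000',
    'orc.bloom.filter.columns' = 'customer_name')"

# Insert sorted so the min/max indexes can prune strides effectively
beeline -u "jdbc:hive2://localhost:10000" -e "
  INSERT OVERWRITE TABLE sales_orc
  SELECT customer_name, profit FROM sales_staging
  SORT BY customer_name"
```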