Member since: 09-18-2015
Posts: 100
Kudos Received: 98
Solutions: 11

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1482 | 03-22-2016 02:05 AM
 | 1023 | 03-17-2016 06:16 AM
 | 1839 | 03-17-2016 06:13 AM
 | 1325 | 03-12-2016 04:48 AM
 | 4725 | 03-10-2016 08:04 PM
06-27-2016 06:00 PM
Querying the Timeline Server is currently only supported via REST API calls. Please see this link for some good information about working with the Timeline Server REST API:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_v1
You may also find some other helpful info here:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Writeable_APIs
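As a quick illustration (a sketch, assuming the Timeline Server is on its default port 8188; adjust host and port for your cluster, and note the entity type below is just an example — use one your applications actually publish):

```bash
# Confirm the Timeline Server is up (returns an "About" JSON document)
curl -s "http://localhost:8188/ws/v1/timeline/"

# List up to 10 timeline entities of a given type; TEZ_DAG_ID is an
# example entity type, not something every cluster will have
curl -s "http://localhost:8188/ws/v1/timeline/TEZ_DAG_ID?limit=10"
```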
01-11-2017 01:39 PM
@vperiasamy It worked for me as well. Thanks.
03-17-2016 06:16 AM
1 Kudo
@Mohammed Ibrahim a) I would go with the Hortonworks Sandbox if you are doing a standalone setup. b) Here is a lab for setting up HDP on a single node: https://github.com/shivajid/HortonworksOperationsWorkshop/blob/master/Lab1.md It uses HDP 2.3; you can adapt it for HDP 2.4.
03-24-2016 11:59 AM
1 Kudo
This is resolved. Possible cause: the main problem was Oozie not finding "/etc/tomcat/conf/ssl/server.xml". The Oozie server ships with its own app server, so it should not refer to, or conflict with, the Tomcat app server we had deployed for our own purposes. At startup it reports:
setting CATALINA_BASE=${CATALINA_BASE:-/usr/hdp/current/oozie-server/oozie-server}
setting CATALINA_TMPDIR=${CATALINA_TMPDIR:-/var/tmp/oozie}
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
It did, however, refer to /etc/tomcat. We had configuration settings for CATALINA_BASE and CATALINA_HOME in .bashrc, /etc/profile, and /etc/init.d/tomcat, and oozie-setup.sh references CATALINA_BASE in many places. This is likely why it was resolving the wrong path. Solution: we walked through the shell scripts of Oozie and the other services that did not start, and commented out the references to CATALINA_HOME and CATALINA_BASE in /etc/profile and /etc/init.d/tomcat. Impact: all Hadoop services started. Caution: running a standalone Tomcat app server on the same host as Hadoop can create this conflict if the Tomcat configuration is set in /etc/profile and /etc/init.d/tomcat. Either run the app server on a separate host from Oozie, or scope the Tomcat settings to a specific user via .bashrc only.
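For illustration, the change amounted to something like this (a sketch; the commented paths are examples, not our actual values):

```bash
# In /etc/profile (and /etc/init.d/tomcat): comment out the
# system-wide Tomcat settings that were leaking into Oozie's
# environment. The paths below are illustrative examples.
# export CATALINA_HOME=/opt/tomcat
# export CATALINA_BASE=/opt/tomcat

# Verify nothing global is still exported before restarting services;
# Oozie then falls back to its own bundled app server under
# /usr/hdp/current/oozie-server/oozie-server
env | grep -i catalina
```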
03-31-2016 09:12 PM
2 Kudos
Ambari doesn't support that yet. We have a Jira for it targeting Ambari 3.0.0: https://issues.apache.org/jira/browse/AMBARI-14714 It will allow you to have multiple instances of the same service, potentially at different stack versions, e.g., Spark 1.6.1, 1.7.0, etc.
03-10-2016 11:09 PM
2 Kudos
@Sunile Manjee We ran into a similar issue with a customer. To clean up, you need to do the following: stop all services and run the cleanup script (see https://cwiki.apache.org/confluence/display/AMBARI/Host+Cleanup+for+Ambari+and+Stack): python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py
The script will prompt: "You have elected to remove all users as well. If it is not intended then use option --skip "users". Do you want to continue [y/n] (n)" Run the above on each host. Next you would want to do an Ambari reset and follow the steps that Scott mentioned.
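A sketch of the full sequence (the --skip value comes from the prompt above; run the cleanup on every host, the reset only on the Ambari server host):

```bash
# On each cluster host: stop the agent and run the cleanup,
# keeping OS users (--skip=users avoids deleting them)
ambari-agent stop
python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py --skip=users

# On the Ambari server host only: reset the server database
ambari-server stop
ambari-server reset
```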
03-03-2016 06:42 AM
1 Kudo
Check this video out: http://hortonworks.com/hadoop/cloudbreak/ If you use S3 you should be fine, though you will not get stellar performance; it will be slower than HDFS on local storage. If you like the answer, you should hit "Accept" and give a vote :).
02-25-2016 06:24 PM
2 Kudos
These should help:
https://cwiki.apache.org/confluence/display/RANGER/How+to+configure+Solr+Cloud+with+Kerberos+for+Ranger+0.5
https://community.hortonworks.com/articles/15159/securing-solr-collections-with-ranger-kerberos.html
11-11-2017 11:42 AM
I have a similar question. In my case, I need to connect to Hive using a SAS tool that only provides the following fields: Host(s), Port, and Database. There is also a way to add "server side properties", which creates a list of key/value pairs. Can anyone tell me what server-side properties I can use to force this connection to always use a specific queue? Or a way to associate this connection with a user, and that user with a key/value pair?
02-19-2016 09:29 AM
4 Kudos
So first: ORC indexes come in two forms, the standard indexes, which are created all the time (min/max values for each stride for each column), and bloom filters.

Normal indexes are good for range queries and work amazingly well if the data is sorted. This is normally automatic on any date column or on increasing columns like IDs. Bloom filters are great for equality queries on things like URLs, names, etc. in data that is not sorted (i.e., a given customer name can appear anywhere in the data). However, bloom filters take some time to compute, take some space in the indexes, and do not work well for most columns in a data warehouse (number fields like profit, sales, ...), so they are not created by default and need to be enabled per column via orc.bloom.filter.columns.

The stride size is the block of data that the ORC reader can skip during a read operation based on these indexes. 10000 is normally a good number, and increasing it doesn't help you much. You can play a bit with it, but I doubt you will get big performance improvements by changing it. I would expect more impact from block size (which affects how many mappers are created) and compression (ZLIB is normally the best). But by far the most impact comes from good data modeling: sorting the data during insert, the correct number of ORC files in the folder, the data types used, etc. See the sketch below for what enabling a bloom filter looks like.

Shameless plug that explains it all a bit: http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data
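A minimal sketch (the connection URL, table, column, and staging-table names are made-up examples; the table properties are the standard ORC ones mentioned above):

```bash
# Create an ORC table with a bloom filter on customer_name and the
# default 10,000-row index stride (HiveServer2 assumed on localhost:10000)
beeline -u "jdbc:hive2://localhost:10000" -e "
  CREATE TABLE sales_orc (customer_name STRING, profit DOUBLE)
  STORED AS ORC
  TBLPROPERTIES (
    'orc.compress'             = 'ZLIB',
    'orc.row.index.stride'     = '10000',
    'orc.bloom.filter.columns' = 'customer_name')"

# Insert sorted so the min/max indexes can prune strides effectively
beeline -u "jdbc:hive2://localhost:10000" -e "
  INSERT OVERWRITE TABLE sales_orc
  SELECT customer_name, profit FROM sales_staging
  SORT BY customer_name"
```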