Member since: 09-29-2015
Posts: 286
Kudos Received: 601
Solutions: 60

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 11479 | 03-21-2017 07:34 PM
 | 2894 | 11-16-2016 04:18 AM
 | 1619 | 10-18-2016 03:57 PM
 | 4276 | 09-12-2016 03:36 PM
 | 6240 | 08-25-2016 09:01 PM
02-13-2016
03:45 AM
@mcarillo yarn.nodemanager.log-dirs should be on the same mounts as your Hadoop data directories. See https://community.hortonworks.com/articles/1888/apache-tez-tuning-tips-solving-the-could-not-find.html
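For illustration, a minimal yarn-site.xml sketch of that layout; /grid/0 and /grid/1 are assumed mount points, so substitute the mounts your DataNode data directories actually use:

<!-- Assumed mount points: one entry per data disk, matching the DataNode data dirs -->
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/grid/0/hadoop/yarn/log,/grid/1/hadoop/yarn/log</value>
</property>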
02-12-2016
02:19 PM
@lobna tonn how are your tests doing? Did you decide?
02-10-2016
09:02 PM
4 Kudos
@Adi Jabkowsky Is this happening only when HS2 is started, or when you connect via Beeline, or both? Try the following:

1. Your hive.server2.authentication.ldap.baseDN has a blank space. Remove the blank space and restart HS2 from Hosts in Ambari:

<!-- From -->
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value> </value>
</property>
<!-- To -->
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value></value>
</property>

2. Remove hive.server2.authentication.ldap.Domain or set it to blank. Then log into HS2 using Beeline with myuser@corp.cellcom.co.il as your login and see if it authenticates (see the sketch after this list).
3. Set hive.server2.enable.doAs to false so that the hive user executes the query.
4. If you are using a hive AD user, double-check that the hive AD UID matches the entry in /etc/passwd. Make an archive of the HS2 logs, change /etc/passwd to have the same UID as the AD hive user, and restart HS2.
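If it helps, a hedged sketch of the Beeline login test from step 2; the HS2 host, port, and password below are placeholders, not values from your cluster:

# Placeholder host/port/password; only the login principal comes from the thread above
beeline -u "jdbc:hive2://<hs2-host>:10000/default" -n myuser@corp.cellcom.co.il -p '<password>'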
02-09-2016
04:15 AM
1 Kudo
HDP 2.3.4 needs Ambari 2.2; you cannot use Ambari 2.1.
02-08-2016
07:30 PM
1 Kudo
I recommend doing Solr Standalone; I have always had issues with SolrCloud for Ranger auditing. Are you sure everything in Advanced ranger-admin-site is set appropriately?

ranger.audit.source.type = solr
ranger.audit.solr.urls = http://solr_host:6083/solr/ranger_audits
ranger.audit.solr.username = ranger_solr
ranger.audit.solr.password = NONE

If you are using the HDFS or Hive plugin, did you turn Audit to Solr on?
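As a rough sketch of what "Audit to Solr" means on the plugin side, the Advanced ranger-hdfs-audit (or ranger-hive-audit) section typically carries properties along these lines; the names are from HDP 2.3-era Ranger, so verify them against your version:

# Assumed property names; point the URL at your own Solr instance
xasecure.audit.destination.solr = true
xasecure.audit.destination.solr.urls = http://solr_host:6083/solr/ranger_audits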
02-08-2016
07:10 PM
1 Kudo
Yes, you need Kerberos for Ranger to manage Solr. See also https://community.hortonworks.com/articles/15159/securing-solr-collections-with-ranger-kerberos.html (updated). Or are you referring to Solr auditing for Ranger? In that case you do not need Kerberos. For Solr audit, see http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_Ranger_Install_Guide/content/solr_ranger_configure_standalone.html and http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_Ranger_Install_Guide/content/audit_to_solr.html If you did the necessary install and Solr audits are still not showing: I had a case where I ran ps -ef | grep ranger and found it running under the wrong UID; I had to kill it first and then restart from Ambari to get the Solr audits to work.
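A minimal sketch of that check; the exact process names and PIDs depend on your install:

ps -ef | grep ranger   # confirm which UID the Ranger/Solr audit processes run under
kill <pid>             # stop the instance running under the wrong UID
# then restart the component from Ambari so it comes back up under the correct user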
02-08-2016
07:03 PM
2 Kudos
@Sushil Saxena Your base DN should be (assuming it is NOT AD):

hive.server2.authentication.ldap.baseDN: OU=People,O=xx.com

Ensure that you go to the host in Ambari (not the Dashboard) and restart HiveServer2 from the host list.
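For reference, a minimal hive-site.xml sketch using the example value above; substitute your directory's actual People OU and organization:

<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>OU=People,O=xx.com</value>
</property>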
02-08-2016
06:46 PM
Although I have heard the argument that, over time, the cost of replacing and managing DAS disks with 3x replication makes SAN cheaper from a TCO perspective.
02-08-2016
06:38 PM
4 Kudos
In addition to putting them on master nodes co-located with other resources, the ZooKeeper and JournalNode storage should be on JBODs... see diagram
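As a rough sketch of where those directories are pointed in config; the /grid/zk and /grid/jn paths are assumptions standing in for dedicated JBOD mounts:

# zoo.cfg (ZooKeeper), assumed JBOD mount
dataDir=/grid/zk/zookeeper

<!-- hdfs-site.xml (JournalNode), assumed JBOD mount -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/grid/jn/hdfs/journal</value>
</property>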
02-08-2016
06:25 PM
5 Kudos
Hadoop is a shared-nothing architecture; SAN storage usually goes against the grain for distributed storage in a distributed compute environment.
The only central storage we support so far is Isilon, because we did some joint engineering with them. Even then, DAS has its advantages (as well as disadvantages, mainly because of 3x replication). The main issue is that YARN spins up containers on the compute nodes for every data access need; putting the data on separate SAN disks means that every query or access has to cross the network and is no longer distributed across the spindles of the storage nodes. That not only slows access, it introduces more points of failure through switches and creates additional potential for bottlenecks.
Normally I would also compromise a bit for master nodes, but I just came from a client who ran the master nodes as VMs on SAN storage. Performance started out great, but once multiple users came on board and the master nodes had to handle more blocks, performance tanked. We wasted a week and a half moving the master components to physical nodes on a cluster that already had data. Painful.
See a good discussion here: http://searchstorage.techtarget.com/video/Understanding-storage-in-the-Hadoop-cluster