Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3225 | 10-18-2017 10:19 PM |
| | 3594 | 10-18-2017 09:51 PM |
| | 13167 | 09-21-2017 01:35 PM |
| | 1310 | 08-04-2017 02:00 PM |
| | 1661 | 07-31-2017 03:02 PM |
07-24-2016 10:06 PM
@sankar rao Can you please check the value of dfs.permissions.enabled in your hdfs-site.xml? If it is set to false, HDFS permissions are not enforced.
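If you want a quick way to check the effective value from the command line (assuming the HDFS client is on your path):

```bash
# Print the effective value of dfs.permissions.enabled
# as seen by the HDFS client configuration.
hdfs getconf -confKey dfs.permissions.enabled
```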
07-24-2016 05:08 AM
Sure. Change swappiness to 1, not 0, which is what you have. Also disable transparent_hugepage (both the main setting and defrag). When you are using Hadoop, your files live in the Hadoop file system, so you don't really need to track last access time for your files. Disable that for your mount points using "noatime"; this stops Linux from keeping track of last access times, which are not used anyway. If you can, in your BIOS settings, change the CPU and CPU frequency governor to performance mode. This is a tradeoff between power and performance, so make sure you know what you are doing; your current loads may not be CPU bound, in which case there is no point in doing this. Same with the power settings: you change those to performance at the cost of higher power draw. These steps are meant to squeeze the last bit of juice out of your hardware after you have done everything at the OS level, so ideally you don't want to have to go this far.
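A minimal sketch of the OS-level pieces, assuming a RHEL/CentOS-style box; the THP path varies by kernel version, so check which one exists on your system:

```bash
# Set swappiness to 1 for the running kernel...
sysctl -w vm.swappiness=1
# ...and persist it across reboots.
echo "vm.swappiness = 1" >> /etc/sysctl.conf

# Disable transparent huge pages and THP defrag
# (older RHEL kernels use /sys/kernel/mm/redhat_transparent_hugepage/* instead).
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# Example /etc/fstab entry with noatime for a data mount
# (device, mount point, and fs type below are placeholders):
# /dev/sdb1  /grid/0  ext4  defaults,noatime  0 0
```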
07-24-2016 04:08 AM
@SBandaru I think this link is what you are looking for: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/_prepare_the_environment.html Also, I am pretty sure that on newer versions of Red Hat you need to set swappiness to 1 instead of 0. I would also disable transparent_hugepage: do a cat on the following file on your OS and see whether it's set to never or always. If it's at always, change it to never. /sys/kernel/mm/redhat_transparent_hugepage/defrag
Use the following command to change this value to never: echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag I am not a hundred percent sure, but I think this requires a restart.
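A check-then-set sketch; the redhat_transparent_hugepage path is the RHEL 6 naming, and in my experience the echo takes effect immediately but is lost on reboot, hence the rc.local line as one common way to re-apply it at boot:

```bash
# Check the current defrag setting; the active value is in [brackets].
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag

# Switch it off for the running kernel (root required).
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

# Re-apply at boot time so the setting survives a reboot.
echo "echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag" >> /etc/rc.local
```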
07-23-2016 07:50 PM
@Saurabh Kuma @Madhavi Amirneni Let me try to explain at a high level how security works and why Ranger without Kerberos is useless. Kerberos is what many applications use to authenticate a user, that is, to verify that the user is exactly who he says he is. Ranger is the next step in security: once you know that a user named Alex actually is Alex, does Alex have permission to view a particular set of data? That is the job of Ranger. It enforces policies for users who have already been authenticated; the authentication of whether a user is actually who he says he is, is done by Kerberos. Without Kerberos you don't even know whether Alex is actually Alex. That's why there is no point in using Ranger to enforce a policy that Alex cannot access certain datasets, or in auditing what Alex did when he logged in: without Kerberos you are not even sure it was really Alex when someone named Alex logged in. Authorization and audit are pretty much useless at that point, and that's why you need a Kerberized cluster before you enable authorization/auditing using Ranger.
07-23-2016 06:01 PM
@jestin ma I wonder if doing a filter would help rather than a join and achieve the same results. So instead of a join, is it possible to do something like this (pseudocode)? df1.filter(<keys present in df2>).groupBy(key).count()
07-23-2016 05:57 PM
I think this is a permission issue for the principal smanjee for the Phoenix service. Can you try the following from the same node where you have SQuirreL, and after the kinit try accessing your Phoenix service from the command line? I think it would still fail, and once you resolve that with proper permissions for this user, your SQuirreL issue would be resolved too. Hope this helps. kinit -kt <your keytab file> smanjee@CLOUD.HORTONWORKS.COM
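Something like the following, as a sketch; the keytab path, the sqlline.py location, and the ZooKeeper quorum are placeholders for your environment:

```bash
# Obtain a ticket from the keytab for the smanjee principal.
kinit -kt /etc/security/keytabs/smanjee.keytab smanjee@CLOUD.HORTONWORKS.COM

# Confirm the ticket was issued.
klist

# Try hitting Phoenix directly from the command line
# (typical client location on HDP installs).
/usr/hdp/current/phoenix-client/bin/sqlline.py zk-host1,zk-host2,zk-host3:2181:/hbase-secure
```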
07-23-2016 04:43 AM
@Aman Poonia I think what you are asking for is N+2 redundancy for the namenode. This feature will be available in Hadoop 3.0; it would allow 3-5 namenodes. Please see the following Jira. https://issues.apache.org/jira/browse/HDFS-6440
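Once you are on a Hadoop 3.x line, something like this should show the configured namenodes and their HA state; the service IDs nn1/nn2/nn3 are assumptions for illustration:

```bash
# List the namenodes configured for this cluster.
hdfs getconf -namenodes

# Check which one is active and which are standby
# (service IDs come from dfs.ha.namenodes.<nameservice>).
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs haadmin -getServiceState nn3
```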
07-19-2016 01:22 AM
Hi @sujitha sanku The administration tool is Ambari. You can share as much detail from the Ambari docs as you want. Thanks
07-13-2016 06:22 PM
Those DBs are likely for the Hive metastore as well as for Ambari. These services are often run on master or edge nodes.
07-13-2016 05:57 PM
@Kumar Veerapan It is not true that the namenode performs all admin functions. You need Ambari to manage the cluster; the namenode only stores the metadata for Hadoop files. As for gateways, you need these because in a large cluster you don't want clients connecting directly to the cluster nodes and opening the cluster up to them. You would rather have gateway nodes, so that clients use those to access the cluster.