Member since 09-29-2015
286 Posts
601 Kudos Received
60 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
11458 | 03-21-2017 07:34 PM | |
2882 | 11-16-2016 04:18 AM | |
1608 | 10-18-2016 03:57 PM | |
4265 | 09-12-2016 03:36 PM | |
6213 | 08-25-2016 09:01 PM |
09-24-2018
02:53 PM
Request for @Ancil McBarnett (or anyone else who knows): Please flesh out a little on ... "You do not want Derby in your cluster."
07-24-2016
05:22 AM
Thanks @mqureshi. I understand your point and agree that Kerberos is always the best option for securing a cluster. But I am looking for an alternative that I can use to secure Solr, such as Knox with Ranger or something else. Do we have any alternative?
02-09-2016
03:41 AM
@Sunile Manjee I think Ancil's answer is the best one 😉 You are the judge.
02-09-2016
05:32 AM
Thanks understood.
06-23-2016
12:30 PM
Thank you. The Hortonworks documentation is very sparse on this. I would never have guessed such commands without your article. Awesome!
03-01-2016
02:54 PM
@Ancil: Is this guide still valid for an Ambari / HDP 2.3 deployment on EC2, please? The description states: "**** Just an Initial Place Holder for an Old KB on Ambari on EC2 to be updated". The manual does not mention details such as "configure nodes - especially /etc/network" and "set up hosts" that we find in this post. Thanks, Sundar
02-04-2016
07:58 PM
6 Kudos
ISSUE: Choosing the appropriate Linux file system for an HDFS deployment

SOLUTION: The Hadoop Distributed File System is platform independent and can function on top of any underlying file system and operating system. Linux offers a variety of file system choices, each with caveats that have an impact on HDFS.

As a general best practice, if you are mounting disks solely for Hadoop data, mount them with the ‘noatime’ option. This turns off access-time updates, so a read no longer triggers a metadata write, which speeds up reads for files.

There are three popular Linux file system options to choose from:
ext3
ext4
XFS

Yahoo uses the ext3 file system for its Hadoop deployments. ext3 is also the default file system for many popular Linux distributions. Since HDFS on ext3 has been publicly tested on Yahoo’s cluster, it makes for a safe choice for the underlying file system.

ext4 is the successor to ext3. ext4 has better performance with large files. ext4 also introduced delayed allocation of data, which adds a bit more risk with unplanned server outages while decreasing fragmentation and improving performance.

XFS offers better disk space utilization than ext3 and has much quicker disk formatting times than ext3. This means that it is quicker to get started with a data node using XFS.

Most often, the performance of a Hadoop cluster will not be constrained by disk speed; I/O and RAM limitations will be more important. ext3 has been extensively tested with Hadoop and is currently the stable option to go with. ext4 and XFS can be considered as well, and they give some performance benefits.

References:
http://wiki.apache.org/hadoop/DiskSetup
http://hadoop-common.472056.n3.nabble.com/Hadoop-performance-xfs-and-ext4-td742325.html
http://www.quora.com/What-are-the-advantages-and-disadvantages-of-the-filesystems-ext2-ext3-ext4-ReiserFS-and-XFS
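To check whether the data disks on a node are actually mounted with noatime, something like the following rough sketch may help. It parses text in the /proc/mounts format and reports which directories lack the option. The function name, the sample mount table, and the /grid/0 and /grid/1 paths are all hypothetical; substitute the directories behind your own DataNode data dirs.

```python
def mounts_missing_noatime(mounts_text, data_dirs):
    """Return the data directories whose mount options lack 'noatime'.

    mounts_text uses the /proc/mounts layout:
    'device mountpoint fstype options dump pass'.
    """
    options_by_mountpoint = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        options_by_mountpoint[fields[1]] = fields[3].split(",")

    missing = []
    for data_dir in data_dirs:
        options = options_by_mountpoint.get(data_dir)
        # Flag directories that are not mount points at all, as well as
        # mounts that still update access times.
        if options is None or "noatime" not in options:
            missing.append(data_dir)
    return missing


# Hypothetical sample in /proc/mounts format, for illustration only.
SAMPLE_MOUNTS = """\
/dev/sdb1 /grid/0 ext4 rw,noatime,data=ordered 0 0
/dev/sdc1 /grid/1 ext4 rw,relatime,data=ordered 0 0
"""

print(mounts_missing_noatime(SAMPLE_MOUNTS, ["/grid/0", "/grid/1"]))
# prints ['/grid/1']
```

On a real node you would pass in the contents of /proc/mounts instead of the sample string.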
04-25-2016
07:19 PM
Ancil, I have a question regarding "hive.tez.container.size is a multiple of yarn.scheduler.minimum-allocation-mb": why so? If yarn.scheduler.maximum-allocation-mb = 24GB, yarn.scheduler.minimum-allocation-mb = 4GB, and hive.tez.container.size = 5GB, wouldn't YARN be smart enough to assign 5GB to a container to satisfy Tez's needs? Thanks, Richard
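A rough way to see why the multiple matters, worked through with the numbers from the question. This sketch assumes the scheduler rounds every memory request up to the next multiple of yarn.scheduler.minimum-allocation-mb (the Capacity Scheduler's usual behaviour; the Fair Scheduler rounds by a separate increment setting instead). The function name is hypothetical, and treating an over-maximum request as an error is a simplification.

```python
import math


def normalized_container_mb(requested_mb, min_alloc_mb, max_alloc_mb):
    """Round a request up to the next multiple of the minimum allocation.

    Assumption: the scheduler normalizes requests this way. Requests above
    the maximum are simply treated as an error in this sketch.
    """
    if requested_mb > max_alloc_mb:
        raise ValueError("request exceeds the maximum allocation")
    multiples = max(1, math.ceil(requested_mb / min_alloc_mb))
    return multiples * min_alloc_mb


# Values from the question: minimum 4 GB, maximum 24 GB, Tez container 5 GB.
granted = normalized_container_mb(5 * 1024, 4 * 1024, 24 * 1024)
print(granted)              # prints 8192
print(granted - 5 * 1024)   # prints 3072
```

Under that assumption a 5 GB request is granted as an 8 GB container, so 3 GB per container is reserved but never used by Tez. Sizing hive.tez.container.size as an exact multiple of the minimum avoids that waste.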
09-15-2017
01:16 PM
@Junichi Oda You cannot manage access_log using log4j, because the AccessLogValve configuration is hardcoded.
The following logs can be managed using log4j by leveraging maxBackupIndex:
UserSync
TagSync
XA Portal
The log below cannot be managed using log4j, so you will have to use logrotate (a standard tool for log rotation in Linux). See: Manage Ranger Admin access_log log file growth
Access Log
Alternatively, as mentioned by @Neeraj Sabharwal, you can use a cron script with the find command:
find /var/log/ranger -mtime +30 | xargs --no-run-if-empty rm
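Before scheduling a delete like that from cron, it can be worth doing a dry run to see what would be removed. Below is a minimal Python sketch that only lists candidates and deletes nothing. The function name is hypothetical, and the age test (modified more than days x 24 hours ago, regular files only) is an approximation of find's -mtime +30, which counts whole days and also matches directories.

```python
import os
import time


def files_older_than(root, days, now=None):
    """List files under root last modified more than `days` days ago.

    Dry run only: nothing is deleted. `now` can be passed in for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)


# Example: show what a 30-day cleanup of the Ranger log directory would remove.
for path in files_older_than("/var/log/ranger", 30):
    print(path)
```

If the directory does not exist, os.walk yields nothing and the loop prints no paths.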
02-02-2016
08:06 PM
@Avery Long This is VERY important. Don't use Chrome. Thanks @Ancil McBarnett