Member since
07-30-2019
53
Posts
136
Kudos Received
16
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
9366 | 01-30-2017 05:05 PM | |
5056 | 01-13-2017 03:46 PM | |
2488 | 01-09-2017 05:36 PM | |
1525 | 01-09-2017 05:29 PM | |
1157 | 10-07-2016 03:34 PM |
09-28-2015
09:06 PM
4 Kudos
I'll answer my own question here, since I was able to work through it on a new install. With a fresh installation of HDP 2.3 with Ambari 2.1.1, you'll be prompted during the installation to select one or more servers to install the NFS Gateway on. This happens in the same configuration window during cluster configuration where you designate DataNodes, Region Servers, Phoenix Servers, etc. After the installation has finished, you'll see indications that the NFS Gateway is running on the chosen servers. Now what? If you go to one of the servers and run
df -h
you won't see any new mount points. So how far down the path did Ambari get you? If you refer back to the HDP 2.2 docs on configuring NFS, you'll see that Ambari has started the nfs and rpcbind services for you. But now it's up to you to mount the gateway. Follow the remaining HDP 2.2 docs to complete the process and mount the NFS Gateway. The startup process for the NFS Gateway runs as the 'hdfs' user, so the earlier documents covering proxy settings are NOT necessary.
# Mount Example (to be run as root)
mkdir /hdfs
mount -t nfs -o vers=3,proto=tcp,nolock localhost:/ /hdfs
User interaction details are found here.
# /etc/fstab Example for NFS Gateway Automount
localhost:/ /hdfs nfs rw,vers=3,proto=tcp,nolock,timeo=600 0 0
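To avoid adding the automount entry twice, the fstab line above can be generated and checked from a small script. This is only a sketch: the function names `nfs_gateway_fstab_line` and `fstab_has_mount` are hypothetical helpers, not part of HDP or Ambari, and the defaults simply mirror the example values used in this post.

```shell
#!/bin/sh
# Hypothetical helper: prints an /etc/fstab entry for the NFS Gateway.
# Defaults (localhost, /hdfs) mirror the example above; adjust as needed.
nfs_gateway_fstab_line() {
    host="${1:-localhost}"
    mount_point="${2:-/hdfs}"
    printf '%s:/ %s nfs rw,vers=3,proto=tcp,nolock,timeo=600 0 0\n' \
        "$host" "$mount_point"
}

# Hypothetical helper: succeeds if a non-comment line in an fstab-style
# file already uses the given mount point (second field).
fstab_has_mount() {
    mount_point="$1"
    fstab_file="${2:-/etc/fstab}"
    awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' \
        "$fstab_file"
}
```

As root, something like `fstab_has_mount /hdfs || nfs_gateway_fstab_line >> /etc/fstab` would then append the entry only when it is missing.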
If you're using NFS as a quick way to traverse HDFS while avoiding the JVM startup time of each 'hdfs dfs ...' call, try out the hdfs-cli project: https://github.com/dstreev/hdfs-cli
09-26-2015
02:25 AM
3 Kudos
Looking at my current HA implementation, the root zNode for HA is hadoop-ha, followed by the name of the HA instance, i.e. /hadoop-ha/HOME, where HOME is the name of the HA NN instance. So, in theory, this should support your "multiple" HA systems. HA NN doesn't put much load on ZooKeeper, so you're not going to have a scaling issue in that regard. I think this is an operational issue, though. Customers are starting to see that ZooKeeper is used for more and more things on clusters and could be considered a place that holds "too many eggs". Another point is around upgrades. While the rolling upgrade supports an easy transition for ZooKeeper, the clusters you would serve from a single ZooKeeper instance will have limitations from an operational standpoint. Customers with larger clusters are trending toward multiple ZooKeepers per cluster, instead of one ZooKeeper to rule them all, especially if Storm and Kafka are involved. They apply a different (heavier) type of load than NN HA. Note: Resource Manager HA can also put quite a load on ZooKeeper on really large clusters, as it uses ZooKeeper to maintain the state of jobs.
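The zNode layout described here can be written down as a one-line path rule. The sketch below assumes the root zNode is /hadoop-ha, as observed on my cluster, and lets it be overridden; `ha_znode_path` is a hypothetical helper name, not a Hadoop command.

```shell
#!/bin/sh
# Hypothetical helper: prints the zNode used by one HA NameNode instance.
#   $1 = name of the HA NN instance (e.g. HOME)
#   $2 = root zNode (assumed default: /hadoop-ha, as observed above)
ha_znode_path() {
    nameservice="$1"
    parent="${2:-/hadoop-ha}"
    # Strip a trailing slash from the root so the result has no "//".
    printf '%s/%s\n' "${parent%/}" "$nameservice"
}
```

Two HA instances sharing one ZooKeeper ensemble would map to two distinct paths, for example `ha_znode_path HOME` and `ha_znode_path WORK`, and either path can be inspected with the `ls` command of a ZooKeeper client.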
09-25-2015
06:35 PM
For Ambari Server, I currently make adjustments to /etc/ambari-server/conf/log4j.properties. For the Agent, I have to create a symlink to redirect /var/log/ambari-agent. Is there a better method?
Labels: Apache Ambari
09-25-2015
06:24 PM
2 Kudos
I recently used TDE to encrypt an HBase installation and found some interesting requests for Key access by the Region Servers. Out of the box, we locked down the Key permissions to allow only the "hbase" user, since this was the user accessing the files by way of the Region Servers. During normal operations, we saw additional requests from the "nn" user and later from "hdfs". Well, "hdfs" is a user, so that's fine. But "nn" is not. "nn" was set up as a principal per host for Kerberos (in IPA). We got around this by actually creating an "nn" user in IPA and granting it rights to the Key in Ranger KMS. Was that the best way? And I'm a little curious how the "nn" principal expressed itself as a user in HDFS operations.
Labels: Apache Hadoop, Apache HBase
09-25-2015
06:14 PM
Before Ambari 2.1, we had to manage the NFS Gateway separately. Now it's "kind of" part of the Ambari process. At least it shows up in Ambari (HDFS Summary page) as installed and running. But I don't see a way to control the bind, etc., and there aren't any processes running like that. So what is the process for using NFS with Ambari 2.1+?
Labels: Apache Ambari
09-25-2015
06:02 PM
Have you tried to create the component first, before attempting the installation? For example:
curl -i -H "X-Requested-By: ambari" -u admin:admin -X POST http://<ambari-host>:8080/api/v1/clusters/<clustername>/hosts/<host_name>/host_components/ZOOKEEPER
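If you need to do this for several hosts, the URL can be assembled in a small wrapper. This is a sketch only: `ambari_host_component_url`, `ambari_create_host_component`, and the `AMBARI_CRED` variable are hypothetical names, and port 8080 with admin:admin are just the values from the command above.

```shell
#!/bin/sh
# Hypothetical helper: builds the Ambari REST URL for a host component.
#   $1 = Ambari host, $2 = cluster name, $3 = target host, $4 = component
ambari_host_component_url() {
    ambari_host="$1"; cluster="$2"; host_name="$3"; component="$4"
    printf 'http://%s:8080/api/v1/clusters/%s/hosts/%s/host_components/%s\n' \
        "$ambari_host" "$cluster" "$host_name" "$component"
}

# Hypothetical wrapper: registers the component on the host with a POST.
# AMBARI_CRED is an assumed variable name; it falls back to admin:admin.
ambari_create_host_component() {
    curl -i -H "X-Requested-By: ambari" \
        -u "${AMBARI_CRED:-admin:admin}" \
        -X POST "$(ambari_host_component_url "$@")"
}
```

Calling `ambari_create_host_component <ambari-host> <clustername> <host_name> ZOOKEEPER` then issues the same request as the one-liner.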
09-25-2015
05:58 PM
2 Kudos
hadoop distcp -i -log /tmp/ hdfs://xxx:8020/apps/yyyy hdfs://xxx_cid/tmp/
In this case, "xxx" is the insecure cluster, while "xxx_cid" is the secure cluster. We are launching the job from the Kerberized cluster, with the appropriate kinit for the user, and getting the following error:
java.io.IOException: Failed on local exception: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.; Host Details : local host is: "xxx/10.x.x.x"; destination host is: "xxx":8020;
...
Caused by: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.
I thought that by launching the job from the secure cluster, we could avoid any access issues. But it appears that the processes are kicked off from the "source" cluster. In this case, that's the insecure cluster. Ideas on getting around this?
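One setting commonly suggested for this exact error message is the Hadoop client property ipc.client.fallback-to-simple-auth-allowed, passed to distcp with -D. Whether it resolves this particular case was not confirmed in this post, so treat the following as a sketch; `distcp_with_fallback` is a hypothetical helper that only prints the command line so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: prints a distcp command that lets the secure
# client fall back to SIMPLE auth when talking to the insecure cluster.
#   $1 = source URI (insecure cluster), $2 = destination URI (secure cluster)
distcp_with_fallback() {
    src="$1"; dst="$2"
    printf 'hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true -i -log /tmp/ %s %s\n' \
        "$src" "$dst"
}
```

For the clusters above this would be `distcp_with_fallback hdfs://xxx:8020/apps/yyyy hdfs://xxx_cid/tmp/`, still run after kinit on the secure side.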
09-25-2015
05:47 PM
1 Kudo
Managing the Capacity Scheduler via text can get messy, and it requires a bit of research by the user to figure out what settings are available. Try using the Capacity Scheduler View in Ambari 2.1+. It will make managing queues much simpler. Note: You'll need to set it up first through "Manage Ambari".
[Screenshot: a sample of a complex queue layout]
Through this interface, you can also manage newer features in YARN that map users to queues via ACLs. If you're using Ranger to secure HDP, the YARN plugin will extend this capability even more!
09-24-2015
07:59 PM
1 Kudo
I applied this recently as well, with a MySQL 5.5 instance with HA (Tungsten). I haven't seen the issue on a basic 5.6 install.
09-24-2015
05:05 PM
7 Kudos
As a general rule, we do NOT use the default Ambari databases. Pick one (MySQL, Oracle, or PostgreSQL) and have a separate instance stood up for it. Then use it for all of your repositories. It should be this way for any environment beyond a sandbox. I wouldn't even do a POC with the defaults, simply because the defaults are all over the place and POCs can turn into production systems :). Once you've committed to using a repository, changing "types" is not really possible. You basically need to start over. I know a lot of people have asked about this in the past, but it's a mess. Take 2 minutes in the beginning, set up an "independent" MySQL (or other) database, and use it. If you need to move the MySQL instance around in the future, that's possible and more achievable than switching types. NOTE: Ambari won't lay down MySQL until the Hive Metastore is installed, so even if you figure out a way to use that MySQL instance for Oozie, Ranger, etc., it will be controlled by the Hive Service Config. So Ambari WILL restart MySQL if you've allowed Ambari to install it. If you didn't catch me saying it earlier: install a separate and independent RDBMS for your Metastores.