Member since: 06-06-2019
Posts: 81
Kudos Received: 58
Solutions: 11

My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 323 | 10-04-2019 07:24 AM |
 | 438 | 12-12-2016 03:07 PM |
 | 1443 | 12-07-2016 03:41 PM |
 | 1408 | 07-12-2016 02:49 PM |
 | 310 | 03-04-2016 02:35 PM |
10-04-2019
07:24 AM
2 Kudos
The Ambari installation is probably attempting to update some of the Ubuntu packages due to the installation dependencies of Ambari and the HDP components. We usually have a local repository for the O/S packages available when performing a fully disconnected HDP installation.
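As a rough sketch (the mirror host and Ubuntu release below are hypothetical placeholders for your environment), the local O/S repository can be exposed to the cluster nodes with an apt source entry, followed by a package index refresh:

# /etc/apt/sources.list.d/local-ubuntu.list  (hypothetical local mirror)
deb http://repo.internal.example.com/ubuntu xenial main restricted universe multiverse
deb http://repo.internal.example.com/ubuntu xenial-updates main restricted universe multiverse

apt-get update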
... View more
04-12-2017
01:24 AM
4 Kudos
This article describes the setup of two separate KDCs in a Master/Slave configuration. This setup allows two clusters to share a single Kerberos realm, so principals are recognized between the clusters. A typical use case for this configuration is a Disaster Recovery cluster used as a warm standby. The high-level information for the article was found at https://web.mit.edu/kerberos/krb5-1.13/doc/admin/install_kdc.html, while the details were worked out through sweat and tears.

Execute the following command on both the Master and Slave KDC hosts if the KDC packages are not already installed:

yum install krb5-server

The following defines the KDC configuration for both clusters. This file, /etc/krb5.conf, must be copied to each node in the cluster.

[libdefaults]
renew_lifetime = 7d
forwardable = true
default_realm = CUSTOMER.HDP
ticket_lifetime = 24h
dns_lookup_realm = false
dns_lookup_kdc = false
udp_preference_limit=1
[domain_realm]
customer.com = CUSTOMER.HDP
.customer.com = CUSTOMER.HDP
[logging]
default = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
kdc = FILE:/var/log/krb5kdc.log
[realms]
CUSTOMER.HDP = {
admin_server = master-kdc.customer.com
kdc = master-kdc.customer.com
kdc = slave-kdc.customer.com
}

Contents of /var/kerberos/krb5kdc/kadm5.acl:

*/admin@CUSTOMER.HDP *

Contents of /var/kerberos/krb5kdc/kdc.conf:

[kdcdefaults]
kdc_ports = 88,750
kdc_tcp_ports = 88,750
[realms]
CUSTOMER.HDP = {
kadmind_port = 749
max_life = 12h 0m 0s
max_renewable_life = 7d 0h 0m 0s
master_key_type = aes256-cts
supported_enctypes = aes256-cts aes128-cts des-hmac-sha1 des-cbc-md5 arcfour-hmac
}

Contents of /var/kerberos/krb5kdc/kpropd.acl:

host/master-kdc.customer.com@CUSTOMER.HDP
host/slave-kdc.customer.com@CUSTOMER.HDP

Now start the KDC and kadmin processes on the Master KDC only:

shell% systemctl enable krb5kdc
shell% systemctl start krb5kdc
shell% systemctl enable kadmin
shell% systemctl start kadmin

The KDC database is then initialized with the following command, executed from the Master KDC:

shell% kdb5_util create -s
Loading random data
Initializing database '/var/kerberos/krb5kdc/principal' for realm 'CUSTOMER.HDP',
master key name 'K/M@CUSTOMER.HDP'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key: <db_password>
Re-enter KDC database master key to verify: <db_password>

An administrator must be created to manage the Kerberos realm. The following command is used to create the administration principal from the Master KDC:

shell% kadmin.local -q "addprinc admin/admin"
Authenticating as principal root/admin@CUSTOMER.HDP with password.
WARNING: no policy specified for admin/admin@CUSTOMER.HDP; defaulting to no policy
Enter password for principal "admin/admin@CUSTOMER.HDP": <admin_password>
Re-enter password for principal "admin/admin@CUSTOMER.HDP": <admin_password>
Principal "admin/admin@CUSTOMER.HDP" created.
Host keytabs must now be created for the Slave KDC. Execute the following commands from the Master KDC:

shell% kadmin
kadmin: addprinc -randkey host/master-kdc.customer.com
kadmin: addprinc -randkey host/slave-kdc.customer.com

Extract the host key for the Slave KDC and store it in the host's keytab file, /etc/krb5.keytab.slave:

kadmin: ktadd -k /etc/krb5.keytab.slave host/slave-kdc.customer.com

Copy /etc/krb5.keytab.slave to slave-kdc.customer.com and rename the file to /etc/krb5.keytab.

Update /etc/services on each KDC host, if the entry is not already present:

krb5_prop 754/tcp # Kerberos slave propagation

Install xinetd on the Master and Slave KDC hosts, if not already installed, so that kpropd can be executed:

yum install xinetd

Create the configuration for kpropd on both the Master and Slave KDC hosts by creating /etc/xinetd.d/krb5_prop with the following contents:

service krb_prop
service krb_prop
{
disable = no
socket_type = stream
protocol = tcp
user = root
wait = no
server = /usr/sbin/kpropd
}

Configure xinetd to run as a persistent service on both the Master and Slave KDC hosts:

systemctl enable xinetd.service
systemctl start xinetd.service

Copy the following files from the Master KDC host to the Slave KDC host:

/etc/krb5.conf
/var/kerberos/krb5kdc/kadm5.acl
/var/kerberos/krb5kdc/kdc.conf
/var/kerberos/krb5kdc/kpropd.acl
/var/kerberos/krb5kdc/.k5.CUSTOMER.HDP

Perform the initial KDC database propagation to the Slave KDC:

shell% kdb5_util dump /usr/local/var/krb5kdc/slave_datatrans
shell% kprop -f /usr/local/var/krb5kdc/slave_datatrans slave-kdc.customer.com

The Slave KDC may be started at this time:

shell% systemctl enable krb5kdc
shell% systemctl start krb5kdc

The following script propagates updates from the Master KDC to the Slave KDC. Create a cron job, or the like, to run the script on a frequent basis (see the example cron entry after the script).

#!/bin/sh
# /var/kerberos/kdc-slave-propagate.sh
kdclist="slave-kdc.customer.com"
/sbin/kdb5_util dump /usr/local/var/krb5kdc/slave_datatrans
for kdc in $kdclist
do
/sbin/kprop -f /usr/local/var/krb5kdc/slave_datatrans $kdc
done
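As an illustration, a root crontab entry on the Master KDC could run the propagation script every 15 minutes; the schedule and the log file path below are only examples to adapt to your environment:

# crontab -e (as root on the Master KDC) - hypothetical schedule and log path
*/15 * * * * /bin/sh /var/kerberos/kdc-slave-propagate.sh >> /var/log/kdc-propagate.log 2>&1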
... View more
- Find more articles tagged with:
- disaster-recovery
- How-To/Tutorial
- Kerberos
- Sandbox & Learning
12-12-2016
03:07 PM
1 Kudo
@Avijeet Dash HDP Search is the basic Solr package with a tested integration to HDP. Lucidworks, the primary contributor to Solr, packages the product. The default storage option for Solr uses the server's local disk, so a Solr installation co-located with an HDP datanode will compete for disk resources. If you go with the SolrCloud option, you can configure HDFS as your Solr data repository. Aside from fault tolerance and high availability, this gives you the option of adding more datanodes to your HDP cluster to handle the expected increase in disk use by SolrCloud.
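If it helps, here is a rough sketch of starting Solr in SolrCloud mode with HDFS as the data directory; the ZooKeeper string and namenode URI are hypothetical, and the exact properties can vary by Solr/HDP Search version:

bin/solr start -c -z zk1.customer.com:2181 \
  -Dsolr.directoryFactory=HdfsDirectoryFactory \
  -Dsolr.lock.type=hdfs \
  -Dsolr.hdfs.home=hdfs://namenode.customer.com:8020/apps/solr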
... View more
12-07-2016
04:43 PM
You are welcome, Mahendra. I think you will have to push the complex query to the database, place those results into a different table, and then perform the sqoop command on that table. Best of luck.
... View more
12-07-2016
03:41 PM
@Mahendra Dahiya The sqoop import --query option is intended to process a single statement, and there are warnings about using complex queries: "The facility of using free-form query in the current version of Sqoop is limited to simple queries where there are no ambiguous projections and no OR conditions in the WHERE clause. Use of complex queries such as queries that have sub-queries or joins leading to ambiguous projections can lead to unexpected results." A few more details are at this URL: https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_free_form_query_imports
... View more
09-16-2016
08:13 PM
1 Kudo
@Eric Brosch Can you downgrade to the earlier version with this command? yum downgrade cups-libs-1.4.2-50.el6_4.5.i686
... View more
07-12-2016
02:49 PM
1 Kudo
Hey @Kit Menke Have you tried turning off UDP? If not, add the property and value udp_preference_limit = 1 to your /etc/krb5.conf and see if that solves your problem. If Ambari is managing the krb5.conf, go to Kerberos -> Advanced krb5.conf -> krb5-conf template, add the property to the [libdefaults] section, save the changes, and let Ambari push the changes out to the hosts. If you are managing the /etc/krb5.conf file yourself, you will have to add the property and push out the changes on your own.
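For reference, after the change the [libdefaults] section of /etc/krb5.conf would simply include the new property (your other existing settings stay as they are):

[libdefaults]
  udp_preference_limit = 1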
... View more
06-28-2016
07:45 PM
Did you finalize the upgrade from 2.2.6.0-2800 to 2.4.0.0-169? You can look at the directory on your namenode where your fsimage file and edits are stored and see if it is keeping info for both the current and the old version. The command to finalize the upgrade is:

hdfs dfsadmin -finalizeUpgrade

Hope this helps.
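For example, a not-yet-finalized upgrade leaves a previous/ directory alongside current/ in the namenode metadata directory; the path below is hypothetical, so use the dfs.namenode.name.dir value from your hdfs-site.xml:

ls -d /hadoop/hdfs/namenode/current /hadoop/hdfs/namenode/previous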
... View more
06-23-2016
01:41 PM
5 Kudos
Yes, Ambari stores all the information needed to manage the state and configuration of your cluster. Information concerning the Ambari users and the configuration of Ambari Views are two other items you will see. You can find the database DDLs on the Ambari server host in the /var/lib/ambari-server/resources directory. The table names in the schema are informatively named, and a quick grep on "CREATE TABLE" will give you a quick grasp of the type of data stored by Ambari. You should look at Ambari Blueprints, http://hortonworks.com/blog/ambari-blueprints-delivers-missing-component-cluster-provisioning/, for info on creating your DR cluster. Blueprints were created so you could more easily create a new cluster based on information from an existing cluster.
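As a quick illustration (the DDL file name below assumes a Postgres-backed Ambari; the files in that directory are named per database type), you can list the tables with:

grep -i "CREATE TABLE" /var/lib/ambari-server/resources/Ambari-DDL-Postgres-CREATE.sql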
... View more
06-13-2016
02:19 PM
1 Kudo
@Randy Gelhausen S3 does not have a directory hierarchy per se, but S3 does allow object keys to contain the "/" character. You are dealing with a key-value store: one object can have a key of /my/fake/directory/a/file and another can have a key of /my/fake/directory/b/file. The objects are named similarly, and most tools that speak S3 will display the objects as if they were files in a directory hierarchy, but there is no directory structure behind the objects. That is the key takeaway when dealing with S3. When you store or retrieve an object with S3, you have to reference the entire key for the object and the bucket that contains the key. The paradigm of directory and file is just an illusion. Use the object key in the method call as @jfrazee said and you should be good to go.
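For instance, with the AWS CLI the full key is always passed along with the bucket; the bucket name, key, and output file below are just an illustration:

aws s3api get-object --bucket my-bucket --key my/fake/directory/a/file /tmp/file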
... View more
06-03-2016
01:54 PM
@c pat You need to first find out why the connection to the Namenode is being refused. Is the Namenode process up? You can do a quick check with:

ps aux | grep -i namenode

If the Namenode process is up, then look at the logs in /var/log/hadoop/hdfs. You will want to look at the file that looks like hadoop-hdfs-namenode-*.log. This should help you narrow down the cause a bit.
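For example, to review the most recent entries (the actual file name includes your namenode's hostname):

tail -n 200 /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log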
... View more
06-01-2016
03:22 PM
@omar harb This post provides info on resetting the admin password and also provides some background on the decision. https://community.hortonworks.com/questions/20960/no-admin-permission-for-the-latest-sandbox-of-24.html
... View more
06-01-2016
02:24 PM
@omar harb Log into Ambari as the administrator and from the top menu select Services->HDFS. Then navigate to the right side of the page and you will see a pulldown menu for "Service Actions". The list of available actions for HDFS will be shown in the list. You can perform a Start, Restart, Stop, etc. from this menu. I have uploaded a couple of screen shots showing the actions. screen-shot-2016-06-01-at-101948-am.png screen-shot-2016-06-01-at-102002-am.png
... View more
05-31-2016
06:33 PM
1 Kudo
Hello @khireswar Kalita. You do not need a Hive service keytab on the client node to make the JDBC connection to Hiveserver2. You do need a valid Kerberos ticket to perform the connection though. From the Linux command line, execute the following command with your user ID (please avoid doing this as hive or another service account):

kinit <user_id>

You will then be prompted for your domain password. Once you enter the password you will be granted a ticket, which you can verify with the "klist" command. You would then be able to use the beeline client for a JDBC connection like so:

beeline -u "jdbc:hive2://hive2_host:10000/default;principal=hive/hive2_host@YOUR-REALM.COM"
... View more
05-26-2016
01:36 PM
Your status command looks to be misleading. If I recall correctly, the status command uses the PID contained in the /var/run/ambari-server/ambari-server.pid file to perform the check. The error message indicates the PID is stale; in other words, there is no process running with that PID. Please perform these commands:

ambari-server stop
rm /var/run/ambari-server/ambari-server.pid
ambari-server start
ambari-server status

With the old PID file out of the way, you will be able to get an accurate status from the command.
... View more
05-25-2016
02:27 PM
@Blanca Sanz Normal POSIX-based authentication in Hadoop provides only weak user authentication. Hadoop provides a strong user authentication method through integration with Kerberos. When a cluster is secured, in other words when Kerberos is used to provide user authentication, you execute a kinit command to request a Kerberos ticket for a user principal from the Kerberos Key Distribution Center (KDC). The kinit command requires a password, and upon completion a ticket is delivered that lets you execute commands from your CLI. If the kinit command fails, you will not have a valid ticket, your identity will not be established, and your Hadoop commands will fail. More information on setting up Kerberos in an HDP cluster can be found here: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_Ambari_Security_Guide/content/ch_amb_sec_guide.html
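A minimal sketch of the flow (the user and realm names are hypothetical):

kinit bsanz@CUSTOMER.HDP     # prompts for the password and requests a ticket from the KDC
klist                        # verifies the ticket was granted
hdfs dfs -ls /user/bsanz     # Hadoop commands now authenticate using the Kerberos ticket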
... View more
05-12-2016
03:32 PM
@Timothy Spann Did you set both HTTP_PROXY and HTTPS_PROXY? I had an issue in the past where both had to be set, with the same value, because of some odd firewall settings. So even if https is not being used, set the variable as well. You probably already know the syntax is:

export HTTP_PROXY=http://server-ip:port/
export HTTPS_PROXY=http://server-ip:port/
... View more
04-27-2016
04:31 PM
Ya @Ravi Mutyala , the temporary tables are only in use for a few minutes. My concern is also about any additional time being spent when writing the table as ORC. Probably have to run a bake off to see how it works in this case.
... View more
04-27-2016
02:32 PM
@Greenhorn Techie I used Ambari Blueprints, Node.js and deep-diff to compare the configurations of two different clusters. The blueprints were loaded into Node.js, and deep-diff did a nice job of finding the true differences in the JSON objects. The HOSTGROUP definitions in the blueprints created a fair number of red herrings, but the other diff objects were dead on. Assuming you can install Node.js and the deep-diff module and extract the blueprints (I know. A big assumption), you can use the following Node.js script to generate a JSON file with the cluster differences.

// Compare two Ambari blueprint JSON files and write the differences to a file.
var fs = require('fs');                  // needed for fs.writeFile
var diff = require('deep-diff').diff;
var bpa = require('/path/to/blueprints/blue-print-a.json');
var bpb = require('/path/to/blueprints/blue-print-b.json');

// Callback invoked once the diff file has been written.
var myCallback = function(err) {
  if (err) {
    return console.log(err);
  }
  console.log("The file was saved.");
};

var difference = diff(bpa, bpb);
fs.writeFile('cluster-difference.json', JSON.stringify(difference, null, 99), myCallback);
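If you need to pull the blueprints first, the Ambari REST API can export one per cluster, and the script can then be run with node. The credentials, host, cluster name, and the compare-blueprints.js file name below are placeholders I made up for the example:

curl -u admin:admin -H "X-Requested-By: ambari" \
  "http://ambari-host:8080/api/v1/clusters/ClusterA?format=blueprint" > blue-print-a.json
node compare-blueprints.js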
... View more
04-27-2016
02:18 PM
@Benjamin Leonhardi Yes, these are Hive temporary tables. The feature is new'ish and I wanted to know if there are any surprises not mentioned in the language manual. Memory is one of the options for temporary table storage and I want to see if it is possible to fit the tables into memory. The tables are short-lived so I don't think ORC is a realistic choice at the moment but that could change.
... View more
04-26-2016
10:06 PM
04-21-2016
02:29 PM
I have a managed ORC table where I am setting the file stripe size to 64 MB. I do an insert into the table from another table, and the HDFS files in the warehouse directory are at most 4 MB in size and contain one stripe. The MR and Tez settings have enough head room to create 64 MB files. I adjusted the stripe size to 256 MB and see the same number and size of files generated.
... View more
03-23-2016
10:09 PM
I see the same results on OS X with both Chrome and Safari. I am not seeing this issue on Windows 7.
... View more
03-15-2016
03:07 PM
1 Kudo
You can use Ambari to move a Namenode to another host if I recall correctly. First, select the HDFS service from the left-hand view. Then go to the Service Actions on the upper right and use the pulldown to select the action to move the Namenode. If you need to add another host to the cluster first, you can use Ambari to add the host (Hosts --> Actions --> Add New Host) prior to moving the Namenode.
... View more
03-15-2016
02:21 PM
1 Kudo
You can take a dump of the MySQL database to a text file using the mysqldump command. You can find further info on the documentation site at https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_upgrading_Ambari/content/_perform_backups_mamiu.html
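A minimal sketch (the database name, user, and output file are assumptions; match them to your Ambari database settings):

mysqldump -u ambari -p ambari > /tmp/ambari-db-backup.sql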
... View more
03-14-2016
07:43 PM
1 Kudo
@Sridhar Babu M You can modify the service account users on the Customize Services --> Misc tab during the installation. You will also see an option to allow, or not allow, Ambari to manage the UIDs for the service accounts. That being said, I still see the Linux service users being created when all the process owners are changed to "root" during the configuration. This should be treated either as a bug, since the users are still created even though they were specifically overridden, or as a feature request to suppress the creation of service users during installation. I would not recommend installing a cluster with a single user owning every service. There needs to be a separation of privileges between the HDFS super user and the other accounts. Otherwise you may see a case where a service account performs an action (directory removal, for example) that unexpectedly affects other services.
... View more
03-09-2016
03:24 PM
3 Kudos
If you are going to change the value, modify the property you are referencing via the Ambari UI. Put in a "hard coded" value for java.io.tmpdir and restart the required services. I see this as the safest option. That value is set in the python code for the Ambari server at /var/lib/ambari-server/resources/stacks/HDP/2.0.6/hooks/before-ANY/scripts/params.py for an Ambari 2.1.2 installation and is cached for each Ambari agent. Modifying the core python scripts to change the default location is another option, but there could be a number of side effects you are not expecting. @benoit
... View more
03-08-2016
02:18 PM
3 Kudos
You can find an account breakdown here: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_reference_guide/content/_defining_service_users_and_groups_for_a_hdp_2x_stack.html The longer answer is that service accounts are created for each component you install with HDP. You are given the option to customize the service accounts in the Ambari installation UI during the final configuration steps.
... View more
03-04-2016
02:35 PM
1 Kudo
@wayne2chicago The following command appears to be missing from the 2.1.2.1 documents for Oracle use. Execute this command to make Ambari aware of the new driver and database so the configs can be pushed out. I also prefer to copy the ojdbc7.jar file to all the cluster nodes before I issue the command. Just in case.

ambari-server setup --jdbc-db=oracle --jdbc-driver=/usr/share/java/ojdbc7.jar
... View more
03-03-2016
04:12 PM
3 Kudos
@Ram D. If you have passwordless SSH set up from the Ambari Server to all the hosts with Ambari Agents, you can use a distributed shell, such as pdsh, to issue the command to your hosts and stop the Ambari Agents. Doing so from the Ambari Server console is not an option, and I do not believe it will be, given the architecture that Ambari uses.
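For example (the host names and range are hypothetical), stopping the agents across the cluster with pdsh might look like:

pdsh -w node[01-20].customer.com "ambari-agent stop"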
... View more