Installing Cloudera Manager Configured in a Single-user Mode with Tarballs

by Community Manager, 04-20-2016 11:41 AM (edited 09-27-2016 09:10 AM)

Summary

There are several installation paths for installing Cloudera Manager. This document details how to install Cloudera Manager in a single-user mode using tarballs.

Applies To

Cloudera Manager

Instructions

 

Before You Begin

 


Download the Cloudera Manager tarball from here:

http://archive-primary.cloudera.com/cm5/cm/5/
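If the host has internet access, one way to fetch the tarball is with wget. The filename below is only illustrative; browse the archive listing above and substitute the exact file that matches your operating system and Cloudera Manager version:

wget http://archive-primary.cloudera.com/cm5/cm/5/cloudera-manager-el6-cm5.3.2_x86_64.tar.gz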

 

Instructions

 

Install and Configure Databases

If you are using an external database, install and configure a database as described in MySQL Database, Oracle Database, or External PostgreSQL Database. Read Cloudera Manager and Managed Service Data Stores for more information on how to configure the databases.

Refer to the following for installing OS dependencies:

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_ig_install_path_c.ht...

 

Install the Cloudera Manager Server and Agents

 

(CDH 5 only) On RHEL 5 and CentOS 5, install Python 2.6 or 2.7.

Tarballs contain both the Cloudera Manager Server and Cloudera Manager Agent in a single file.

To install the Cloudera Manager Server and Agents, perform the following steps:

  1. Download tarballs from the locations listed in Cloudera Manager Version and Download Information.
  2. Copy the tarballs and unpack them on all hosts on which you intend to install the Cloudera Manager Server and Cloudera Manager Agents.
    Note: You can do this in a location of your choosing. Place the tarball in a location where the single user has access. You might need to create a directory. Cloudera recommends using a cloudera-manager parent directory to make future upgrades easier.

The following commands assume the single user is named "user":

mkdir -p /home/user/opt/cloudera-manager
tar xzf cloudera-manager*.tar.gz -C /home/user/opt/cloudera-manager/

Note:  tarball_root corresponds to the location where the Cloudera Manager tarball was extracted. For example, tarball_root=/home/user/opt/cloudera-manager/cm-5.3.2/
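For convenience, you might export tarball_root as a shell variable on each host so you can refer to it as $tarball_root in later commands. A minimal sketch, assuming the 5.3.2 example above (adjust the path to the version you extracted):

export tarball_root=/home/user/opt/cloudera-manager/cm-5.3.2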

Configuring Cloudera Manager Agents

Once you have installed the Cloudera Manager Server and Agents, you must configure the Cloudera Manager agents.  To do this, perform the following steps:

On every Cloudera Manager agent host, configure the Cloudera Manager agent to point to the Cloudera Manager server by setting the following properties in the tarball_root/etc/cloudera-scm-agent/config.ini configuration file: 

 

Cloudera Manager agent host properties:

  • server_host - Name of the host where the Cloudera Manager Server is running
  • server_port - Port on the host where the Cloudera Manager Server is running
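For example, if the Cloudera Manager Server runs on a host named cm-server.example.com (a placeholder hostname) and listens on the default port, the relevant lines in config.ini would be:

server_host=cm-server.example.com
server_port=7182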
      3. By default, a tarball install stores its state in a var subdirectory under the tarball root; a non-tarball install stores the same state in /var.

 

Cloudera recommends that you reconfigure the tarball install to use an external directory as the /var equivalent (/var or any other directory outside the tarball) so that when you upgrade Cloudera Manager, the new tarball installation can access this state.

 

4. Configure the installation to use an external directory for storing state by editing tarball_root/etc/default/cloudera-scm-agent and setting the CMF_VAR variable to the location of the /var equivalent. If you do not reuse the state directory between different tarball installations, duplicate Cloudera Manager agent entries can appear in the Cloudera Manager database.

 

5. Configure the MySQL JDBC driver for each agent. At the bottom of the config.ini file, you will see the following line:

cloudera_mysql_connector_jar=/usr/share/java/mysql-connector-java.jar

 

6. Uncomment the line and add the path to the JDBC driver. You must repeat this step for each host/role that interacts with a MySQL database.

Note:  Examples of common host/roles include the Cloudera Manager Management Services, Hive Metastore, Oozie, and Sqoop2.

 

7. In addition to the IP address of the Cloudera Manager server, configure the parcel directory for each Cloudera Manager agent.

 

8. Ensure that the parcel_dir is configured to a directory in which the single-user has permissions to write.

 

Note:  By default, parcels are written to /opt/cloudera/parcels, but in single user mode this directory might not exist.

 

a. If you have permissions to create the directory, create it with the appropriate permissions:

sudo mkdir -p /opt/cloudera/parcels
sudo mkdir -p /opt/cloudera/parcel-cache
sudo chown user /opt/cloudera/parcels
sudo chown user /opt/cloudera/parcel-cache

 

b. If you do not have permissions to write at this location, configure parcel_dir within the config.ini file:

parcel_dir=/home/user/opt/cloudera/parcel

 

9. Configure the agent log file location (the log_file setting) within the config.ini file for each agent:

log_file=/home/user/var/log/cloudera-scm-agent/cloudera-scm-agent.log

Note:  The supervisord log file will be placed into the same directory. If the agent is started via the init.d script, /var/log/cloudera-scm-agent/cloudera-scm-agent.out will also receive a small amount of output (from before logging is initialized).

 

 

10. Create the parcel_dir and log_file directories above on all the hosts.
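As a hedged example, assuming the parcel_dir and log_file values shown above (adjust the paths if you configured different locations), run the following on each host:

mkdir -p /home/user/opt/cloudera/parcel
mkdir -p /home/user/var/log/cloudera-scm-agent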

 

      CMF_VAR must be changed in two files for the Cloudera Manager installation: one for the Cloudera Manager Server and one for the Cloudera Manager Agent.

  • Once you configure all the agents on the first host, you can tar and copy the configurations to all the hosts within the cluster.
  • Note: None of the configurations are unique to a host, so you can safely copy the contents to other nodes in the cluster. As before, ensure all directories are created with the appropriate permissions.

Modify the following files:

  • tarball_root/etc/default/cloudera-scm-server 
  • tarball_root/etc/default/cloudera-scm-agent

Note: CMF_VAR is hard-coded to the standard location and must be pointed at a writable directory that is consistent across all servers. Within the CMF_VAR directory, create the following directories (an example follows the list):

  • CMF_VAR/lib
  • CMF_VAR/lib/cloudera-scm-agent/
  • CMF_VAR/log
  • CMF_VAR/log/cloudera-scm-agent
  • CMF_VAR/log/cloudera-scm-server
  • CMF_VAR/run
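For example, assuming CMF_VAR is set to /home/user/var as elsewhere in this document, the directories can be created on each host with:

export CMF_VAR=/home/user/var
mkdir -p $CMF_VAR/lib/cloudera-scm-agent
mkdir -p $CMF_VAR/log/cloudera-scm-agent
mkdir -p $CMF_VAR/log/cloudera-scm-server
mkdir -p $CMF_VAR/run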

 

11. Set the CMF_VAR environment variable to a path that does not live within the tarball root installation directory:

- export CMF_VAR=$CMF_ROOT

+ export CMF_VAR=/home/user/var

 

There are two methods of deploying these changes across the hosts after configuring these parameters on one of the hosts:

Method 1: Tar all the contents of the tarball_root directory and distribute them across all hosts

Method 2: Copy the following files to the tarball_root directory on all hosts (see the example after the file list):

    tarball_root/etc/cloudera-scm-agent/config.ini

    tarball_root/etc/default/cloudera-scm-agent
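As a rough illustration of Method 2, assuming passwordless SSH from the first host and that agent2 and agent3 are placeholder hostnames (replace tarball_root with the real extraction path on your hosts):

for host in agent2 agent3; do
  scp tarball_root/etc/cloudera-scm-agent/config.ini user@$host:tarball_root/etc/cloudera-scm-agent/
  scp tarball_root/etc/default/cloudera-scm-agent user@$host:tarball_root/etc/default/
done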

 

12. Configure Cloudera Manager Server:

Within the tarball of the Cloudera Manager server host, modify the following file:  

    tarball_root/etc/default/cloudera-scm-server

 

13. Configure the CMF_VAR and JDBC driver location:

export CMF_VAR=/home/user/var

export CMF_JDBC_DRIVER_JAR="/path/to/jdbc/driver:/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar"

 

Note:  The Cloudera Manager server file only lives on one host, so it does not need to be copied to the rest of the machines.

 

14. Configure the MySQL JDBC Driver:

The agents need to be configured to use the JDBC driver that is installed on the hosts that interact with MySQL. The configuration is located within the config.ini file:

cloudera_mysql_connector_jar=/usr/share/java/mysql-connector-java.jar

 

15. Uncomment the line above and configure it to the appropriate location.

 

16. Configure the Cloudera Manager Database properties file using the following script:

    tarball_root/share/cmf/schema/scm_prepare_database.sh

 

Syntax for scm_prepare_database.sh

 

scm_prepare_database.sh database-type [options] database-name username password
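As a hedged example, preparing a MySQL database named scm for a database user scm with password scm_password (all placeholder values chosen for illustration):

tarball_root/share/cmf/schema/scm_prepare_database.sh mysql scm scm scm_password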

 

Create Parcel Directories

  1. Configure the parcel repository for the Cloudera Manager Server. The tarball unpacks cloudera/parcel-repo under the tarball root location.
    • If you do not want to use the default location of /opt/cloudera/parcel-repo, log into Cloudera Manager and configure the Local Parcel Repository Path parcel setting.
    • Note:  If the directory does not exist when the cluster is being set up, the installation cannot proceed.
  2. Ensure that the permissions are set appropriately so that the single user has read/write access.
  3. Create the parcel_dir folder you configured on the Cloudera Manager agents listed above: 

mkdir -p /home/user/opt/parcels

mkdir -p /home/user/opt/parcel-cache

 

4. Set the permissions to ensure the single user has read/write access.
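For instance, assuming the single user is named user and the directories above were created by root, ownership can be handed over with:

chown -R user /home/user/opt/parcels /home/user/opt/parcel-cache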

 

Configuration Steps for Users:

 

Before configuring a cluster to run in single user mode, the following steps must be performed on all hosts in the cluster:

 

Note:  This example uses the cloudera-scm user:

  1. Give the single user passwordless sudo access. You must create the user if it does not exist. One way to do this is to add the user to the configured sudoers group by running the command:

usermod -a -G sudo cloudera-scm

  • You can also achieve this by adding a new sudo configuration for the cloudera-scm group. To do this, run the command visudo and then add the following line:

%cloudera-scm ALL=(ALL) NOPASSWD: ALL

Note:  Sudo must be configured so that /usr/sbin is in the path when running sudo. One way to achieve this is by adding the following configuration to sudoers:

 

2. Edit the /etc/sudoers file using the visudo command and add this line to the configuration file:

Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin

 

3. Set up per-user limits for su prior to setting up the agent. Edit /etc/pam.d/su and uncomment the following line:

session required pam_limits.so

Note:  Configure a whitelist if your requirements do not allow passwordless sudo for all commands. You can find this list at the end of this document.

 

In a test environment, you can configure Cloudera Manager with passwordless sudo for all commands and audit the command syntax.

 

You can view the exact syntax of the commands on a system from the /var/log/secure logfile.

 

4. Disable the requiretty setting for this particular user in the sudoers file (edit it with visudo).

5. Configure any non-default single user mode settings before proceeding.

6. For a single user username, create the process limits configuration file at /etc/security/limits.d/username.conf with the following settings:

username soft nofile 32768

username soft nproc 65536

username hard nofile 1048576

username hard nproc unlimited

username soft memlock unlimited

username hard memlock unlimited

 

Note: username is the user that the services are configured to run as.
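As a quick, optional check that the limits are being applied (assuming the cloudera-scm user from the example above and that pam_limits was enabled for su in step 3):

su - cloudera-scm -c 'ulimit -Hn'    # expect 1048576, the hard nofile limit set above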

 

7. Configure JAVA_HOME for the single user if the JDK was not installed from packages.

 

Note: This can be done by setting the JAVA_HOME environment variable in the user's .bash_profile.
Cloudera recommends setting it through Cloudera Manager instead:
Hosts > Configuration > Java Home Directory
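If you do set it in .bash_profile, a minimal sketch follows; the JDK path is a placeholder, so substitute the directory where your JDK is actually installed:

export JAVA_HOME=/home/user/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH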

 

8. There are four paths that must be pre-created for the single user. Provision these on each host in case there is role movement in the future.

Roles that run on Tomcat require some directories to exist in non-configurable paths. The following directories must be created and be writable by the single user (an example follows the list):

  • HDFS (HttpFS role) - /var/lib/hadoop-httpfs
  • Oozie Server - /var/lib/oozie
  • Sqoop 2 Server - /var/lib/sqoop2
  • Solr Server - /var/lib/solr
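A hedged example of provisioning these paths, assuming sudo access and a single user named user:

sudo mkdir -p /var/lib/hadoop-httpfs /var/lib/oozie /var/lib/sqoop2 /var/lib/solr
sudo chown user /var/lib/hadoop-httpfs /var/lib/oozie /var/lib/sqoop2 /var/lib/solr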

 

Configuration Steps to take Before Starting Cloudera Manager Agents and Servers

  1. If you manually install agent packages, before starting the agents, configure them to run as cloudera-scm by editing the init scripts and uncommenting the following lines.
  2. For the server, edit tarball_root/etc/init.d/cloudera-scm-server and configure the following:

CMF_SUDO_CMD=" "
USER="user"
GROUP="user"

3. For each agent, configure the following:

 

Note: The agent file does not have USER defined; add the following above the first CMF_SUDO_CMD definition:

 

tarball_root/etc/init.d/cloudera-scm-agent  

 

USER="user"
GROUP="user"
CMF_SUDO_CMD=" "

Configure the following for the /var/run/cloudera-scm-agent/process directory:

 

4. Change the following path to a path with the proper permissions:

 

install -d -o $CMF_DIR_OWNER -g $CMF_DIR_OWNER /var/run/cloudera-scm-agent

 

Becomes:

 

install -d -o $CMF_DIR_OWNER -g $CMF_DIR_OWNER /home/admin/opt/var/run/cloudera-scm-agent

 

Starting the Cloudera Manager Server and Agents:

  1. Configure the log directory prior to starting.
  2. Call the init scripts to configure the users.

Typically, the RPMs create these directories; with a tarball install you must create them yourself before the services can start.

mkdir -p /home/user/var/log/cloudera-scm-server

mkdir -p /home/user/var/log/cloudera-scm-agent

mkdir -p /home/user/var/lib/cloudera-scm-agent

 

3. Start the Cloudera Manager Server and Agent services:

tarball_root/etc/init.d/cloudera-scm-server start

tarball_root/etc/init.d/cloudera-scm-agent start
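To confirm that both processes started, one option is to tail the logs in the directories created earlier (the paths assume the /home/user/var layout used throughout this document):

tail -f /home/user/var/log/cloudera-scm-server/cloudera-scm-server.log
tail -f /home/user/var/log/cloudera-scm-agent/cloudera-scm-agent.log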

 

4. Log into Cloudera Manager Server:

5. Configure the Management services.

 

Note:  Ensure that you reconfigure any /var/lib and /var/log settings so they point to the appropriate single-user locations.

    a. Navigate to Cluster > Management

    b. Enter, “/var” into the Search field

    c. Navigate to Administration > Settings > Advanced > Single User Mode (Enable)

    d. Navigate to Administration > Settings > Advanced > Single User Mode User

    e. Navigate to Administration > Settings > Advanced > Single User Mode Group

 

6. Copy to Agents on other Hosts:

 

Run the following commands on all of the other agents:

 

mkdir -p /home/user/var

mkdir -p /home/user/opt

mkdir -p /home/user/var/log/cloudera-scm-agent

mkdir -p /home/user/var/lib/cloudera-scm-agent

mkdir -p /home/user/opt/parcels

 

7. Copy the files over or create a tarball of the cm-5.x.x directory you’ve modified, and ship the tarball to the new hosts.

  1. If you go down the file copy path, you will need to copy the following files:

tarball_root/etc/init.d/cloudera-scm-agent
tarball_root/etc/cloudera-scm-agent/config.ini
tarball_root/etc/default/cloudera-scm-agent

 

Adding a Remote Repository

 

You can use Python's SimpleHTTPServer module to create a simple HTTP server so that a cluster without internet access can download the CDH parcels.

  1. Download the appropriate CDH parcel and manifest.json file from http://archive-primary.cloudera.com/cdh5/parcels/
  2. Execute the following steps to start the parcel repo on one of your cluster's hosts:

$ cd /tmp/parcels
$ wget "parcel_URL"
$ wget "manifest.json"
$ python -m SimpleHTTPServer 8080
  3. If everything has been configured appropriately, use the Cloudera Manager wizard to add the management services.
  4. The management services require that the following directory be created for the Event Server's Solr index:

mkdir -p /home/user/var/lib/cloudera-scm-eventserver

 

Note: You may alter the configuration as needed, but this directory needs to be created before the management services are started.

 

5. Once the management services are configured, use the wizard to create the following services in this order:

 

    a. ZooKeeper
    b. HDFS
    c. YARN

 

The rest of the services depend on these core technologies. Ensure that Single User Mode is configured. The wizard prompts the user to review all the configs to ensure the appropriate settings are in place before starting and initializing the services.

 

Sudo Whitelist:

Grouped by functional area, each entry below lists a command and its arguments delimited by commas, where the first element is the command and the subsequent ones are its arguments.

 

Quoted strings are static literals, whereas an unquoted name such as path designates a variable that can take on many different values.

 

The cluster stat commands are not necessary for functionality; they are used only for debugging issues with cluster diagnostic bundles.

  1. cgroups

"mount", "-t", "cgroup", "-o", subsys, "cm_cgroups", path "chown", "-R", user:group, path "umount", path "cat", "/proc/cgroups" 

 

2. Client configuration

"update-alternatives", "-admindir", db_path, "altdir", alt_links_path, "-display", alt_name "update-alternatives", "--remove", alt_name, alt_path "cp", "-a", to_copy, dest_file + ".cm_tmp" "mv", "-Tf", dest_file + ".cm_tmp", dest_file "mkdir", "-p", root_dir_name "cp", "-a", special_file, conf_dir + "/" + directory_name "rm", "-rf", dest_path "cp", "-a", conf_dir + "/" + directory_name, dest_path "chown", deployed_file_user, dest_path "chmod", "-R", "ugo+r", dest_path "update-alternatives", test_ua_flags, "--install", alt_link, alt_name, dest_path, priority "update-alternatives", test_ua_flags, "--auto", alt_name "update-alternatives", "--remove", name, source "mkdir", "-p", dest_dirname "update-alternatives", "--install", dest, name, source, priority

 

3. parcel

"chmod", permissions, path "chown", user_group, path

 

4. tmpfs

"mount", "-t", "tmpfs", "cm_processes", path "mount", "-t", "tmpfs", "cm_processes", path, "-o", "mode=0751" "umount", path 

 

5. Special handling for taskcontroller.cfg, container-executor.cfg (secure cluster or when explicitly configured in mr1/yarn)

"mkdir", "-p", "-m", "755", deploy_dir "cp", "-p", orig_file, dest_file "chown", user:group, f "chmod", permissions, f 

 

6. Cluster stats

Note:  Only a subset of these really need sudo but currently they're all done under sudo.

 

"df", "-k" "df", "-i" "cat", "/proc/cpuinfo" "cat", "/proc/meminfo" "cat", "/proc/interrupts" "cat", "/proc/mounts" "cat", "/proc/swaps" "cat", "/proc/diskstats" "cat", "/proc/partitions" "cat", "/proc/vmstat" "uptime" "cat", "/etc/security/limits.conf" "ifconfig", "-a" "ethtool", "eth0" "ethtool", "-S", "eth0" "lsmod" "chkconfig", "--list" "lspci" "lscpu" "lshw" "lsb_release", "-a" "java", "-version" "uname", "-a" "uname", "-r" "cat", "/etc/issue" "cat", "/etc/redhat-release" "cat", "/etc/hosts" "cat", "/etc/resolv.conf" "cat", "/etc/nsswitch.conf" "/sbin/sysctl", "-A" "service", "--status-all" "dmidecode" "rpm", "-qa" "dpkg", "-l" "vmstat" "curl", "-m", "1", "http://169.254.169.254/2011-01-01/meta-data/instance-type" "lsof", "-n", "-P" "ps", "aux" "cat", "/etc/host.conf" "ls", "/etc/yum.repos.d" "cat", "/etc/apt/sources.list" "zypper", "repos", "-d" "lvdisplay" "dmesg" "cat", "/etc/sysconfig/selinux" "cat", "/etc/suse-release" "sar", "-A" "cat", "/proc/sys/vm/swappiness" "iptables", "-L", "-v", "-n" "top", "-b", "-n 1" "cat", "/etc/sysconfig/network-scripts/ifcfg-eth0" "cat", "/etc/sysconfig/network/ifcfg-eth0" "cat", "/etc/sysconfig/network" "cat", "/etc/hostname" "hostname", "--fqdn" "netstat", "-s" "grep", "-r", ".", "/sys/kernel/mm" "bash", "-c", "for x in /etc/security/limits.d/*; do echo \"file $x:\"; cat $x; echo; echo; done" "bash", "-c", "PATH=/usr/bin:/usr/sbin:\"$PATH\"; " + "for x in /etc/alternatives/*; do " + " echo $x; " + " update-alternatives --display $(basename $x); " + " echo -------------; " + "done" "cat", "/var/log/messages" "cat", "/var/log/kern.log" "date" "ntpstat" "ntpq", "-pn" "cat", "/etc/krb5.conf" "cat", "/var/kerberos/krb5kdc/kdc.conf" "cat", "/var/kerberos/krb5kdc/kadm5.acl" "python", "-c", "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())" "nslookup", "-query=any", CM_HOST_NAME "dig", "any", CM_HOST_NAME "host", "-v", "-t", "A", CM_HOST_NAME

Disclaimer: The information contained in this article was generated by third parties and not by Cloudera or its personnel. Cloudera cannot guarantee its accuracy or efficacy. Cloudera disclaims all warranties of any kind and users of this information assume all risk associated with it and with following the advice or directions contained herein. By visiting this page, you agree to be bound by the Terms and Conditions of Site Usage, including all disclaimers and limitations contained therein.