
Wrong runlevels of CDH packages for use with Cloudera Manager (according to host inspector)

Expert Contributor

Hello community,

 

One question regarding a Cloudera installation with Cloudera Manager and CDH packages (Path B) on Red Hat 6.5:

 

After installing all the required packages with the command

 

sudo yum install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-history-server spark-python sqoop sqoop2 whirr

the host inspector reports problems on those hosts concerning the runlevels of the services:

In essence, oozie and hadoop-httpfs have active init scripts on runlevels 2-5.
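
For reference, this can be checked directly on the affected hosts with chkconfig (illustrative check, assuming RHEL 6):

chkconfig --list oozie
chkconfig --list hadoop-httpfs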

 

[screenshot: host inspector warning about service runlevels]

 

 

Is this an actual problem for my cluster?

And why do the Cloudera packages, which are only used in this context, have an incorrect runlevel configuration?

Might there also be incorrect runlevels for other services?

 

Thanks for your support.

 

Best regards,

Benjamin

8 REPLIES

Expert Contributor

One addition to that: I searched for other CDH services with init scripts enabled for certain runlevels and found:

  • hadoop-hdfs-nfs3
  • spark-history-server 
  • spark-master
  • spark-worker

For all of those, init scripts for runlevels 3-5 are active.

 

After a reboot, some of them actually started up (although the node had not yet been started from CM):

 

Checking runlevels and status for service: hadoop-hdfs-nfs3
Hadoop HDFS NFS v3 service is running                      [  OK  ]
hadoop-hdfs-nfs3        0:off   1:off   2:off   3:on    4:on    5:on    6:off

Checking runlevels and status for service: spark-history-server
Spark history-server is dead and pid file exists           [FAILED]
spark-history-server    0:off   1:off   2:off   3:on    4:on    5:on    6:off

Checking runlevels and status for service: spark-master
Spark master is running                                    [  OK  ]
spark-master    0:off   1:off   2:off   3:on    4:on    5:on    6:off

Checking runlevels and status for service: spark-worker
Spark worker is dead and pid file exists                   [FAILED]
spark-worker    0:off   1:off   2:off   3:on    4:on    5:on    6:off
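
To get back to a clean state until CM manages these roles, something along these lines should work (just a sketch for RHEL 6, using the four service names above):

for i in hadoop-hdfs-nfs3 spark-history-server spark-master spark-worker; do
    sudo service $i stop      # stop anything that came up at boot
    sudo chkconfig $i off     # disable the init script on all runlevels
done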

 

 

Are you aware of these issues with the packages?

BTW: the output above was created with the following script:

#!/bin/bash

for i in hadoop-hdfs-nfs3 spark-history-server spark-master spark-worker; do
    echo "Checking runlevels and status for service: $i"
    service $i status
    chkconfig --list | grep $i
    echo ""
done

 

Explorer

Hey Ben

What I did (on Ubuntu) was to look for the Hadoop services in /etc/init.d and then turn them off using update-rc.d. Can't you pipe the output of your script through grep and then run chkconfig --level 2345 <service> off?
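
Something along these lines, for example (just a rough sketch; the grep pattern is only illustrative, so adjust it to the packages actually installed on your hosts):

for svc in $(ls /etc/init.d | grep -E 'hadoop|spark|oozie|httpfs|hue|impala|hbase|hive|solr|sqoop|flume|zookeeper'); do
    sudo chkconfig $svc off
done
chkconfig --list | grep -E 'hadoop|spark|oozie'    # verify nothing relevant is still ":on"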

Expert Contributor

Hey vkurien,

 

Thanks for your reply. I did as you suggested and turned the services off with

 

for service in avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-history-server spark-python sqoop sqoop2 whirr; do sudo chkconfig $service off; done

I'm just surprised that I followed installation Path B exactly and it does not work out as expected.

Can somebody please confirm whether this is intended behaviour or some kind of bug?

Explorer

BTW, if you are installing from a base VM image, even more has to be done, such as removing UUIDs.

Expert Contributor

Hi vkurien, could you please elaborate on that? By UUIDs, do you refer to the Cloudera Manager Agent UUID that has to be unique per node?

Explorer

Yes, and so if you start from a "golden" VM image, make sure that the UUID is deleted from the image. I'm more paranoid so my scripts turn things off, make sure that the UUID is gone, then start the agents and finally the server.

Expert Contributor

Thanks for those insights. I recently ran into trouble because I had the same UUID on all nodes, so I learned that the hard way 😉

Just for completeness: the CM Agent UUID is stored in /var/lib/cloudera-scm-agent/uuid
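
For anyone automating this, a minimal sketch of resetting the agent identity in a "golden" image (assuming the agent is already installed and using the path above) would be:

sudo service cloudera-scm-agent stop
sudo rm -f /var/lib/cloudera-scm-agent/uuid    # a fresh UUID is generated on the next agent start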

 

Do you have additional hints along these lines for automating cluster deployment using CM and custom scripting?

Explorer

I'm making a foray into the API soon, so I'm sure I'll know more then.

 

  1. Ensure /tmp has noexec off. This is an obvious security issue, but sometimes one just wants a working cluster (see the sketch after this list).
  2. Trying to install downlevel versions of CDH is mostly a disaster on Ubuntu, since the repo lists are wrong and the installation instructions (Path B) happily download the wrong repo list on the agent machines and use it. This results in mysterious failures later.
  3. Does anyone actually test the instructions - likely not!
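
Regarding point 1, a quick check and fix might look like this (sketch only; making it permanent means adjusting the corresponding /etc/fstab entry):

if mount | grep ' /tmp ' | grep -q noexec; then
    sudo mount -o remount,exec /tmp    # remove the noexec restriction for the current boot
fi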