Created on 08-06-2015 08:14 AM - edited 09-16-2022 02:37 AM
Hi, I have just noticed that my cloudera-quickstart-vm-5.3.0-0-virtualbox image is not running properly: HDFS and other components are offline.
I cannot list any files from the terminal, Hue is not working, and so on.
[cloudera@quickstart labfiles]$ service --status-all | grep FAILED
Flume NG agent is not running [FAILED]
Hadoop namenode is dead and pid file exists [FAILED]
Hadoop historyserver is dead and pid file exists [FAILED]
Hadoop proxyserver is dead and pid file exists [FAILED]
HBase master daemon is dead and pid file exists [FAILED]
HBase Solr Indexer is not running [FAILED]
Hive Metastore is dead and pid file exists [FAILED]
Hive Server2 is dead and pid file exists [FAILED]
Impala Catalog Server is dead and pid file exists [FAILED]
Impala Server is dead and pid file exists [FAILED]
/etc/init.d/kdump: line 48: /var/lock/kdump: Permission denied
Sentry DB Store Service is dead and pid file exists [FAILED]
Spark history-server is dead and pid file exists [FAILED]
Spark master is dead and pid file exists [FAILED]
Spark worker is dead and pid file exists [FAILED]
Sqoop Server is dead and pid file exists [FAILED]
/etc/init.d/sshd: line 33: /etc/sysconfig/sshd: Permission denied
[cloudera@quickstart labfiles]$
I noticed this behaviour after updating the CentOS system. I do not want to restart the VirtualBox image, so is there a script that launches the components again in order?
Thank you.
Created 08-06-2015 09:15 AM
I am going to update this thread so that other people with the same problem know what to do.
I shut down the image with the sudo halt command, then changed the image's configuration to 2 cores and 8 GB of RAM instead of 1 core and 4 GB (the default options).
After the restart, I am not able to log in to Hue, and the services look offline:
[cloudera@quickstart ~]$ service --status-all | grep FAILED
Flume NG agent is not running [FAILED]
Hadoop datanode is not running [FAILED]
Hadoop journalnode is not running [FAILED]
Hadoop namenode is not running [FAILED]
Hadoop secondarynamenode is not running [FAILED]
Hadoop httpfs is not running [FAILED]
Hadoop historyserver is not running [FAILED]
Hadoop nodemanager is not running [FAILED]
Hadoop proxyserver is not running [FAILED]
Hadoop resourcemanager is not running [FAILED]
HBase master daemon is not running [FAILED]
HBase rest daemon is not running [FAILED]
HBase Solr Indexer is not running [FAILED]
HBase thrift daemon is not running [FAILED]
Hive Metastore is not running [FAILED]
Hive Server2 is not running [FAILED]
Impala Catalog Server is not running [FAILED]
Impala Server is not running [FAILED]
Impala State Store Server is not running [FAILED]
/etc/init.d/kdump: line 48: /var/lock/kdump: Permission denied
Sentry DB Store Service is not running [FAILED]
Solr server daemon agent is not running [FAILED]
Spark history-server is not running [FAILED]
Spark master is not running [FAILED]
Spark worker is not running [FAILED]
Sqoop Server is not running [FAILED]
/etc/init.d/sshd: line 33: /etc/sysconfig/sshd: Permission denied
[cloudera@quickstart ~]$ pwd
/home/cloudera
[cloudera@quickstart ~]$ ls
cloudera-manager  Downloads     Music                     Templates
cm_api.sh         eclipse       notes from spark course   Videos
datasets          labfiles      notes from spark course~  workspace
Desktop           labfiles.zip  Pictures
Documents         lib           Public
[cloudera@quickstart ~]$ ./cloudera-manager --force
You must run this script as root. Try 'sudo ./cloudera-manager '.
[cloudera@quickstart ~]$ sudo ./cloudera-manager --force
[QuickStart] Shutting down CDH services via init scripts...
[QuickStart] Disabling CDH services on boot...
[QuickStart] Starting Cloudera Manager services...
[QuickStart] Deploying client configuration...
[QuickStart] Starting CM Management services...
[QuickStart] Enabling CM services on boot...
[QuickStart] Starting CDH services...
________________________________________________________________________________
Success! You can now log into Cloudera Manager from the QuickStart VM's browser:
http://quickstart.cloudera:7180
Username: cloudera
Password: cloudera
[cloudera@quickstart ~]$
As you can see, the script located on the desktop tries to launch every component, but I still see errors when I run the service --status-all command:
[cloudera@quickstart ~]$ sudo service --status-all | grep FAILED
Flume NG agent is not running [FAILED]
Hadoop datanode is not running [FAILED]
Hadoop journalnode is not running [FAILED]
Hadoop namenode is not running [FAILED]
Hadoop secondarynamenode is not running [FAILED]
Hadoop httpfs is not running [FAILED]
Hadoop historyserver is not running [FAILED]
Hadoop nodemanager is not running [FAILED]
Hadoop proxyserver is not running [FAILED]
Hadoop resourcemanager is not running [FAILED]
HBase master daemon is not running [FAILED]
HBase rest daemon is not running [FAILED]
HBase Solr Indexer is not running [FAILED]
HBase thrift daemon is not running [FAILED]
Hive Metastore is not running [FAILED]
Hive Server2 is not running [FAILED]
Impala Catalog Server is not running [FAILED]
Impala Server is not running [FAILED]
Impala State Store Server is not running [FAILED]
Sentry DB Store Service is not running [FAILED]
Solr server daemon agent is not running [FAILED]
Spark history-server is not running [FAILED]
Spark master is not running
When I try to use Hue to examine the data I wrote yesterday, I get this:
Cannot access: /user/cloudera. The HDFS REST service is not available. Note: You are a Hue admin but not a HDFS superuser (which is "hdfs").
('Connection aborted.', error(111, 'Connection refused'))
[06/Aug/2015 09:02:40 -0700] webhdfs ERROR Failed to determine superuser of WebHdfs at http://quickstart.cloudera:50070/webhdfs/v1: ('Connection aborted.', error(111, 'Connection refused'))
Traceback (most recent call last):
  File "/usr/lib/hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py", line 149, in superuser
    sb = self.stats('/')
  File "/usr/lib/hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py", line 236, in stats
    res = self._stats(path)
  File "/usr/lib/hue/desktop/libs/hadoop/src/hadoop/fs/webhdfs.py", line 230, in _stats
    raise ex
WebHdfsException: ('Connection aborted.', error(111, 'Connection refused'))
The output of the terminal commands looks like this:
[cloudera@quickstart ~]$ hdfs dfs -ls /
ls: Call From quickstart.cloudera/127.0.0.1 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[cloudera@quickstart ~]$ hadoop fs -ls /
ls: Call From quickstart.cloudera/127.0.0.1 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
[cloudera@quickstart ~]$ hadoop fs -ls /
ls: Call From quickstart.cloudera/127.0.0.1 to quickstart.cloudera:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
The Configuration Issues page in Cloudera Manager tells me this:
All Configuration Issues
Cloudera QuickStart
zookeeper: Service zookeeper has 1 Server. Cloudera suggests at least 3 Servers for ZooKeeper.
hdfs: Service hdfs has 1 DataNode. Cloudera suggests at least 3 DataNodes for HDFS.
quickstart.cloudera: Memory Overcommit Validation Threshold
Memory on host quickstart.cloudera is overcommitted. The total memory allocation is 68.7 GiB bytes but there are only 7.7 GiB bytes of RAM (1.5 GiB bytes of which are reserved for the system). Visit the Resources tab on the Host page for allocation details. Reconfigure the roles on the host to lower the overall memory allocation. Note: Java maximum heap sizes are multiplied by 1.3 to approximate JVM overhead.
I do not understand this. This single-node cluster worked well the first time with 1 core and 4 GB of RAM: I was able to run some Spark jobs and access HDFS. Now, with 2 cores and 8 GB, I can't?
Please give me some advice.
Alonso
Created 08-06-2015 09:22 AM
I don't know why the services died when you updated CentOS. I'd be curious to know what errors are in the various log files under /var/log if you simply updated CentOS packages. Restarting the machine is the easiest way to automatically restart everything in order. However, in /etc/rc5.d you can find symlinks created by the init system that are used to kill and start the services in order on shutdown and startup, respectively. If you look in that directory for symlinks starting with S, ordered by the number following the S, you will see the complete list of services in start order. But again, restarting the OS is easiest if you want to restart EVERYTHING, especially if it was some other service being updated that caused the failure in the first place.
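To make the ordering concrete, here is a minimal sketch that prints the start links of a runlevel directory in order. The function name list_start_order is made up for illustration, and it assumes the usual SysV naming of S, a two-digit sequence number, then the service name.

```shell
#!/bin/sh
# Hypothetical helper: print "<sequence> <service>" for every start link
# in a SysV runlevel directory (default: /etc/rc5.d), in start order.
list_start_order() {
    dir="${1:-/etc/rc5.d}"
    # Start links are named S<NN><service>. The shell expands the glob in
    # sorted order, which matches start order for two-digit prefixes.
    for link in "$dir"/S[0-9][0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}
        # Split "S85hadoop-foo" into "85 hadoop-foo".
        printf '%s\n' "$name" | sed 's/^S\([0-9][0-9]\)\(.*\)$/\1 \2/'
    done
}
```

Running list_start_order with no argument reads /etc/rc5.d. It only lists the order and does not start anything, so it is safe to run before deciding whether to start services by hand or simply reboot.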
However, the bigger issue is going to be that you're no longer using Linux service management. I see in the logs you ran this:
sudo ./cloudera-manager --force
The --force flag bypasses all the safety checks that you have enough resources. If you ran this before and continued to run with 4 GB of RAM, then that is why you've been unable to access services. However, this script attempts to gracefully shut down the existing CDH services through the Linux service scripts, so you shouldn't have seen all the errors about the service being dead but the pid file existing.
So now that you have launched Cloudera Manager, you should check the status of the services in Cloudera Manager's portal instead of using the service scripts. It will also take several minutes after a reboot before all services are running - Hue is among the last to be started by CM.
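If you would rather wait from the terminal than keep retrying commands, something like the following rough sketch can poll a port until it accepts connections. The function names port_open and wait_for_port are invented for this example, and it relies on bash's /dev/tcp redirection, so it assumes bash rather than plain sh. Host and port come from the errors above (quickstart.cloudera:8020 for the NameNode).

```shell
#!/bin/bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
port_open() {
    # /dev/tcp/<host>/<port> is a bash feature. The subshell keeps the
    # file descriptor from leaking and contains the failure if it is refused.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Hypothetical helper: poll every 5 seconds until host:port accepts
# connections, giving up after the timeout in seconds (default 300).
wait_for_port() {
    local host=$1 port=$2 timeout=${3:-300} waited=0
    until port_open "$host" "$port"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 5
        waited=$((waited + 5))
    done
}
```

For example, wait_for_port quickstart.cloudera 8020 600 && hdfs dfs -ls / would hold off the listing until the NameNode port answers or ten minutes pass. An open port only shows that the process is listening, so Cloudera Manager's portal remains the place to check actual service health.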
Created 08-06-2015 09:27 AM
As for the warnings in CM, most of them are CM warning you that you only have 1 node. Now, of course that's true because you're on a single VM, but Cloudera Manager will still warn you because that means you have no redundancy. But do keep this in mind - since this is all designed to be a distributed system, rebooting the VM is equivalent to resetting your entire datacenter - it'll take a minute for the services to all recover and be ready to service requests.
With 8 GB of RAM most things should still work. We recommend having more if you can, because CM and the rest of the system will struggle if you start running larger jobs on the VM. The memory tuning in CM is significantly better in the 5.4 VM, which should help prevent some issues, so you may consider trying the newer version if it continues to struggle.
Created 08-07-2015 03:05 AM
OK, I am going to download the cloudera-quickstart-vm-5.4.2-0-virtualbox image.
Thank you very much, folks.
Alonso