@PeterLuo,
The "config.zip" error can be ignored. It is expected and is caused by a cosmetic bug that is fixed in Cloudera Manager 5.13 and later.
The first thing we need to know is how you determined that the Region Server did not start. What do you see when you try to start HBase?
Also, is it just one Region Server or are other HBase roles also not starting?
The best place to start troubleshooting a role start initiated from Cloudera Manager is to review:
- the agent log on the host where the Region Server failed to start
- the stderr.log and stdout.log files for the process, which give clues about any problems the supervisor had starting it
Here is the general process of how a service starts:
- You click start in CM
- CM tells the agent to heartbeat
- the agent sends a heartbeat to CM
- CM replies with a heartbeat response
- Agent compares what it has running with what CM says should be running (and decides what to do to match what CM says)
- Agent retrieves the files necessary to start the process from CM and lays down the files
- Agent signals the supervisor process
- Supervisor checks to see if processes need to stop/start
- If starting, the supervisor will execute CM shell scripts to start the process
- Once the shell script completes, the process runs as a child process of the supervisor.
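The comparison step in the list above (the agent checking what it has running against what CM says should be running) can be pictured with a small sketch. This is only an illustration of the idea, not the agent's actual code; the function name `reconcile` and the role names are hypothetical.

```python
def reconcile(running, desired):
    """Return (to_start, to_stop) so that 'running' ends up matching 'desired'.

    Hypothetical helper for illustration only; the real agent logic is
    more involved and is not reproduced here.
    """
    running, desired = set(running), set(desired)
    to_start = sorted(desired - running)  # CM wants it, but it is not running
    to_stop = sorted(running - desired)   # it is running, but CM no longer wants it
    return to_start, to_stop


if __name__ == "__main__":
    # Made-up role names, purely as an example.
    running = {"hbase-MASTER"}
    desired = {"hbase-MASTER", "hbase-REGIONSERVER"}
    to_start, to_stop = reconcile(running, desired)
    print("start:", to_start)
    print("stop:", to_stop)
```

If a role never shows up in the "to start" set, the problem is on the heartbeat side (agent log); if it does but the process dies, the problem is on the supervisor side (stderr.log and stdout.log).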
Hopefully that helps clarify the process so you can start troubleshooting.
The process's stdout.log and stderr.log files (in the logs directory under the process directory) are a good place to start.
You can also view them in Cloudera Manager by going to the role's status page and clicking the "Log Files" drop-down.
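If you would rather look at those files directly on the host, something like the following sketch may help. It assumes the agent keeps process directories under /var/run/cloudera-scm-agent/process and that the directory name contains the role type (for example "REGIONSERVER"); both are assumptions about a default install, so adjust them for your environment. Reading these directories usually requires root.

```python
import os
from collections import deque

# Assumed default location of the agent's process directories; adjust if needed.
PROCESS_ROOT = "/var/run/cloudera-scm-agent/process"


def latest_process_dir(root, role_marker):
    """Return the most recently modified directory under root whose name
    contains role_marker, or None if there is no such directory."""
    if not os.path.isdir(root):
        return None
    candidates = []
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if role_marker in name and os.path.isdir(path):
            candidates.append(path)
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def tail(path, lines=40):
    """Return the last 'lines' lines of a text file as a list of strings."""
    with open(path, "r", errors="replace") as fh:
        return list(deque(fh, maxlen=lines))


def show_role_logs(root=PROCESS_ROOT, role_marker="REGIONSERVER", lines=40):
    """Print the tail of stderr.log and stdout.log for the newest matching role."""
    proc_dir = latest_process_dir(root, role_marker)
    if proc_dir is None:
        print(f"No process directory containing {role_marker!r} under {root}")
        return
    for log_name in ("stderr.log", "stdout.log"):
        log_path = os.path.join(proc_dir, "logs", log_name)
        print(f"==> {log_path}")
        if os.path.isfile(log_path):
            print("".join(tail(log_path, lines)), end="")
        else:
            print("(not found)")


if __name__ == "__main__":
    show_role_logs()
```

Run it as root on the host where the Region Server failed to start, and post the output here if the cause is not obvious.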