
Add services stuck at "Preparing to Deploy: 5 of 11 tasks completed" forever

Explorer

I am adding services in Ambari. Everything works fine up to a point, after which the wizard stays at "Preparing to Deploy: 5 of 11 tasks completed" forever. ambari-server.log shows no obvious errors.

I'm sure the hosts file, iptables, SELinux and time synchronization are all set up correctly, because other services were installed successfully.
By the way, there were errors in hosts and in the hostnames of some nodes, but those are fixed now.

So could the problem be in the database?

9 REPLIES

Explorer

@Jay Kumar SenSharma

Can you help me? This question has been bothering me for a long time.

Super Mentor

@lei lin

Can you please let us know the following:

1. What is the exact Ambari version?

2. Do you see any error/warning in your "/var/log/ambari-server/ambari-server.log"?

3. Do you see any error/warning in the "/var/log/ambari-agent/ambari-agent.log" of any of the cluster hosts?

4. Do you see any error in the Ambari UI when you open the browser Developer Tools?

For example, in Google Chrome: click "More Tools" --> Developer Tools --> Console tab.

Then, in that browser, open the Ambari URL that is hanging.

Do you notice any error in the browser debugger console?
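For checks 2 and 3 above, a small helper can pull only the recent error and warning lines out of a log. This is a minimal sketch: the function name `recent_problems` and the default of 20 lines are illustrative choices, not part of Ambari.

```shell
#!/usr/bin/env bash
# Print the most recent ERROR/WARN lines from a log file.
# Usage: recent_problems <logfile> [max_lines]
# (recent_problems is a hypothetical helper name, not an Ambari tool.)
recent_problems() {
    local logfile="$1"
    local max_lines="${2:-20}"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    # -E enables alternation; the WARN pattern also matches WARNING,
    # which is the spelling the agent log uses.
    grep -E 'ERROR|WARN' "$logfile" | tail -n "$max_lines"
}
```

On the Ambari server this could be called as `recent_problems /var/log/ambari-server/ambari-server.log`, and on each agent host as `recent_problems /var/log/ambari-agent/ambari-agent.log`.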

Super Mentor

@lei lin

Also, in the phase you mention, Ambari should be trying to deploy some yum packages. So can you please check that there is no error in "/var/log/yum.log" and no issue at the yum proxy level?

Can you also run yum clean all once on all the hosts?

 # yum clean all

Explorer

I ran yum clean all and yum repolist; yum is working fine.

Explorer

1. The ambari-server version is 2.4.1.0-22; the HDP version is HDP-2.5.

2. I executed ambari-server stop and ambari-server start, and found some errors in /var/log/ambari-server/ambari-server.log:

14 Mar 2018 14:06:11,988 ERROR [qtp-ambari-agent-64] HeartBeatHandler:200 - CurrentResponseId unknown for worker7.hadoop.ds.com - send register command

Other hosts in the cluster show the same error, including worker6.hadoop.com, worker5.hadoop.com and worker4.hadoop.com.

3. After restarting ambari-server, the log on another host (/var/log/ambari-agent/ambari-agent.log) shows:

ERROR 2018-03-14 09:16:33,391 Controller.py:415 - Connection to master1.hadoop.ds.com was lost (details=Request to https://master1.hadoop.ds.com:8441/agent/v1/heartbeat/edge1.hadoop.ds.com failed due to Error occured during connecting to the server: )
INFO 2018-03-14 09:17:10,309 NetUtil.py:62 - Connecting to https://master1.hadoop.ds.com:8440/connection_info
WARNING 2018-03-14 09:17:10,310 NetUtil.py:93 - Failed to connect to https://master1.hadoop.ds.com:8440/connection_info due to [Errno 111] Connection refused
INFO 2018-03-14 09:17:10,310 security.py:100 - SSL Connect being called.. connecting to the server
ERROR 2018-03-14 09:17:10,310 Controller.py:415 - Unable to reconnect to https://master1.hadoop.ds.com:8441/agent/v1/heartbeat/edge1.hadoop.ds.com (attempts=1, details=Request to https://master1.hadoop.ds.com:8441/agent/v1/heartbeat/edge1.hadoop.ds.com failed due to [Errno 111] Connection refused)
INFO 2018-03-14 09:17:24,215 NetUtil.py:62 - Connecting to https://master1.hadoop.ds.com:8440/connection_info
WARNING 2018-03-14 09:17:24,215 NetUtil.py:93 - Failed to connect to https://master1.hadoop.ds.com:8440/connection_info due to [Errno 111] Connection refused
INFO 2018-03-14 09:17:24,216 security.py:100 - SSL Connect being called.. connecting to the server
ERROR 2018-03-14 09:17:24,216 Controller.py:415 - Unable to reconnect to https://master1.hadoop.ds.com:8441/agent/v1/heartbeat/edge1.hadoop.ds.com (attempts=2, details=Request to https://master1.hadoop.ds.com:8441/agent/v1/heartbeat/edge1.hadoop.ds.com failed due to [Errno 111] Connection refused)
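The log above shows "Connection refused" on both port 8440 and port 8441 of master1.hadoop.ds.com. A rough way to see from an agent host whether those ports currently accept TCP connections is sketched below. It assumes bash with `/dev/tcp` redirection support and the coreutils `timeout` command; the function names `port_open` and `check_ports` and the 5 second limit are illustrative only.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 5 seconds.
# Relies on bash's /dev/tcp pseudo-device (an assumption about the shell build).
port_open() {
    local host="$1" port="$2"
    timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Print one status line per port for the given host.
# Usage: check_ports <host> <port> [<port> ...]
check_ports() {
    local host="$1"; shift
    local port
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port unreachable"
        fi
    done
}
```

For the hosts in this thread the call would be `check_ports master1.hadoop.ds.com 8440 8441`. An "unreachable" result only says that the TCP connection failed or timed out; it does not distinguish a stopped ambari-server from a firewall rule.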

Explorer

Firefox:

When it is stuck at "Preparing to Deploy: 5 of 11 tasks completed", pressing F12 and opening the Network tab shows this:

[Screenshot: 62908-f12.jpg]

Super Mentor

@lei lin

As the error says, an established connection was lost (which indicates a temporary network issue):

Connection to master1.hadoop.ds.com was lost (details=Request to https://master1.hadoop.ds.com:8441/agent/v1/heartbeat/edge1.hadoop.ds.com failed due to Error occured during connecting to the server: )

This means there might be a temporary network/firewall issue (packet drops), or the network is not stable, so the connection was lost while Ambari was trying to perform the "Add Service" operation.

It may have been a temporary network issue that is fixed by now, so can you please try cancelling the "Add Service" wizard and then re-running it to see if it works?

To verify whether the Ambari agents are actually able to communicate with the Ambari server on the SSL port, you can run the following command from the agent machines a couple of times to see if the communication is OK.

# openssl s_client -connect master1.hadoop.ds.com:8441 | grep CONNECTED 
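Since the suggestion is to run the check a couple of times, the command can be wrapped in a loop that counts how often `openssl s_client` reports CONNECTED. This is only a sketch: the function name `probe_tls`, the default of 5 attempts, the 2 second pause and the 10 second `timeout` are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Run "openssl s_client" several times and report how many attempts
# printed CONNECTED (i.e. the connection to host:port was established).
# Usage: probe_tls <host> <port> [attempts] [pause_seconds]
probe_tls() {
    local host="$1" port="$2" attempts="${3:-5}" pause="${4:-2}"
    local ok=0 i
    for ((i = 1; i <= attempts; i++)); do
        # </dev/null closes stdin so s_client exits instead of waiting
        # for input; timeout guards against a connection that hangs.
        if timeout 10 openssl s_client -connect "$host:$port" </dev/null 2>/dev/null \
                | grep -q CONNECTED; then
            ok=$((ok + 1))
        fi
        if (( i < attempts )); then
            sleep "$pause"
        fi
    done
    echo "$ok/$attempts connections succeeded"
}
```

From an agent host, `probe_tls master1.hadoop.ds.com 8441` would then print a single summary line. A count below the number of attempts would point towards an intermittent network problem rather than a permanent one.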


Explorer

It should not be a temporary problem, because it has been going on for a long time.
I ran "openssl s_client -connect master1.hadoop.ds.com:8441 | grep CONNECTED" and the output is normal:
CONNECTED(00000003)

As you said in other questions, I should check that the state in the "host_version" and "host" tables is "CURRENT". I really don't understand this.

Explorer

@Geoffrey Shelton Okot

Can you help me? I have checked that all the configurations are correct.
