Explorer
Posts: 8
Registered: ‎05-21-2018
Accepted Solution

Cluster install with Director hangs on KAFKA parcel


I am trying to deploy a cluster in AWS using Cloudera Director. It appears to go smoothly right up until the end, then stalls with 579 / 597 steps completed.

 

GUI indicates:

"Distributing parcels: KAFKA-3.0.0-1.3.0.0.p0.40,CDH-5.15.0-1.cdh5.15.0.p0.21"

 

 

The cloudera-director-server log repeats the following over and over:

51.538 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:50:53.552 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:50:55.565 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:50:57.579 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:50:59.595 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:01.609 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:03.624 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:05.675 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:07.689 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:09.704 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:11.718 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:13.732 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}
[2018-06-21 14:51:15.745 +0000] INFO  [p-c86a21fc7a0e-DefaultBootstrapClusterJob] 1b43a3bf-7aa4-4d09-bf79-ed5f30552520 POST /api/v12/import com.cloudera.launchpad.bootstrap.cluster.UnboundedWaitForParcelStage - c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (KAFKA, 3.0.0-1.3.0.0.p0.40) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=1600, count=0, countTotal=4, warnings=null, errors=null}

 

Edit: this doesn't appear to be a KAFKA-specific issue. I tried another cluster without KAFKA, and this time it hangs on a CDH parcel:

c.c.l.b.c.UnboundedWaitForParcelStage: Waiting for parcel (CDH, 5.15.0-1.cdh5.15.0.p0.21) stage DISTRIBUTED. Current: DISTRIBUTING Past: [DISTRIBUTING, DISTRIBUTING, DISTRIBUTING]. State ApiParcelState{progress=0, progressTotal=4900, count=0, countTotal=7, warnings=null, errors=null}
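For anyone who wants to spot this condition automatically rather than by eye, here is a minimal sketch that parses the wait messages above and flags a parcel whose progress never moves. The function names (`parse_parcel_state`, `looks_stalled`) and the three-sample threshold are my own choices for illustration, not part of Director, and the regular expression is written only against the message format shown in this post.

```python
import re

# Pattern written against the UnboundedWaitForParcelStage message shown above.
STATE_RE = re.compile(
    r"Waiting for parcel \((?P<product>[^,]+), (?P<version>[^)]+)\) "
    r"stage (?P<target>\w+)\. Current: (?P<current>\w+).*?"
    r"progress=(?P<progress>\d+), progressTotal=(?P<total>\d+), "
    r"count=(?P<count>\d+), countTotal=(?P<count_total>\d+)"
)


def parse_parcel_state(line):
    """Return the parcel fields from one log line, or None if it does not match."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    state = match.groupdict()
    for key in ("progress", "total", "count", "count_total"):
        state[key] = int(state[key])
    return state


def looks_stalled(lines, min_samples=3):
    """True if at least min_samples wait messages show identical, incomplete progress."""
    states = [s for s in map(parse_parcel_state, lines) if s is not None]
    if len(states) < min_samples:
        return False
    unchanged = len({s["progress"] for s in states}) == 1
    incomplete = states[-1]["progress"] < states[-1]["total"]
    return unchanged and incomplete
```

Feeding it the last few lines of the Director server log would return True for the output pasted above, since `progress` stays at 0 while `progressTotal` is non-zero.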

 

Explorer
Posts: 8
Registered: ‎05-21-2018

Re: Cluster install with Director hangs on KAFKA parcel


We figured this one out. The nodes were not able to download the parcels from the manager instance because they could not resolve its hostname. As it turns out, we had DNS Hostnames set to No for the VPC (AWS).
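In case it helps others check the same setting without clicking through the console, here is a rough sketch using the boto3 EC2 client calls `describe_vpc_attribute` and `modify_vpc_attribute`. The helper names are made up for this post, and the region and VPC ID in the usage comment are placeholders; we changed the setting by hand, so treat this as an untested-in-our-environment illustration.

```python
def dns_hostnames_enabled(ec2, vpc_id):
    """Return True if the VPC has the enableDnsHostnames attribute turned on.

    `ec2` is expected to be a boto3 EC2 client.
    """
    resp = ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute="enableDnsHostnames")
    return resp["EnableDnsHostnames"]["Value"]


def ensure_dns_hostnames(ec2, vpc_id):
    """Enable DNS hostnames on the VPC if needed. Returns True if a change was made."""
    if dns_hostnames_enabled(ec2, vpc_id):
        return False
    ec2.modify_vpc_attribute(VpcId=vpc_id, EnableDnsHostnames={"Value": True})
    return True


# Usage (requires boto3 and AWS credentials; region and VPC ID are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-2")
#   changed = ensure_dns_hostnames(ec2, "vpc-0123456789abcdef0")
```

The client is passed in as an argument so the check can be exercised against a stub before pointing it at a real account.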

 

The message that tipped us off was in the cloudera-scm-agent.log on each of the nodes.

 

[21/Jun/2018 19:42:05 +0000] 13795 Thread-13 downloader   INFO     Fetching torrent: http://ip-10-2-4-152.us-east-2.compute.internal:7180/cmf/parcel/download/KAFKA-3.0.0-1.3.0.0.p0.40-el7.parcel.torrent
[21/Jun/2018 19:42:05 +0000] 13795 Thread-13 https        ERROR    Failed to retrieve/stroe URL: http://ip-10-2-4-152.us-east-2.compute.internal:7180/cmf/parcel/download/KAFKA-3.0.0-1.3.0.0.p0.40-el7.parcel.torrent -> /opt/cloudera/parcel-cache/KAFKA-3.0.0-1.3.0.0.p0.40-el7.parcel.torrent <urlopen error [Errno -2] Name or service not known>
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/https.py", line 191, in fetch_to_file
    resp = self.open(req_url)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/https.py", line 186, in open 
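The "Name or service not known" error in the log above is a plain name-resolution failure, so it can be reproduced from any node without the agent. Below is a small sketch using only the Python standard library; `can_resolve` and `unresolvable_hosts` are names I picked for illustration. Run it on a cluster node against the URL the agent is trying to fetch.

```python
import socket
from urllib.parse import urlsplit


def can_resolve(host):
    """Return True if the local resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Same class of failure the agent reports as [Errno -2].
        return False


def unresolvable_hosts(urls):
    """Return the sorted hostnames from `urls` that fail to resolve."""
    hosts = {urlsplit(url).hostname for url in urls}
    return sorted(h for h in hosts if h and not can_resolve(h))
```

For example, passing the torrent URL from the agent log to `unresolvable_hosts` on an affected node should return the manager's `compute.internal` hostname; an empty list means resolution is working and the problem lies elsewhere.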

 

Cloudera Employee
Posts: 55
Registered: ‎10-28-2014

Re: Cluster install with Director hangs on KAFKA parcel

Glad you figured this out. Thanks for posting your solution for other forum users.

 
