New Contributor
Posts: 2
Registered: ‎04-02-2018

CDSW application process down

Hi All,

 

We have installed CDSW 1.3.0.p1.244221 and configured the master and worker nodes. There were some Docker-related issues, but we have resolved them.

However, the application services still fail to start, so we cannot use CDSW. The status output is as follows:

 

 

2018-04-02 18:30:10,812 INFO cdsw.status:OK: Application running as root check
2018-04-02 18:30:10,853 INFO cdsw.status:OK: Sysctl params check
2018-04-02 18:30:10,900 INFO cdsw.status:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|      NAME     |   STATUS   |           CREATED-AT          |   VERSION   |   EXTERNAL-IP   |          OS-IMAGE         |         KERNEL-VERSION         |   GPU   |   STATEFUL   |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|   011   |    True    |   2018-04-02 09:23:46+00:00   |   v1.6.11   |       None      |   CentOS Linux 7 (Core)   |   3.10.0-514.26.2.el7.x86_64   |    2    |    False     |
|   012   |    True    |   2018-04-02 09:23:58+00:00   |   v1.6.11   |       None      |   CentOS Linux 7 (Core)   |   3.10.0-514.26.2.el7.x86_64   |    2    |    False     |
|   013   |    True    |   2018-04-02 09:23:58+00:00   |   v1.6.11   |       None      |   CentOS Linux 7 (Core)   |   3.10.0-514.26.2.el7.x86_64   |    2    |    False     |
|   014   |    True    |   2018-04-02 09:23:58+00:00   |   v1.6.11   |       None      |   CentOS Linux 7 (Core)   |   3.10.0-514.26.2.el7.x86_64   |    2    |    False     |
|   015   |    True    |   2018-04-02 09:23:58+00:00   |   v1.6.11   |       None      |   CentOS Linux 7 (Core)   |   3.10.0-514.26.2.el7.x86_64   |    2    |    False     |
|   016   |    True    |   2018-04-02 09:23:58+00:00   |   v1.6.11   |       None      |   CentOS Linux 7 (Core)   |   3.10.0-514.26.2.el7.x86_64   |    3    |    False     |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2018-04-02 18:30:10,901 INFO cdsw.status:6/6 nodes are ready.
2018-04-02 18:30:11,018 INFO cdsw.status:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|                  NAME                 |   READY   |    STATUS   |   RESTARTS   |           CREATED-AT          |       POD-IP       |      HOST-IP       |   ROLE   |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
|             etcd-011            |    1/1    |   Running   |      0       |   2018-04-02 09:24:49+00:00   |   PublicIP   |   PublicIP   |   None   |
|        kube-apiserver-011       |    1/1    |   Running   |      0       |   2018-04-02 09:24:43+00:00   |   PublicIP   |   PublicIP   |   None   |
|   kube-controller-manager-011   |    1/1    |   Running   |      0       |   2018-04-02 09:24:44+00:00   |   PublicIP   |   PublicIP   |   None   |
|       kube-dns-3911048160-tzk9d       |    2/3    |   Running   |      7       |   2018-04-02 09:23:57+00:00   |    100.66.X.X    |   PublicIP   |   None   |
|            kube-proxy-2hqfv           |    1/1    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            kube-proxy-40711           |    1/1    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            kube-proxy-65x0h           |    1/1    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            kube-proxy-czf60           |    1/1    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            kube-proxy-nl29r           |    1/1    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            kube-proxy-xh8vl           |    1/1    |   Running   |      0       |   2018-04-02 09:23:57+00:00   |   PublicIP   |   PublicIP   |   None   |
|        kube-scheduler-011       |    1/1    |   Running   |      0       |   2018-04-02 09:24:59+00:00   |   PublicIP   |   PublicIP   |   None   |
|            weave-net-6z0wb            |    2/2    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            weave-net-fnr5h            |    2/2    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            weave-net-h8nvl            |    2/2    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            weave-net-nwtm8            |    2/2    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            weave-net-tn7wn            |    2/2    |   Running   |      0       |   2018-04-02 09:23:58+00:00   |     PrivateIP     |     PrivateIP     |   None   |
|            weave-net-w7f62            |    2/2    |   Running   |      0       |   2018-04-02 09:23:57+00:00   |   PublicIP   |   PublicIP   |   None   |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
2018-04-02 18:30:11,020 INFO cdsw.status:All required pods are ready in cluster kube-system.
2018-04-02 18:30:11,026 INFO cdsw.status:
2018-04-02 18:30:11,026 ERROR cdsw.status:Pods not ready in cluster default ['role/cron', 'role/db', 'role/engine-deps', 'role/ingress-controller', 'role/livelog', 'role/reconciler', 'role/spark-port-forwarder', 'role/web'].
2018-04-02 18:30:11,033 ERROR cdsw.status:Application services are incomplete. [web, db, livelog] not found.
2018-04-02 18:30:11,041 ERROR cdsw.status:Config maps are incomplete. [internal-config, installer-config] not found.
2018-04-02 18:30:11,070 ERROR cdsw.status:Secrets are incomplete. [docker-creds, internal-secrets, external-secrets] not found.
2018-04-02 18:30:11,075 ERROR cdsw.status:Persistent volumes are incomplete.
2018-04-02 18:30:11,081 ERROR cdsw.status:Persistent volume claims are incomplete.
2018-04-02 18:30:11,087 ERROR cdsw.status:Ingresses are imcomplete.
2018-04-02 18:30:11,087 INFO cdsw.status:Checking web at url: http://ourpublicwebaddress
2018-04-02 18:30:11,088 ERROR cdsw.status:Web is not yet up.
2018-04-02 18:30:11,088 INFO cdsw.monitor:{'metric': 'cdsw_status', 'timestampMs': 1522661411026, 'type': 'FAILURE', 'message': "Pods not ready in cluster default ['role/cron', 'role/db', 'role/engine-deps', 'role/ingress-controller', 'role/livelog', 'role/reconciler', 'role/spark-port-forwarder', 'role/web']. * Application services are incomplete. [web, db, livelog] not found.  ...", 'entity': {'type': 'SERVICE', 'name': 'cdsw'}}
2018-04-02 18:30:11,089 INFO cdsw.monitor:We are still in the startup tolerance timeperiod, so will report an UNKNOWN status rather than FAILURE.
2018-04-02 18:30:11,089 INFO cdsw.monitor:Sending status to CM: {'statusRecords': [{'metric': 'cdsw_status', 'timestampMs': 1522661411026, 'type': 'UNKNOWN', 'message': 'Cloudera Data Science Workbench is starting...', 'entity': {'type': 'SERVICE', 'name': 'cdsw'}}]}
2018-04-02 18:30:11,091 INFO cdsw.monitor:Successfully sent status to CM

Could someone advise how to resolve the issue above?

We checked the documentation and followed various procedures, but we could not solve the issue.

 

Any help would be appreciated.

 

Regards,

Junseok

 

Cloudera Employee
Posts: 441
Registered: ‎03-23-2015

Re: CDSW application process down

Hi Junseok,

I see that none of the pods in the default cluster are ready in CDSW. Have you tried running "cdsw reset" and then "cdsw init"?

Are you using a package or a parcel install for CDSW?
New Contributor
Posts: 2
Registered: ‎04-02-2018

Re: CDSW application process down

Those commands do not exist.

 

[root@master ~]# cdsw 
Cloudera Data Science Workbench CLI

Usage: cdsw command

Commands:
  enable    enable worker to access nfs mount on master node
  disable   disable worker to access nfs mount on master node
  status    show status information
  validate  validate installation
  logs      generate diagnostic bundle
  version   display version
  help      display help
[root@master ~]# 

I installed "Cloudera Data Science Workbench v1.3.0 (CSD) : 9bb84f6" through parcels in Cloudera Manager.

Cloudera Employee
Posts: 441
Registered: ‎03-23-2015

Re: CDSW application process down

So the absence of cdsw reset means you are on a parcel installation, is that correct? Have you tried restarting CDSW?

If that does not help, go to CM > CDSW > Instances > master > Processes > stdout & stderr. What do you see in those logs?
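
When those stdout and stderr logs are long, it can help to look at the ERROR-level lines first. The following is only a minimal sketch, assuming the log has been saved to a local file and uses the same "DATE TIME LEVEL logger:message" layout as the cdsw status output earlier in this thread. The function name cdsw_errors and the temporary file are hypothetical and are not part of the CDSW CLI.

```shell
#!/bin/sh
# Hypothetical helper (not a CDSW command): print only ERROR-level lines
# from a saved log file. Assumes the "DATE TIME LEVEL logger:message"
# layout seen in the status output quoted in this thread.
cdsw_errors() {
  grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:,]+ ERROR ' "$1"
}

# Demo on two lines copied from the status output in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2018-04-02 18:30:11,020 INFO cdsw.status:All required pods are ready in cluster kube-system.
2018-04-02 18:30:11,033 ERROR cdsw.status:Application services are incomplete. [web, db, livelog] not found.
EOF
cdsw_errors "$sample"
rm -f "$sample"
```

On a real system the argument would be whichever file the stdout or stderr log was saved to.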
New Contributor
Posts: 1
Registered: ‎11-01-2018

Re: CDSW application process down

Hey, I am experiencing the same errors. I have tried restarting too. I have checked the logs, and everything looks good except these lines.

 

++ unset HTTP_PROXY
++ unset HTTPS_PROXY
++ unset NO_PROXY
++ unset SOCKS_PROXY
++ unset ALL_PROXY
++ unset http_proxy
++ unset https_proxy
++ unset no_proxy
++ unset socks_proxy
++ unset all_proxy
++ unset JAVA_HOME

++ die_on_error 0 'CDSW_SCRATCH_DIR not set.'
++ result=0
++ shift
++ err_msg='CDSW_SCRATCH_DIR not set.'
++ '[' 0 -eq 0 ']'
++ return
++ mkdir -p /etc/cdsw/scratch
++ die_on_error 0 'Unable to create scratch dir [/etc/cdsw/scratch].'
++ result=0
++ shift
++ err_msg='Unable to create scratch dir [/etc/cdsw/scratch].'
++ '[' 0 -eq 0 ']'
++ return
++ CDSW_CONFIG_FILE=/run/cloudera-scm-agent/process/122-cdsw-CDSW_MASTER/cdsw.conf
++ is_rpm_mode
++ '[' '!' -e /opt/cloudera/parcels/CDSW-1.4.0.p1.431664/config/internal/rpm ']'
++ return 1

I do not know whether these lines are normal. I also installed from parcels. Please let me know if you have a solution. Thanks.
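
For anyone reading the trace above: each die_on_error call receives 0 as its first argument, the '[' 0 -eq 0 ']' test succeeds, and the function returns. The quoted messages ("CDSW_SCRATCH_DIR not set.", "Unable to create scratch dir ...") therefore appear to be arguments passed to the check rather than errors that were actually raised. Below is a minimal sketch of what such a helper might look like, reconstructed only from the trace. The failure branch is an assumption, since it never runs in the trace, and the temporary directory is a stand-in for /etc/cdsw/scratch.

```shell
#!/bin/sh
# Hypothetical reconstruction of die_on_error, inferred only from the
# "set -x" trace in this thread. The real CDSW script may differ.
die_on_error() {
  result=$1        # exit status of the command that was just checked
  shift
  err_msg=$1       # message intended for the failure case
  if [ "$result" -eq 0 ]; then
    return         # status 0: the check passed, the message is never printed
  fi
  # Assumption: this failure branch is not visible in the trace.
  echo "$err_msg" >&2
  exit "$result"
}

# Same pattern as in the trace, using a throwaway directory
# instead of /etc/cdsw/scratch.
scratch_dir="$(mktemp -d)/scratch"
mkdir -p "$scratch_dir"
die_on_error $? "Unable to create scratch dir [$scratch_dir]."
echo "scratch dir ready"
```

Under that reading, the trace shows the scratch directory checks passing, so the cause of the pods not being ready is more likely to be found elsewhere in the logs.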
