Member since: 04-21-2017
Posts: 13
Kudos Received: 4
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 10582 | 04-27-2017 05:14 PM |
09-24-2017 07:03 PM
Hi Peter, I cannot attach the YARN logs because the file is too big to upload here. What I do have are the logs from Ambari when it tried to start the NodeManagers on the different hosts but failed; they are in the attached file ambari.txt. Maybe this helps a bit. Thanks, Anish
09-24-2017 02:51 PM
Hi Geoffrey, Thanks for your reply. Why do you think I need to format the NameNode, even though it is coming up fine and has no problems? My main concern is the NodeManagers, which are not starting on any of the DataNode hosts. That is what I need to resolve, not the NameNode service. Thanks & Regards, Anish
09-24-2017 12:45 PM
Hi Peter, I stopped the complete cluster and restarted it. The NodeManager still does not come up on any of the hosts. While starting all the services, the 4 data nodes that run NodeManagers show the following in the Ambari startup log and stay in the waiting state for a long time:
2017-09-24 14:33:21,444 - Forcefully ensuring existence and permissions of the directory: /var/hadoop/yarn/log
2017-09-24 14:33:21,445 - Directory['/var/hadoop/yarn/log'] {'group': 'hadoop', 'cd_access': 'a', 'create_parents': True, 'ignore_failures': True, 'mode': 0775, 'owner': 'yarn'}
2017-09-24 14:33:21,456 - Host contains mounts: ['/sys', '/proc', '/dev', '/sys/kernel/security', '/dev/shm', '/dev/pts', '/run', '/sys/fs/cgroup', '/sys/fs/cgroup/systemd', '/sys/fs/pstore', '/sys/fs/cgroup/cpuset', '/sys/fs/cgroup/cpu,cpuacct', '/sys/fs/cgroup/memory', '/sys/fs/cgroup/devices', '/sys/fs/cgroup/freezer', '/sys/fs/cgroup/blkio', '/sys/fs/cgroup/perf_event', '/', '/proc/sys/fs/binfmt_misc', '/sys/kernel/debug', '/dev/mqueue', '/var', '/data1', '/net', '/sapmnt/home', '/sapmnt/HOME', '/proc/sys/fs/binfmt_misc'].
2017-09-24 14:33:21,456 - Mount point for directory /var/hadoop/yarn/log is /var
2017-09-24 14:33:21,457 - File['/var/lib/ambari-agent/data/yarn/yarn_log_dir_mount.hist'] {'content': '\n# This file keeps track of the last known mount-point for each dir.\n# It is safe to delete, since it will get regenerated the next time that the component of the service starts.\n# However, it is not advised to delete this file since Ambari may\n# re-create a dir that used to be mounted on a drive but is now mounted on the root.\n# Comments begin with a hash (#) symbol\n# dir,mount_point\n/var/hadoop/yarn/log,/var\n', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2017-09-24 14:33:21,458 - Directory['/var/lib/ambari-agent/data/yarn'] {'create_parents': True, 'mode': 0755}
2017-09-24 14:33:21,458 - Mount point for directory /data1/hadoop/yarn/local is /data1
2017-09-24 14:33:21,459 - Mount point for directory /data1/hadoop/yarn/local is /data1
2017-09-24 14:33:21,459 - Forcefully ensuring existence and permissions of the directory: /data1/hadoop/yarn/local
2017-09-24 14:33:21,460 - Directory['/data1/hadoop/yarn/local'] {'group': 'hadoop', 'cd_access': 'a', 'recursive_mode_flags': {'d': 'a+rwx', 'f': 'a+rw'}, 'create_parents': True, 'ignore_failures': True, 'mode': 0755, 'owner': 'yarn'}
After a certain time limit, a message appears saying that the Python script was killed.
I am still not able to conclude where exactly the problem lies or how to overcome it. Thanks & regards, Anish
09-23-2017 10:05 PM
We have an HDP 2.6 environment. Today we enabled HA for the ResourceManager using Ambari, and since then the NodeManagers on the hosts do not come up. Below is what I see in the NodeManager logs on all 4 data nodes:

2017-09-22 11:55:27,006 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(464)) - Memory usage of ProcessTree 23651 for container-id container_e11_1504517816511_0003_01_000001: 284.9 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used
2017-09-22 11:55:27,074 ERROR nodemanager.NodeManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2017-09-22 11:55:27,114 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2017-09-22 11:55:27,222 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:cleanUpApplicationsOnNMShutDown(519)) - Applications still running : [application_1504517816511_0003]
2017-09-22 11:55:27,226 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 45454
2017-09-22 11:55:27,240 INFO logaggregation.LogAggregationService (LogAggregationService.java:serviceStop(178)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
2017-09-22 11:55:27,241 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:abortLogAggregation(527)) - Aborting log aggregation for application_1504517816511_0003
2017-09-22 11:55:27,241 INFO logaggregation.LogAggregationService (LogAggregationService.java:stopAggregators(199)) - Waiting for aggregation to complete for application_1504517816511_0003
2017-09-22 11:55:27,243 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 45454
2017-09-22 11:55:27,264 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:27,424 WARN monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(545)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2017-09-22 11:55:27,487 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 8040
2017-09-22 11:55:27,496 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(883)) - Public cache exiting
2017-09-22 11:55:27,496 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:27,499 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 8040
2017-09-22 11:55:27,506 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NodeManager metrics system...
2017-09-22 11:55:27,508 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2017-09-22 11:55:27,510 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NodeManager metrics system stopped.
2017-09-22 11:55:27,510 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete.
2017-09-22 11:55:27,517 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at
************************************************************/

2017-09-22 11:26:08,717 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:run(1016)) - Starting full compaction cycle
2017-09-22 11:26:08,718 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(1032)) - Level-0 table #6418: started
2017-09-22 11:26:08,718 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(1032)) - Level-0 table #6418: 0 bytes OK
2017-09-22 11:26:08,726 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(1032)) - Delete type=0 #6415
2017-09-22 11:26:08,726 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(1032)) - Manual compaction at level-0 from (begin) .. (end); will stop at (end)
2017-09-22 11:26:08,726 INFO recovery.NMLeveldbStateStoreService$LeveldbLogger (NMLeveldbStateStoreService.java:log(1032)) - Manual compaction at level-1 from (begin) .. (end); will stop at (end)
2017-09-22 11:26:08,726 INFO recovery.NMLeveldbStateStoreService (NMLeveldbStateStoreService.java:run(1023)) - Full compaction cycle completed in 9 msec
2017-09-22 11:36:58,466 INFO security.NMContainerTokenSecretManager (NMContainerTokenSecretManager.java:setMasterKey(138)) - Rolling master-key for container-tokens, got key with id -2010831289
2017-09-22 11:36:58,466 INFO security.NMTokenSecretManagerInNM (NMTokenSecretManagerInNM.java:setMasterKey(135)) - Rolling master-key for container-tokens, got key with id 1019460078
2017-09-22 11:55:26,722 ERROR nodemanager.NodeManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2017-09-22 11:55:26,754 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2017-09-22 11:55:26,871 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 45454
2017-09-22 11:55:26,876 INFO logaggregation.LogAggregationService (LogAggregationService.java:serviceStop(178)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
2017-09-22 11:55:26,876 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:26,880 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 45454
2017-09-22 11:55:27,049 WARN monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(545)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2017-09-22 11:55:27,099 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 8040
2017-09-22 11:55:27,101 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:27,101 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(883)) - Public cache exiting
2017-09-22 11:55:27,102 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 8040
2017-09-22 11:55:27,105 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NodeManager metrics system...
2017-09-22 11:55:27,105 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2017-09-22 11:55:27,110 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NodeManager metrics system stopped.
2017-09-22 11:55:27,111 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete.
2017-09-22 11:55:27,126 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at
************************************************************/
yarn-yarn-nodemanager-ls5387v4.log lines 981844-981905/981905 (END)

2017-09-22 11:55:26,957 ERROR nodemanager.NodeManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2017-09-22 11:55:26,979 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2017-09-22 11:55:27,127 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:cleanUpApplicationsOnNMShutDown(519)) - Applications still running : [application_1504517816511_0002]
2017-09-22 11:55:27,131 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 45454
2017-09-22 11:55:27,237 INFO logaggregation.LogAggregationService (LogAggregationService.java:serviceStop(178)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
2017-09-22 11:55:27,238 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:abortLogAggregation(527)) - Aborting log aggregation for application_1504517816511_0002
2017-09-22 11:55:27,238 INFO logaggregation.LogAggregationService (LogAggregationService.java:stopAggregators(199)) - Waiting for aggregation to complete for application_1504517816511_0002
2017-09-22 11:55:27,238 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 45454
2017-09-22 11:55:27,249 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:27,351 WARN monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(545)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2017-09-22 11:55:27,419 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 8040
2017-09-22 11:55:27,424 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:27,424 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 8040
2017-09-22 11:55:27,425 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(883)) - Public cache exiting
2017-09-22 11:55:27,431 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NodeManager metrics system...
2017-09-22 11:55:27,440 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2017-09-22 11:55:27,442 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NodeManager metrics system stopped.
2017-09-22 11:55:27,443 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete.
2017-09-22 11:55:27,461 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at
************************************************************/

2017-09-22 11:55:27,494 ERROR nodemanager.NodeManager (LogAdapter.java:error(69)) - RECEIVED SIGNAL 15: SIGTERM
2017-09-22 11:55:27,551 INFO mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:8042
2017-09-22 11:55:27,659 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:cleanUpApplicationsOnNMShutDown(519)) - Applications still running : [application_1504517816511_0001, application_1504517816511_0004]
2017-09-22 11:55:27,662 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 45454
2017-09-22 11:55:27,697 INFO logaggregation.LogAggregationService (LogAggregationService.java:serviceStop(178)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService waiting for pending aggregation during exit
2017-09-22 11:55:27,698 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 45454
2017-09-22 11:55:27,698 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:27,705 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:abortLogAggregation(527)) - Aborting log aggregation for application_1504517816511_0001
2017-09-22 11:55:27,705 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:abortLogAggregation(527)) - Aborting log aggregation for application_1504517816511_0004
2017-09-22 11:55:27,705 INFO logaggregation.LogAggregationService (LogAggregationService.java:stopAggregators(199)) - Waiting for aggregation to complete for application_1504517816511_0001
2017-09-22 11:55:27,705 INFO logaggregation.LogAggregationService (LogAggregationService.java:stopAggregators(199)) - Waiting for aggregation to complete for application_1504517816511_0004
2017-09-22 11:55:27,758 WARN monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(545)) - org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2017-09-22 11:55:28,197 INFO ipc.Server (Server.java:stop(2752)) - Stopping server on 8040
2017-09-22 11:55:28,223 INFO ipc.Server (Server.java:run(1069)) - Stopping IPC Server Responder
2017-09-22 11:55:28,225 INFO ipc.Server (Server.java:run(932)) - Stopping IPC Server listener on 8040
2017-09-22 11:55:28,225 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NodeManager metrics system...
2017-09-22 11:55:28,225 INFO localizer.ResourceLocalizationService (ResourceLocalizationService.java:run(883)) - Public cache exiting
2017-09-22 11:55:28,228 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2017-09-22 11:55:28,355 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NodeManager metrics system stopped.
2017-09-22 11:55:28,356 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NodeManager metrics system shutdown complete.
2017-09-22 11:55:28,391 INFO nodemanager.NodeManager (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
************************************************************/

Please let me know how I can overcome this issue and get the cluster up and running. I have restarted the complete cluster, but that did not help. When I start the service in Ambari, the NodeManager restart sits waiting for a long time and shows the following:
2017-09-22 16:59:03,828 - Forcefully ensuring existence and permissions of the directory: /var/hadoop/yarn/log
2017-09-22 16:59:03,828 - Directory['/var/hadoop/yarn/log'] {'group': 'hadoop', 'cd_access': 'a', 'create_parents': True, 'ignore_failures': True, 'mode': 0775, 'owner': 'yarn'}
2017-09-22 16:59:03,842 - Host contains mounts: ['/sys', '/proc', '/dev', '/sys/kernel/security', '/dev/shm', '/dev/pts', '/run', '/sys/fs/cgroup', '/sys/fs/cgroup/systemd', '/sys/fs/pstore', '/sys/fs/cgroup/cpuset', '/sys/fs/cgroup/cpu,cpuacct', '/sys/fs/cgroup/memory', '/sys/fs/cgroup/devices', '/sys/fs/cgroup/freezer', '/sys/fs/cgroup/blkio', '/sys/fs/cgroup/perf_event', '/', '/proc/sys/fs/binfmt_misc', '/sys/kernel/debug', '/dev/mqueue', '/var', '/data1', '/net', '/sapmnt/home', '/sapmnt/HOME', '/sapmnt/HOME/d057137'].
2017-09-22 16:59:03,843 - Mount point for directory /var/hadoop/yarn/log is /var
2017-09-22 16:59:03,843 - File['/var/lib/ambari-agent/data/yarn/yarn_log_dir_mount.hist'] {'content': '\n# This file keeps track of the last known mount-point for each dir.\n# It is safe to delete, since it will get regenerated the next time that the component of the service starts.\n# However, it is not advised to delete this file since Ambari may\n# re-create a dir that used to be mounted on a drive but is now mounted on the root.\n# Comments begin with a hash (#) symbol\n# dir,mount_point\n/var/hadoop/yarn/log,/var\n', 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2017-09-22 16:59:03,844 - Directory['/var/lib/ambari-agent/data/yarn'] {'create_parents': True, 'mode': 0755}
2017-09-22 16:59:03,844 - Mount point for directory /data1/hadoop/yarn/local is /data1
2017-09-22 16:59:03,844 - Mount point for directory /data1/hadoop/yarn/local is /data1
2017-09-22 16:59:03,845 - Forcefully ensuring existence and permissions of the directory: /data1/hadoop/yarn/local
2017-09-22 16:59:03,845 - Directory['/data1/hadoop/yarn/local'] {'group': 'hadoop', 'cd_access': 'a', 'recursive_mode_flags': {'d': 'a+rwx', 'f': 'a+rw'}, 'create_parents': True, 'ignore_failures': True, 'mode': 0755, 'owner': 'yarn'}
Anish
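Since the NodeManager logs in this post were pasted as one long run of text, it can help to split them back into entries and keep only the non-INFO ones before reading or uploading them. The following is a minimal Python sketch of that idea; the function names are hypothetical, and it assumes every entry starts with the `YYYY-MM-DD HH:MM:SS,mmm LEVEL` prefix seen above. Any text before the first such prefix is dropped.

```python
import re

# Prefix of each NodeManager log entry in the post, e.g.
# "2017-09-22 11:55:27,074 ERROR ". Assumed to mark the start of an entry.
STAMP = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} (INFO|WARN|ERROR|FATAL) "
)


def split_entries(blob):
    """Split a flattened log blob into one string per log entry."""
    starts = [m.start() for m in STAMP.finditer(blob)]
    bounds = starts + [len(blob)]
    return [blob[a:b].strip() for a, b in zip(bounds, bounds[1:])]


def problems(blob, levels=("WARN", "ERROR", "FATAL")):
    """Return only the entries whose level is in `levels`."""
    kept = []
    for entry in split_entries(blob):
        level = STAMP.match(entry).group(1)
        if level in levels:
            kept.append(entry)
    return kept
```

Run against the logs above, this would presumably leave mostly the `RECEIVED SIGNAL 15: SIGTERM` and `ContainersMonitorImpl is interrupted` entries, which show the NodeManagers being stopped from outside rather than crashing on their own.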
Labels: Apache YARN
04-27-2017 05:14 PM
4 Kudos
We overcame the problem by adding the following option to the [security] section of ambari-agent.ini on all hosts in the cluster:
[security]
force_https_protocol=PROTOCOL_TLSv1_2
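For anyone who has to apply this on many hosts, here is a minimal Python sketch that sets the option in the text of an ini file. The function name is hypothetical, and the file location (commonly /etc/ambari-agent/conf/ambari-agent.ini) is an assumption to verify on your hosts. Note that `configparser` drops comments when it rewrites a file, so keep a backup of the original.

```python
import configparser
import io


def ensure_tls12(ini_text):
    """Return ini_text with [security] force_https_protocol=PROTOCOL_TLSv1_2 set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section("security"):
        parser.add_section("security")
    parser.set("security", "force_https_protocol", "PROTOCOL_TLSv1_2")
    out = io.StringIO()
    # Write "key=value" without spaces, matching the style used in the post.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

The agent most likely has to be restarted afterwards (for example with `ambari-agent restart`) before the new setting takes effect.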
04-24-2017 11:35 PM
@amarnathreddy pappu No, the Ambari server is not configured for 2-way SSL, as the required parameter is not enabled. Yes, the Ambari agent shows exactly what you mentioned. The command you mentioned shows:
CONNECTED(00000003)
depth=0 C = AU, ST = Some-State, O = Internet Widgits Pty Ltd
verify error:num=18:self signed certificate
verify return:1
depth=0 C = AU, ST = Some-State, O = Internet Widgits Pty Ltd
verify return:1
---
Certificate chain
0 s:/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
i:/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIFpTCCA42gAwIBAgIBATANBgkqhkiG9w0BAQsFADBFMQswCQYDVQQGEwJBVTET
MBEGA1UECAwKU29tZS1TdGF0ZTEhMB8GA1UECgwYSW50ZXJuZXQgV2lkZ2l0cyBQ
dHkgTHRkMB4XDTE3MDQyMTE2MTgwOVoXDTE4MDQyMTE2MTgwOVowRTELMAkGA1UE
BhMCQVUxEzARBgNVBAgMClNvbWUtU3RhdGUxITAfBgNVBAoMGEludGVybmV0IFdp
ZGdpdHMgUHR5IEx0ZDCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBANWQ
xlofKWsaR+FtclgHw2Z8fwFNESPdc2Q6l5OTXAkrA4E8gbYBeMySIS4wZIqCrvnt
OmfKZxwGYD/D8YzzGTCBMjY93F/hO9UK5kQGMJp+G4261u9jG+8FfoVF8zFaYr53
+g7YR+l+CfR4to0ZqjYugjWPU02UUabpw3uMpM8HvCYnkyfhhl0qurleC7bll44g
RptALAPwb4FLwmABhygbLAZV4gKHn0ONPhPON6zV2VA9iudUOZl4wi+jQGjjb5TX
SiBqE3Kd9W0ND7t61pER+sla9ASH5OVWZEMVIjnQNIDJ5PHudpA34MiItoR/JaaP
kicUCtoGx8OoCxNMofSB5kLFXH+fcuk7zZlQeeeLFn1qMzDWGBNrKfQKzCJchE6P
OhBArBPk6hZOFLzeqNbYiyD/w7bnXdg7qUwkE+hyu6c0UmdMdqCsmoME/0dAVJOD
poqcuq5DyyQmLluFwRKZ0zlUEkPvK9Ey4l5E18gc+JvcfTlSrNoHYJ/hqRQYMU8B
VRMupECYm6pvqT1CZEHM996gGbrWXjLsgtdGPX1VM0uRwtlGePpvMY6W/HtQoket
XWywiJsaDQWucIxxAh/0JbIiXm5v+bUlj7fYnSOk2i9HI/x/oZh+3zQY6VjLSucd
s2eJH8u4bLazbY3rYB6wCkevtdiZ+IiDqxCOSOxZAgMBAAGjgZ8wgZwwHQYDVR0O
BBYEFK9z9r1rnK9uDkiZD6jWnTCHxWPdMG0GA1UdIwRmMGSAFK9z9r1rnK9uDkiZ
D6jWnTCHxWPdoUmkRzBFMQswCQYDVQQGEwJBVTETMBEGA1UECAwKU29tZS1TdGF0
ZTEhMB8GA1UECgwYSW50ZXJuZXQgV2lkZ2l0cyBQdHkgTHRkggEBMAwGA1UdEwQF
MAMBAf8wDQYJKoZIhvcNAQELBQADggIBAMZgMZPsqgRWU8nWGMbQl6kPrjo758Yw
QMDD+O1B0pD57BZqcDEAHAmP0v1Am6DcGyRvWzwhBzRoT8VeNJKdyROQGhMXPWbC
/E5kvBX6VxaetII9VgyOIUjizC/HKdS24PVu8sK6y7h0CNmmtUJj4P25SaOY7g2y
-----END CERTIFICATE-----
subject=/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
issuer=/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
---
No client certificate CA names sent
---
SSL handshake has read 2257 bytes and written 455 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 4096 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES256-GCM-SHA384
Session-ID: 58FE6DA17EFA5278E0381D826F3E7E7E3F6558A6D4683964ACFDF4B4C63AD632
Session-ID-ctx:
Master-Key: C0EEC8877A651977C8F5B6FCC78B4FD977DDA0A7BF06203DE433D04EC4B45A1788F8802B7F47AF58C210C321DD9BD225
Key-Arg : None
PSK identity: None
PSK identity hint: None
SRP username: None
Start Time: 1493069217
Timeout : 300 (sec)
Verify return code: 18 (self signed certificate)
---
closed
04-24-2017 11:35 PM
@amarnathreddy pappu The Ambari server is not configured for 2-way SSL.
[server]
hostname=ls5387v7.XXX.XXX.corp
url_port=8440
secured_url_port=8441
connect_retry_delay=10
max_reconnect_retry_delay=30
ls5387v8:~ # openssl s_client -connect ls5387v7.XXX.XXX.corp:8440
CONNECTED(00000003)
depth=0 C = AU, ST = Some-State, O = Internet Widgits Pty Ltd
verify error:num=18:self signed certificate
verify return:1
depth=0 C = AU, ST = Some-State, O = Internet Widgits Pty Ltd
verify return:1
---
Certificate chain
0 s:/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
i:/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIFpTCCA42gAwIBAgIBATANBgkqhkiG9w0BAQsFADBFMQswCQYDVQQGEwJBVTET
MBEGA1UECAwKU29tZS1TdGF0ZTEhMB8GA1UECgwYSW50ZXJuZXQgV2lkZ2l0cyBQ
dHkgTHRkMB4XDTE3MDQyMTE2MTgwOVoXDTE4MDQyMTE2MTgwOVowRTELMAkGA1UE
BhMCQVUxEzARBgNVBAgMClNvbWUtU3RhdGUxITAfBgNVBAoMGEludGVybmV0IFdp
ZGdpdHMgUHR5IEx0ZDCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBANWQ
xlofKWsaR+FtclgHw2Z8fwFNESPdc2Q6l5OTXAkrA4E8gbYBeMySIS4wZIqCrvnt
OmfKZxwGYD/D8YzzGTCBMjY93F/hO9UK5kQGMJp+G4261u9jG+8FfoVF8zFaYr53
+g7YR+l+CfR4to0ZqjYugjWPU02UUabpw3uMpM8HvCYnkyfhhl0qurleC7bll44g
RptALAPwb4FLwmABhygbLAZV4gKHn0ONPhPON6zV2VA9iudUOZl4wi+jQGjjb5TX
SiBqE3Kd9W0ND7t61pER+sla9ASH5OVWZEMVIjnQNIDJ5PHudpA34MiItoR/JaaP
kicUCtoGx8OoCxNMofSB5kLFXH+fcuk7zZlQeeeLFn1qMzDWGBNrKfQKzCJchE6P
OhBArBPk6hZOFLzeqNbYiyD/w7bnXdg7qUwkE+hyu6c0UmdMdqCsmoME/0dAVJOD
poqcuq5DyyQmLluFwRKZ0zlUEkPvK9Ey4l5E18gc+JvcfTlSrNoHYJ/hqRQYMU8B
VRMupECYm6pvqT1CZEHM996gGbrWXjLsgtdGPX1VM0uRwtlGePpvMY6W/HtQoket
XWywiJsaDQWucIxxAh/0JbIiXm5v+bUlj7fYnSOk2i9HI/x/oZh+3zQY6VjLSucd
s2eJH8u4bLazbY3rYB6wCkevtdiZ+IiDqxCOSOxZAgMBAAGjgZ8wgZwwHQYDVR0O
BBYEFK9z9r1rnK9uDkiZD6jWnTCHxWPdMG0GA1UdIwRmMGSAFK9z9r1rnK9uDkiZ
D6jWnTCHxWPdoUmkRzBFMQswCQYDVQQGEwJBVTETMBEGA1UECAwKU29tZS1TdGF0
ZTEhMB8GA1UECgwYSW50ZXJuZXQgV2lkZ2l0cyBQdHkgTHRkggEBMAwGA1UdEwQF
MAMBAf8wDQYJKoZIhvcNAQELBQADggIBAMZgMZPsqgRWU8nWGMbQl6kPrjo758Yw
QMDD+O1B0pD57BZqcDEAHAmP0v1Am6DcGyRvWzwhBzRoT8VeNJKdyROQGhMXPWbC
/E5kvBX6VxaetII9VgyOIUjizC/HKdS24PVu8sK6y7h0CNmmtUJj4P25SaOY7g2y
A1CIW8Jny2XJj4O8re3YiCfZn2TKzXHZJgWBiV5lVgczeuxBffLDsUU2txHxANlo
RahS+3H6KwDFxfGXiuolu+lKdydXVy4jCqM97vNJGZ+tbB6RhoyuhCXd8lpW8xp7
BY3GmrMbIS/vFNoK+iVHpcxt6AfIJqUZ8KW97SqfZTXymIYzGva8/7XY0tNYIh1i
Hr3hC+3GoFSpfDSjLIu2i6+3vUIaykAdO01zJ9ccYYoLY7G6rHz4ErjWTu9Nh51u
olE4QgDlW19lMgTIOZk6a/jPYq6zc4iAppTqMXdvHUB3W96ceDoeMq+0P2J3UrI6
11OJUrNBvxEQgrYWgH83au1v1u8rYxo+IA0jQsBVaMeOQTShOSttuGsNv/zhjSf6
0wLK09qmayuZddZhJTEHwEpJ4OdQVNnvzO/e9QYnzxqa3XU/rrZ9xihNlU+1YZt5
0vSTjuhD4ylFpR9JhmX1VB/DTbDS0trfdH1VPhAMKYr/v4GkGTtn8eHe3vwmBH9k
79jAP0ApRatK
-----END CERTIFICATE-----
subject=/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
issuer=/C=AU/ST=Some-State/O=Internet Widgits Pty Ltd
---
No client certificate CA names sent
---
SSL handshake has read 2257 bytes and written 455 bytes
---
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 4096 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES256-GCM-SHA384
Session-ID: 58FE6DA17EFA5278E0381D826F3E7E7E3F6558A6D4683964ACFDF4B4C63AD632
Session-ID-ctx:
Master-Key: C0EEC8877A651977C8F5B6FCC78B4FD977DDA0A7BF06203DE433D04EC4B45A1788F8802B7F47AF58C210C321DD9BD225
Key-Arg : None
PSK identity: None
PSK identity hint: None
SRP username: None
Start Time: 1493069217
Timeout : 300 (sec)
Verify return code: 18 (self signed certificate)
---
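The same check can be done from Python, which is what the Ambari agent itself runs on. This is a rough sketch only: the function names are hypothetical, it needs Python 3.7 or later, and it disables certificate verification because the server certificate shown above is self-signed. It only accepts TLS 1.2 or newer, so a successful connection suggests the server side can speak TLS 1.2.

```python
import socket
import ssl


def make_probe_context():
    """Client context that skips verification and requires TLS 1.2 or newer."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False       # must be turned off before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # self-signed certificate on the server
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_protocol(host, port, timeout=5.0):
    """Connect and return (protocol version, cipher name) of the session."""
    ctx = make_probe_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Called as `negotiated_protocol("ls5387v7.XXX.XXX.corp", 8440)` with the real host name filled in, it should report the same kind of protocol and cipher information as the `SSL-Session` block in the openssl output.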
04-24-2017 11:35 PM
@amarnathreddy pappu I have checked that the Ambari server is not configured for 2-way SSL.
[server]
hostname=ls5387v7.XXX.XXX.corp
url_port=8440
secured_url_port=8441
connect_retry_delay=10
max_reconnect_retry_delay=30
04-24-2017 08:37 PM
@amarnathreddy pappu How do I check whether the Ambari server is running 2-way SSL? Secondly, do you really think this is the problem? When I look in the security guide, http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_security/bk_security.pdf, the section below explains how to set up 2-way SSL. It is not enabled by default, and it is not enabled in my case, as I do not have the parameter set.
2.5.4. Optional: Set Up Two-Way SSL Between Ambari Server and Ambari Agents
04-24-2017 01:21 PM
Please note it is Ambari 2.5.0.3.
INFO 2017-04-22 14:19:43,218 NetUtil.py:67 - Connecting to https://ls5387v7.wdf.sap.corp:8440/connection_info
INFO 2017-04-22 14:19:43,322 security.py:93 - SSL Connect being called.. connecting to the server
ERROR 2017-04-22 14:19:43,329 Controller.py:226 - Unable to connect to: https://ls5387v7.wdf.sap.corp:8440/connection_info
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 175, in registerWithServer
    ret = self.sendRequest(self.registerUrl, data)
  File "/usr/lib/python2.6/site-packages/ambari_agent/Controller.py", line 545, in sendRequest
    raise IOError('Request to {0} failed due to {1}'.format(url, str(exception)))
IOError: Request to https://ls5387v7.wdf.sap.corp:8440/connection_info failed due to EOF occurred in violation of protocol (_ssl.c:661)
ERROR 2017-04-22 14:19:43,330 Controller.py:227 - Error:Request to https://ls5387v7.wdf.sap.corp:8440/connection_info failed due to EOF occurred in violation of protocol (_ssl.c:661)
WARNING 2017-04-22 14:19:43,330 Controller.py:228 - Sleeping for 22 seconds and then trying again
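An "EOF occurred in violation of protocol" error from Python's ssl module is often seen when client and server cannot agree on a TLS version. Whether that is the cause here is an assumption, although it is consistent with the accepted solution in this thread (forcing PROTOCOL_TLSv1_2 on the agent). As a first check, this small sketch reports what the Python interpreter on an agent host supports; the function name is hypothetical, and it should be run with the same interpreter the agent uses.

```python
import ssl
import sys


def tls_support_report():
    """Summarize the TLS capabilities of the running Python interpreter."""
    return {
        "python": sys.version.split()[0],
        # OPENSSL_VERSION is missing on very old interpreters, hence getattr.
        "openssl": getattr(ssl, "OPENSSL_VERSION", "unknown"),
        # True when the ssl module exposes the TLS 1.2 protocol constant.
        "has_protocol_tlsv1_2": hasattr(ssl, "PROTOCOL_TLSv1_2"),
    }


if __name__ == "__main__":
    for key, value in sorted(tls_support_report().items()):
        print("%s: %s" % (key, value))
```

If `has_protocol_tlsv1_2` comes back false, the interpreter itself cannot be told to use TLS 1.2, and a newer Python or OpenSSL build would be needed before the agent setting can help.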