Storm, Kafka and ambari-metrics-collector not running in kerberized environment [all three services fail to connect to zkclient]

I have kerberized an HDP 2.4 cluster using the Ambari REST API, and all services are running fine except Storm, Kafka and ambari-metrics-collector. All keytabs are available and properly placed on their respective hosts.

From the logs, what I can understand is that the Storm, Kafka and ambari-metrics-collector services fail to connect to zkclient or the ZooKeeper quorum.

All ZooKeeper servers have been running fine for a long time, and if I use telnet I am able to connect to the ZK quorum on the same port [2181]. So I must be missing some configuration (possibly the SASL configuration) that these services need in order to connect to ZooKeeper in a kerberized environment.
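
A telnet connection only shows that the port is reachable; it says nothing about whether SASL authentication succeeds. As a rough sketch of a slightly stronger check, the Python snippet below sends ZooKeeper's `ruok` four-letter command to each server and reports the reply (a healthy server answers `imok`). The function names `zk_ruok` and `check_quorum` are made up for this illustration, and it assumes four-letter commands are enabled on the servers.

```python
import socket


def zk_ruok(host, port=2181, timeout=5.0):
    """Send the 'ruok' four-letter command and return the server's raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"ruok")
        chunks = []
        # The server replies and then closes the connection, so read until EOF.
        while True:
            data = sock.recv(64)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def check_quorum(hosts, port=2181):
    """Probe each host; map host -> reply, or an 'error: ...' string on failure."""
    results = {}
    for host in hosts:
        try:
            results[host] = zk_ruok(host, port)
        except OSError as exc:  # covers refused connections and timeouts
            results[host] = "error: %s" % exc
    return results
```

For example, `check_quorum(["abctestlab0512.bdaas.com", "abctestlab0513.bdaas.com", "abctestlab0515.bdaas.com"])` would probe the three quorum hosts named in the Kafka log below. If every host answers `imok` while the clients still time out, the problem is more likely in authentication or client configuration than in the network.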

Zookeeper logs

2017-09-21 12:14:47,271 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /128.160.120.21:41906 (no session established for client)
2017-09-21 12:15:43,963 - INFO  [ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x35ea3480411005c
2017-09-21 12:15:47,309 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /128.160.120.21:42000
2017-09-21 12:15:47,310 - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
  at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
  at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
  at java.lang.Thread.run(Thread.java:748)
2017-09-21 12:15:47,310 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /128.160.120.21:42000 (no session established for client)
2017-09-21 12:16:47,273 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /128.160.120.21:42122
2017-09-21 12:16:47,275 - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
  at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
  at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
  at java.lang.Thread.run(Thread.java:748)
2017-09-21 12:16:47,275 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /128.160.120.21:42122 (no session established for client)

Kafka server logs

  advertised.listeners = PLAINTEXTSASL://abctestlab0515.bdaas.com:6667
  leader.imbalance.per.broker.percentage = 10
 (kafka.server.KafkaConfig)
[2017-09-21 12:07:30,276] INFO starting (kafka.server.KafkaServer)
[2017-09-21 12:07:30,291] INFO Connecting to zookeeper on abctestlab0512.bdaas.com:2181,abctestlab0515.bdaas.com:2181,abctestlab0513.bdaas.com:2181 (kafka.server.KafkaServer)
[2017-09-21 12:11:40,363] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 250000
  at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1223)
  at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:155)
  at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:129)
  at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:89)
  at kafka.utils.ZkUtils$.apply(ZkUtils.scala:71)
  at kafka.server.KafkaServer.initZk(KafkaServer.scala:278)
  at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
  at kafka.Kafka$.main(Kafka.scala:67)
  at kafka.Kafka.main(Kafka.scala)
[2017-09-21 12:11:40,364] INFO shutting down (kafka.server.KafkaServer)
[2017-09-21 12:11:40,370] INFO shut down completed (kafka.server.KafkaServer)
[2017-09-21 12:11:40,370] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 250000
  at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1223)
  at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:155)
  at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:129)
  at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:89)
  at kafka.utils.ZkUtils$.apply(ZkUtils.scala:71)
  at kafka.server.KafkaServer.initZk(KafkaServer.scala:278)
  at kafka.server.KafkaServer.startup(KafkaServer.scala:168)
  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
  at kafka.Kafka$.main(Kafka.scala:67)
  at kafka.Kafka.main(Kafka.scala)
[2017-09-21 12:11:40,372] INFO shutting down (kafka.server.KafkaServer)

Storm- DRPC logs

2017-09-20 14:21:06.114 o.a.s.z.s.ZooKeeperServer [INFO] Server environment:user.dir=/home/hdp44-storm
2017-09-20 14:21:07.304 b.s.u.Utils [INFO] Using defaults.yaml from resources
2017-09-20 14:21:07.324 b.s.u.Utils [INFO] Using storm.yaml from resources
2017-09-20 14:21:07.373 b.s.d.drpc [INFO] Starting Distributed RPC servers...
2017-09-20 14:21:07.450 b.s.s.a.k.ServerCallbackHandler [WARN] No password found for user: null
2017-09-20 14:21:07.452 b.s.s.a.k.KerberosSaslTransportPlugin [ERROR] Server failed to login in principal:javax.security.auth.login.LoginException: No password provided

javax.security.auth.login.LoginException: No password provided
  at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:919) ~[?:1.8.0_131]
  at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:760) ~[?:1.8.0_131]
  at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617) ~[?:1.8.0_131]

3 REPLIES

Cloudera Employee

Can you confirm that you have successfully completed all the steps for securing the cluster using Kerberos?

I guess you may have missed configuring Storm and Kafka for Kerberos.

Please refer to the following URL to secure the services.

Hi @Shahrukh Khan

Thanks for your reply. After referring to the above-mentioned documents I am able to start my Storm and Kafka services. The issues were solved by properly configuring the JAAS files.

But the Ambari metrics collector still does not start, and the HBase master stops soon after starting.

For the metrics collector we get the following logs repeatedly:
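
For readers with the same symptoms, a quick sanity check of a JAAS file can be sketched as below. The `No password provided` error in the Storm log comes from `Krb5LoginModule.promptForPass`, which is typically reached when the login module has no usable keytab configured. The sketch assumes the ZooKeeper client reads the section named `Client` (its default) and that server-side sections such as `KafkaServer` or `StormServer` follow the usual naming. The helper names are hypothetical, and the regular expressions only cover simple `key=value` options.

```python
import re

# One JAAS section looks like:  Name { <module> <flag> key=value ... ; };
SECTION_RE = re.compile(r"(\w+)\s*\{(.*?)\}\s*;", re.DOTALL)
# Simple key=value or key="value" options (values containing spaces are not handled).
OPTION_RE = re.compile(r"(\w+)\s*=\s*\"?([^\"\s;]+)\"?")


def parse_jaas(text):
    """Return {section_name: {option: value}} for each section found in the text."""
    sections = {}
    for name, body in SECTION_RE.findall(text):
        sections[name] = dict(OPTION_RE.findall(body))
    return sections


def find_problems(text, required_sections=("Client",)):
    """List obvious keytab-login omissions in the required sections."""
    sections = parse_jaas(text)
    problems = []
    for name in required_sections:
        opts = sections.get(name)
        if opts is None:
            problems.append("missing section: " + name)
            continue
        if opts.get("useKeyTab", "false").lower() != "true":
            problems.append(name + ": useKeyTab is not true")
        for key in ("keyTab", "principal"):
            if key not in opts:
                problems.append(name + ": no " + key + " option")
    return problems
```

Calling `find_problems(open(path).read(), ("KafkaServer", "Client"))` on a broker's JAAS file, where `path` is wherever that file lives on the host, returns an empty list when both sections declare a keytab login. It does not verify that the keytab file is readable or that the principal matches it.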

/var/log/ambari-metrics-collector/ambari-metrics-collector.log

2017-09-27 09:43:00,001 ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: RECEIVED SIGNAL 15: SIGTERM (not sure what this error indicates)

2017-09-27 09:47:40,572 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=21, retries=35, started=270488 ms ago, cancelled=false, msg=

2017-09-27 09:48:00,659 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=22, retries=35, started=290575 ms ago, cancelled=false, msg=

2017-09-27 09:48:20,693 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=310609 ms ago, cancelled=false, msg=

/var/log/ambari-metrics-collector/hbase-master.log

2017-09-27 09:43:06,975 ERROR [main] master.HMasterCommandLine: Master exiting
java.io.IOException: Could not start ZK with 3 ZK servers in local mode deployment. Aborting as clients (e.g. shell) will not be able to find this ZK quorum.
  at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:175)
  at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:139)
  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
  at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
  at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2451)

Cloudera Employee

@Ajit Sonawane Great! Good to hear that the documentation helped in solving the issue.

For the HBase one, please create a separate question.
