About mahesh454347

mahesh454347 · ‎06-26-2018

Hi @dbains, I Checked the kafka broker port on my HDP cluster and it was 6667. Although I was confused between Bootstrap Server and Kafka Broker, I was giving zookeeper_ip:2181. It worked after passing ip-address:6667.

mahesh454347 · ‎06-22-2018

While running "structured_kafka_wordcount.py" example given in "https://github.com/apache/spark/blob/v2.3.1/examples/src/main/python/sql/streaming/structured_kafka_wordcount.py" I got following error: "WARN NetworkClient: Bootstrap broker ip-10-28-3-35.ec2.internal:2181 disconnected" I was able to read the content of a topic from kafka as given it the example https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/kafka_wordcount.py I submitted these job with command "bin/spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 examples/src/main/python/sql/streaming/structured_kafka_wordcount.py ip-10-28-3-35.ec2.internal:2181 subscribe fifa2" <code>[root@centos spark2]# bin/spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 examples/src/main/python/sql/streaming/structured_kafka_wordcount.py ip-10-28-3-35.ec2.internal:2181 subscribe fifa2 Ivy Default Cache set to: /root/.ivy2/cache The jars for the packages stored in: /root/.ivy2/jars :: loading settings :: url = jar:file:/usr/hdp/2.6.5.0-292/spark2/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml org.apache.spark#spark-sql-kafka-0-10_2.11 added as a dependency :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0 confs: [default] found org.apache.spark#spark-sql-kafka-0-10_2.11;2.3.1 in central found org.apache.kafka#kafka-clients;0.10.0.1 in central found net.jpountz.lz4#lz4;1.3.0 in central found org.xerial.snappy#snappy-java;1.1.2.6 in central found org.slf4j#slf4j-api;1.7.16 in central found org.spark-project.spark#unused;1.0.0 in central :: resolution report :: resolve 642ms :: artifacts dl 15ms :: modules in use: net.jpountz.lz4#lz4;1.3.0 from central in [default] org.apache.kafka#kafka-clients;0.10.0.1 from central in [default] org.apache.spark#spark-sql-kafka-0-10_2.11;2.3.1 from central in [default] org.slf4j#slf4j-api;1.7.16 from central in [default] org.spark-project.spark#unused;1.0.0 from central in [default] org.xerial.snappy#snappy-java;1.1.2.6 from central in [default] --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 6 | 0 | 0 | 0 || 6 | 0 | --------------------------------------------------------------------- :: retrieving :: org.apache.spark#spark-submit-parent confs: [default] 0 artifacts copied, 6 already retrieved (0kB/17ms) 18/06/22 06:57:26 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041. 18/06/22 06:57:26 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042. 18/06/22 06:57:35 WARN NetworkClient: Bootstrap broker ip-10-28-3-35.ec2.internal:2181 disconnected 18/06/22 06:57:35 WARN NetworkClient: Bootstrap broker ip-10-28-3-35.ec2.internal:2181 disconnected 18/06/22 06:57:35 WARN NetworkClient: Bootstrap broker ip-10-28-3-35.ec2.internal:2181 disconnected 18/06/22 06:57:35 WARN NetworkClient: Bootstrap broker ip-10-28-3-35.ec2.internal:2181 disconnected ^Z [11]+ Stopped bin/spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 examples/src/main/python/sql/streaming/structured_kafka_wordcount.py ip-10-28-3-35.ec2.internal:2181 subscribe fifa2

mahesh454347 · ‎06-22-2018

@Felix Albani I Checked connecting hdf cluster by telnet and was not able to connect it. The error was coming because of security group, as 6667 and 2181 port were not open for communicate with another cluster

mahesh454347 · ‎06-20-2018

I'm trying to Execute the Example program given in Spark Directory on HDP cluster "/spark2/examples/src/main/python/streaming/kafka_wordcount.py" which tries to read kafka topic but gives Zookeeper server timeout error. Spark is installed on HDP Cluster and Kafka is running on HDF Cluster, both are running on different cluster and are in same VPC on AWS Command executed to run spark example on HDP cluster is "bin/spark-submit --jars spark-streaming-kafka-0-8-assembly_2.11-2.3.0.jar examples/src/main/python/streaming/kafka_wordcount.py HDF-cluster-ip-address:2181 topic" <code>------------------------------------------- Time: 2018-06-20 07:51:56 ------------------------------------------- 18/06/20 07:51:56 INFO JobScheduler: Finished job streaming job 1529481116000 ms.0 from job set of time 1529481116000 ms 18/06/20 07:51:56 INFO JobScheduler: Total delay: 0.171 s for time 1529481116000 ms (execution: 0.145 s) 18/06/20 07:51:56 INFO PythonRDD: Removing RDD 94 from persistence list 18/06/20 07:51:56 INFO BlockManager: Removing RDD 94 18/06/20 07:51:56 INFO BlockRDD: Removing RDD 89 from persistence list 18/06/20 07:51:56 INFO BlockManager: Removing RDD 89 18/06/20 07:51:56 INFO KafkaInputDStream: Removing blocks of RDD BlockRDD[89] at createStream at NativeMethodAccessorImpl.java:0 of time 1529481116000 ms 18/06/20 07:51:56 INFO ReceivedBlockTracker: Deleting batches: 1529481114000 ms 18/06/20 07:51:56 INFO InputInfoTracker: remove old batch metadata: 1529481114000 ms 18/06/20 07:51:57 INFO JobScheduler: Added jobs for time 1529481117000 ms 18/06/20 07:51:57 INFO JobScheduler: Starting job streaming job 1529481117000 ms.0 from job set of time 1529481117000 ms 18/06/20 07:51:57 INFO SparkContext: Starting job: runJob at PythonRDD.scala:141 18/06/20 07:51:57 INFO DAGScheduler: Registering RDD 107 (call at /usr/hdp/2.6.5.0-292/spark2/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py:2257) 18/06/20 07:51:57 INFO DAGScheduler: Got job 27 (runJob at PythonRDD.scala:141) with 1 output partitions 18/06/20 07:51:57 INFO DAGScheduler: Final stage: ResultStage 54 (runJob at PythonRDD.scala:141) 18/06/20 07:51:57 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 53) 18/06/20 07:51:57 INFO DAGScheduler: Missing parents: List() 18/06/20 07:51:57 INFO DAGScheduler: Submitting ResultStage 54 (PythonRDD[111] at RDD at PythonRDD.scala:48), which has no missing parents 18/06/20 07:51:57 INFO MemoryStore: Block broadcast_27 stored as values in memory (estimated size 7.0 KB, free 366.0 MB) 18/06/20 07:51:57 INFO MemoryStore: Block broadcast_27_piece0 stored as bytes in memory (estimated size 4.1 KB, free 366.0 MB) 18/06/20 07:51:57 INFO BlockManagerInfo: Added broadcast_27_piece0 in memory on ip-10-29-3-74.ec2.internal:46231 (size: 4.1 KB, free: 366.2 MB) 18/06/20 07:51:57 INFO SparkContext: Created broadcast 27 from broadcast at DAGScheduler.scala:1039 18/06/20 07:51:57 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 54 (PythonRDD[111] at RDD at PythonRDD.scala:48) (first 15 tasks are for partitions Vector(0)) 18/06/20 07:51:57 INFO TaskSchedulerImpl: Adding task set 54.0 with 1 tasks 18/06/20 07:51:57 INFO TaskSetManager: Starting task 0.0 in stage 54.0 (TID 53, localhost, executor driver, partition 0, PROCESS_LOCAL, 7649 bytes) 18/06/20 07:51:57 INFO Executor: Running task 0.0 in stage 54.0 (TID 53) 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -881, init = 921, finish = 0 18/06/20 07:51:57 INFO PythonRunner: Times: total = 41, boot = -881, init = 922, finish = 0 18/06/20 07:51:57 INFO Executor: Finished task 0.0 in stage 54.0 (TID 53). 1493 bytes result sent to driver 18/06/20 07:51:57 INFO TaskSetManager: Finished task 0.0 in stage 54.0 (TID 53) in 48 ms on localhost (executor driver) (1/1) 18/06/20 07:51:57 INFO TaskSchedulerImpl: Removed TaskSet 54.0, whose tasks have all completed, from pool 18/06/20 07:51:57 INFO DAGScheduler: ResultStage 54 (runJob at PythonRDD.scala:141) finished in 0.055 s 18/06/20 07:51:57 INFO DAGScheduler: Job 27 finished: runJob at PythonRDD.scala:141, took 0.058062 s 18/06/20 07:51:57 INFO ZooKeeper: Session: 0x0 closed 18/06/20 07:51:57 INFO SparkContext: Starting job: runJob at PythonRDD.scala:141 18/06/20 07:51:57 INFO DAGScheduler: Got job 28 (runJob at PythonRDD.scala:141) with 3 output partitions 18/06/20 07:51:57 INFO DAGScheduler: Final stage: ResultStage 56 (runJob at PythonRDD.scala:141) 18/06/20 07:51:57 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 55) 18/06/20 07:51:57 INFO DAGScheduler: Missing parents: List() 18/06/20 07:51:57 INFO DAGScheduler: Submitting ResultStage 56 (PythonRDD[112] at RDD at PythonRDD.scala:48), which has no missing parents 18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Stopping receiver with message: Error starting receiver 0: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Called receiver onStop 18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Deregistering receiver 0 18/06/20 07:51:57 INFO MemoryStore: Block broadcast_28 stored as values in memory (estimated size 7.0 KB, free 365.9 MB) 18/06/20 07:51:57 INFO MemoryStore: Block broadcast_28_piece0 stored as bytes in memory (estimated size 4.1 KB, free 365.9 MB) 18/06/20 07:51:57 INFO ClientCnxn: EventThread shut down 18/06/20 07:51:57 INFO BlockManagerInfo: Added broadcast_28_piece0 in memory on ip-10-29-3-74.ec2.internal:46231 (size: 4.1 KB, free: 366.2 MB) 18/06/20 07:51:57 INFO SparkContext: Created broadcast 28 from broadcast at DAGScheduler.scala:1039 18/06/20 07:51:57 INFO DAGScheduler: Submitting 3 missing tasks from ResultStage 56 (PythonRDD[112] at RDD at PythonRDD.scala:48) (first 15 tasks are for partitions Vector(1, 2, 3)) 18/06/20 07:51:57 INFO TaskSchedulerImpl: Adding task set 56.0 with 3 tasks 18/06/20 07:51:57 INFO TaskSetManager: Starting task 0.0 in stage 56.0 (TID 54, localhost, executor driver, partition 1, PROCESS_LOCAL, 7649 bytes) 18/06/20 07:51:57 INFO TaskSetManager: Starting task 1.0 in stage 56.0 (TID 55, localhost, executor driver, partition 2, PROCESS_LOCAL, 7649 bytes) 18/06/20 07:51:57 INFO TaskSetManager: Starting task 2.0 in stage 56.0 (TID 56, localhost, executor driver, partition 3, PROCESS_LOCAL, 7649 bytes) 18/06/20 07:51:57 INFO Executor: Running task 1.0 in stage 56.0 (TID 55) 18/06/20 07:51:57 INFO Executor: Running task 2.0 in stage 56.0 (TID 56) 18/06/20 07:51:57 INFO Executor: Running task 0.0 in stage 56.0 (TID 54) 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 0 blocks 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 18/06/20 07:51:57 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms 18/06/20 07:51:57 ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143) at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94) at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:600) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:590) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Stopped receiver 0 18/06/20 07:51:57 INFO BlockGenerator: Stopping BlockGenerator 18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -947, init = 987, finish = 0 18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -947, init = 987, finish = 0 18/06/20 07:51:57 INFO PythonRunner: Times: total = 41, boot = -944, init = 985, finish = 0 18/06/20 07:51:57 INFO Executor: Finished task 1.0 in stage 56.0 (TID 55). 1536 bytes result sent to driver 18/06/20 07:51:57 INFO TaskSetManager: Finished task 1.0 in stage 56.0 (TID 55) in 52 ms on localhost (executor driver) (1/3) 18/06/20 07:51:57 INFO PythonRunner: Times: total = 45, boot = -944, init = 989, finish = 0 18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -32, init = 72, finish = 0 18/06/20 07:51:57 INFO Executor: Finished task 0.0 in stage 56.0 (TID 54). 1536 bytes result sent to driver 18/06/20 07:51:57 INFO TaskSetManager: Finished task 0.0 in stage 56.0 (TID 54) in 56 ms on localhost (executor driver) (2/3) 18/06/20 07:51:57 INFO PythonRunner: Times: total = 40, boot = -33, init = 73, finish = 0 18/06/20 07:51:57 INFO Executor: Finished task 2.0 in stage 56.0 (TID 56). 1536 bytes result sent to driver 18/06/20 07:51:57 INFO TaskSetManager: Finished task 2.0 in stage 56.0 (TID 56) in 58 ms on localhost (executor driver) (3/3) 18/06/20 07:51:57 INFO TaskSchedulerImpl: Removed TaskSet 56.0, whose tasks have all completed, from pool 18/06/20 07:51:57 INFO DAGScheduler: ResultStage 56 (runJob at PythonRDD.scala:141) finished in 0.063 s 18/06/20 07:51:57 INFO DAGScheduler: Job 28 finished: runJob at PythonRDD.scala:141, took 0.065728 s ------------------------------------------- Time: 2018-06-20 07:51:57 ------------------------------------------- 18/06/20 07:51:57 INFO JobScheduler: Finished job streaming job 1529481117000 ms.0 from job set of time 1529481117000 ms 18/06/20 07:51:57 INFO JobScheduler: Total delay: 0.169 s for time 1529481117000 ms (execution: 0.149 s) 18/06/20 07:51:57 INFO PythonRDD: Removing RDD 102 from persistence list 18/06/20 07:51:57 INFO BlockManager: Removing RDD 102 18/06/20 07:51:57 INFO BlockRDD: Removing RDD 97 from persistence list 18/06/20 07:51:57 INFO KafkaInputDStream: Removing blocks of RDD BlockRDD[97] at createStream at NativeMethodAccessorImpl.java:0 of time 1529481117000 ms 18/06/20 07:51:57 INFO BlockManager: Removing RDD 97 18/06/20 07:51:57 INFO ReceivedBlockTracker: Deleting batches: 1529481115000 ms 18/06/20 07:51:57 INFO InputInfoTracker: remove old batch metadata: 1529481115000 ms 18/06/20 07:51:57 INFO RecurringTimer: Stopped timer for BlockGenerator after time 1529481117400 18/06/20 07:51:57 INFO BlockGenerator: Waiting for block pushing thread to terminate 18/06/20 07:51:57 INFO BlockGenerator: Pushing out the last 0 blocks 18/06/20 07:51:57 INFO BlockGenerator: Stopped block pushing thread 18/06/20 07:51:57 INFO BlockGenerator: Stopped BlockGenerator 18/06/20 07:51:57 INFO ReceiverSupervisorImpl: Waiting for receiver to be stopped 18/06/20 07:51:57 ERROR ReceiverSupervisorImpl: Stopped receiver with error: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 18/06/20 07:51:57 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143) at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94) at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:600) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:590) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 18/06/20 07:51:57 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143) at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94) at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:600) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:590) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) 18/06/20 07:51:57 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job 18/06/20 07:51:57 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 18/06/20 07:51:57 INFO TaskSchedulerImpl: Cancelling stage 0 18/06/20 07:51:57 INFO DAGScheduler: ResultStage 0 (start at NativeMethodAccessorImpl.java:0) failed in 13.256 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143) at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94) at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:600) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:590) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Driver stacktrace: 18/06/20 07:51:57 ERROR ReceiverTracker: Receiver has been stopped. Try to restart it. org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143) at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94) at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:600) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:590) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$failJobAndIndependentStages(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGScheduler$anonfun$abortStage$1.apply(DAGScheduler.scala:1587) at org.apache.spark.scheduler.DAGScheduler$anonfun$abortStage$1.apply(DAGScheduler.scala:1586) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1586) at org.apache.spark.scheduler.DAGScheduler$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831) at org.apache.spark.scheduler.DAGScheduler$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1820) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1769) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1758) at org.apache.spark.util.EventLoop$anon$1.run(EventLoop.scala:48) Caused by: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:171) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:126) at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:143) at kafka.consumer.Consumer$.create(ConsumerConnector.scala:94) at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100) at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149) at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:600) at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$anonfun$9.apply(ReceiverTracker.scala:590) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.SparkContext$anonfun$34.apply(SparkContext.scala:2185) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:109) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

mahesh454347 · ‎12-21-2017

Hi Robert Levas, Thanks for these help. I used to download kerberos.csv file in previous version but now i'm facing issue. I used to obtain the Kerberos.csv file using command " GET /api/v1/clusters/:cluster_name/kerberos_identities?fields=*&format=CSV " as mentioned by you in previous version of HDP i.e. HDP 2.4, But I'm not able to download it with HDP 2.5.3 and Amabri server 2.2. Is there any change in path of kerberos_identities. I have attached the screenshot where one can see that an empty file has downloaded with no Principals and keytabs.

mahesh454347 · ‎09-28-2017

Hi @Jeff Arnold, I tried to start the failed namenode on standbynamenode with above steps. I faced some error on running these command "sudo -u hdfs hdfs namenode -bootstrapStandby -force" Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x000000008c800000, 1937768448, 0) failed; error='Cannot allocate memory' (errno=12) # # There is insufficient memory for the Java Runtime Environment to continue. # Native memory allocation (mmap) failed to map 1937768448 bytes for committing reserved memory. # An error report file with more information is saved as: # /var/log/hadoop/hdfs/hs_err_pid5144.log Before executing the steps that you provided, I was facing these error while restarting namenode on standbynamenode via Ambari: Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 408, in <module> NameNode().execute() File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute method(env) File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 103, in start upgrade_suspended=params.upgrade_suspended, env=env) File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk return fn(*args, **kwargs) File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 118, in namenode raise Fail("Could not bootstrap standby namenode") resource_management.core.exceptions.Fail: Could not bootstrap standby namenode

Online	Offline
Last Visited	‎04-02-2019 06:56 AM

Member Since	‎06-16-2017 09:46 AM
Last Visited	‎04-02-2019 06:56 AM
Posts	14

Cloudera Community

Re: Error while running structured streaming examp...

Error while running structured streaming example “...

Re: Spark unable to read kafka Topic and gives err...

Spark unable to read kafka Topic and gives error "...

Re: Obtaining a listing of expected Kerberos Ident...

Re: Unable to restrat standby Namenode