Member since 03-22-2018
27 Posts
0 Kudos Received
0 Solutions
10-05-2018
02:25 PM
@Aditya Sirna Please see the log below.

$ tail -f hadoop-hdfs-datanode-HKLPADBID04.log
2018-10-05 21:50:01,820 INFO datanode.DataNode (BPServiceActor.java:register(734)) - Block pool Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 successfully registered with NN
2018-10-05 21:50:01,820 INFO block.BlockTokenSecretManager (BlockTokenSecretManager.java:addKeys(193)) - Setting block keys
2018-10-05 21:50:04,718 INFO datanode.DataNode (BPOfferService.java:processCommandFromActor(609)) - DatanodeCommand action : DNA_REGISTER from hklpadbid03.hk.standardchartered.com/192.168.22.19:8020 with standby state
2018-10-05 21:50:04,718 INFO datanode.DataNode (BPServiceActor.java:register(715)) - Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid03.hk.standardchartered.com/192.168.22.19:8020 beginning handshake with NN
2018-10-05 21:50:04,719 INFO datanode.DataNode (BPServiceActor.java:register(734)) - Block pool Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid03.hk.standardchartered.com/192.168.22.19:8020 successfully registered with NN
2018-10-05 21:50:04,719 INFO block.BlockTokenSecretManager (BlockTokenSecretManager.java:addKeys(193)) - Setting block keys
2018-10-05 21:50:04,821 INFO datanode.DataNode (BPOfferService.java:processCommandFromActor(609)) - DatanodeCommand action : DNA_REGISTER from hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 with active state
2018-10-05 21:50:04,821 INFO datanode.DataNode (BPServiceActor.java:register(715)) - Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 beginning handshake with NN
2018-10-05 21:50:04,822 INFO datanode.DataNode (BPServiceActor.java:register(734)) - Block pool Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 successfully registered with NN
2018-10-05 21:50:04,822 INFO block.BlockTokenSecretManager (BlockTokenSecretManager.java:addKeys(193)) - Setting block keys
2018-10-05 21:50:07,720 INFO datanode.DataNode (BPOfferService.java:processCommandFromActor(609)) - DatanodeCommand action : DNA_REGISTER from hklpadbid03.hk.standardchartered.com/192.168.22.19:8020 with standby state
2018-10-05 21:50:07,720 INFO datanode.DataNode (BPServiceActor.java:register(715)) - Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid03.hk.standardchartered.com/192.168.22.19:8020 beginning handshake with NN
2018-10-05 21:50:07,721 INFO datanode.DataNode (BPServiceActor.java:register(734)) - Block pool Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid03.hk.standardchartered.com/192.168.22.19:8020 successfully registered with NN
2018-10-05 21:50:07,721 INFO block.BlockTokenSecretManager (BlockTokenSecretManager.java:addKeys(193)) - Setting block keys
2018-10-05 21:50:07,823 INFO datanode.DataNode (BPOfferService.java:processCommandFromActor(609)) - DatanodeCommand action : DNA_REGISTER from hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 with active state
2018-10-05 21:50:07,823 INFO datanode.DataNode (BPServiceActor.java:register(715)) - Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 beginning handshake with NN
2018-10-05 21:50:07,824 INFO datanode.DataNode (BPServiceActor.java:register(734)) - Block pool Block pool BP-1265401458-192.168.22.18-1538217251820 (Datanode Uuid 9ba5d3f6-bd76-436f-a52c-bb4bc6c3d970) service to hklpadbid02.hk.standardchartered.com/192.168.22.18:8020 successfully registered with NN
2018-10-05 21:50:07,824 INFO block.BlockTokenSecretManager (BlockTokenSecretManager.java:addKeys(193)) - Setting block keys
10-05-2018
12:10 PM
Hi all, I've newly set up a cluster and installed the DataNode on the slave servers. After fully starting the HDFS cluster, I see that only one of the 3 DataNodes is live. I checked that the cluster ID, DataNode UUID, and block pool ID are the same on each DataNode. Please help me resolve this issue. Thanks in advance.
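
One thing that may be worth ruling out here: the cluster ID and block pool ID are expected to match across DataNodes, but the DataNode UUID is normally different on every node, and nodes that share one can end up re-registering in turn. The following is only a minimal sketch, assuming the contents of each node's VERSION file (found under the DataNode data directory, in current/VERSION) have been copied into a dict. The host names and sample values are hypothetical and not taken from this cluster.

```python
from collections import defaultdict


def parse_version_file(text):
    """Parse key=value lines from a DataNode VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def find_duplicate_uuids(version_texts):
    """Return {datanodeUuid: [hosts]} for every UUID seen on more than one host."""
    hosts_by_uuid = defaultdict(list)
    for host, text in version_texts.items():
        uuid = parse_version_file(text).get("datanodeUuid")
        if uuid:
            hosts_by_uuid[uuid].append(host)
    return {u: sorted(h) for u, h in hosts_by_uuid.items() if len(h) > 1}


# Hypothetical sample data: node-a and node-b share one UUID, node-c has its own.
sample = {
    "node-a": "#Fri Oct 05 21:50:01 2018\nclusterID=CID-example\ndatanodeUuid=uuid-1\n",
    "node-b": "clusterID=CID-example\ndatanodeUuid=uuid-1\n",
    "node-c": "clusterID=CID-example\ndatanodeUuid=uuid-2\n",
}

if __name__ == "__main__":
    for uuid, hosts in find_duplicate_uuids(sample).items():
        print(f"datanodeUuid {uuid} is shared by: {', '.join(hosts)}")
```

An empty result would suggest the UUIDs are distinct and the cause lies elsewhere.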
Labels: Apache Hadoop
10-03-2018
05:31 PM
@Aditya Sirna Thanks for your reply. There is no such directory inside /usr/hdp/. Could it be referring to the old version from a cache?

hdp]$ ls -lrt
total 12
drwxr-xr-x 3 hadoop hadoop 4096 Sep 22 19:56 share
drwxr-xr-x 14 root root 4096 Sep 28 21:15 2.6.4.25-1
drwxr-xr-x 2 hadoop hadoop 4096 Oct 3 18:12 current

Thanks.
10-03-2018
01:35 PM
@Aditya Sirna Great 🙂 this worked for me. I'm not sure where I was stuck, but I would still like to know why it was referring to the previous version. By the way, thanks for your support, which I really appreciate. Thanks.
10-03-2018
11:27 AM
@Aditya Sirna There is no such directory inside /usr/hdp/. Could it be referring to the old version from a cache somewhere?

$ ls -la /usr/hdp/
total 20
drwxr-xr-x 5 root root 4096 Sep 22 19:54 .
drwxr-xr-x. 18 root root 4096 Apr 14 16:34 ..
drwxr-xr-x 14 root root 4096 Sep 28 21:15 2.6.4.25-1
drwxr-xr-x 2 hadoop hadoop 4096 Oct 3 18:12 current
drwxr-xr-x 3 hadoop hadoop 4096 Sep 22 19:56 share
10-03-2018
09:55 AM
@Geoffrey Shelton Okot I've tried what you suggested, but the same error still persists while installing the NameNode on the new server. Could you please let me know which location it is picking up the old version from? Thanks.
09-28-2018
04:43 PM
Hello everyone, I'm getting the error below while configuring NameNode HA. Please help me fix this. I'm using hdp-select version 2.6.4.25, but it is referring to 2.6.1.0 (strange). Please let me know where it is picking up the old version from.

/var/lib/ambari-agent/data/errors-216.txt

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/after-INSTALL/scripts/hook.py", line 37, in <module>
    AfterInstallHook().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/after-INSTALL/scripts/hook.py", line 31, in hook
    setup_stack_symlinks(self.stroutfile)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/after-INSTALL/scripts/shared_initialization.py", line 62, in setup_stack_symlinks
    stack_select.select(package, json_version)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 313, in select
    Execute(command, sudo=True)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 303, in _call
    raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'ambari-python-wrap /usr/bin/hdp-select set hadoop-hdfs-namenode 2.6.1.0-129' returned 1. ERROR: Invalid version 2.6.1.0-129
Valid choices:
2.6.4.25-1
Thanks.
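
To make the mismatch in the message above explicit, here is a small sketch that pulls the requested version and the list of valid choices out of the error text and compares them. The function name is hypothetical, and the parsing assumes the message has the same shape as the one quoted in this post; it does not show where the old version number comes from.

```python
import re


def parse_hdp_select_error(message):
    """Extract the requested version and the valid choices from the error text."""
    requested = re.search(r"Invalid version (\S+)", message)
    choices = []
    after_marker = False
    for line in message.splitlines():
        stripped = line.strip()
        if stripped.startswith("Valid choices:"):
            after_marker = True
            continue
        # After the marker, keep only lines that look like a version-build string.
        if after_marker and re.fullmatch(r"[\d.]+-\d+", stripped):
            choices.append(stripped)
    return (requested.group(1) if requested else None), choices


# Error text as quoted in the post.
error_text = "ERROR: Invalid version 2.6.1.0-129\nValid choices:\n2.6.4.25-1\n"

if __name__ == "__main__":
    requested, choices = parse_hdp_select_error(error_text)
    if requested not in choices:
        print(f"requested {requested}, but only {choices} is installed")
```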
Labels: Apache Hadoop, Apache YARN
06-27-2018
01:31 PM
@Geoffrey Shelton Okot I see that the port mentioned in the config is correct. How do I check the port in the Java code?
06-27-2018
09:56 AM
@Geoffrey Shelton Okot @adash Seeking your help to fix the issue mentioned above. Thanks.
06-26-2018
10:39 AM
Hi, we're facing an issue with Spark in our production environment: the Spark workers do not connect to the Spark master. Please see the logs below and help us resolve this issue.

Master Log:
18/06/25 22:59:12 INFO master.Master: akka.tcp://sparkWorker@spark7:7084 got disassociated, removing it.
18/06/25 22:59:12 INFO master.Master: akka.tcp://sparkWorker@spark7:7084 got disassociated, removing it.
18/06/25 22:59:12 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7084] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7079 got disassociated, removing it.
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7079 got disassociated, removing it.
18/06/25 22:59:28 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7079] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7082 got disassociated, removing it.
18/06/25 22:59:28 INFO master.Master: akka.tcp://sparkWorker@spark7:7082 got disassociated, removing it.
18/06/25 22:59:28 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7082] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:35 INFO master.Master: akka.tcp://sparkWorker@spark8:7081 got disassociated, removing it.
18/06/25 22:59:35 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark8:7081] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 22:59:35 INFO master.Master: akka.tcp://sparkWorker@spark8:7081 got disassociated, removing it.
18/06/25 23:00:23 INFO master.Master: akka.tcp://sparkWorker@spark9:7081 got disassociated, removing it.
18/06/25 23:00:23 INFO master.Master: akka.tcp://sparkWorker@spark9:7081 got disassociated, removing it.
18/06/25 23:00:23 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark9:7081] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 23:00:45 INFO master.Master: akka.tcp://sparkWorker@spark8:7085 got disassociated, removing it.
18/06/25 23:00:45 INFO master.Master: akka.tcp://sparkWorker@spark8:7085 got disassociated, removing it.
18/06/25 23:00:45 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark8:7085] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 23:00:48 INFO master.Master: akka.tcp://sparkWorker@spark7:7083 got disassociated, removing it.
18/06/25 23:00:48 INFO master.Master: akka.tcp://sparkWorker@spark7:7083 got disassociated, removing it.
18/06/25 23:00:48 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark7:7083] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
18/06/25 23:01:52 INFO master.Master: akka.tcp://sparkWorker@spark0:7080 got disassociated, removing it.
18/06/25 23:01:52 INFO master.Master: akka.tcp://sparkWorker@spark0:7080 got disassociated, removing it.
18/06/25 23:01:52 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkWorker@spark0:7080] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

Worker Log:
18/06/25 22:43:56 INFO util.Utils: Successfully started service 'sparkWorker' on port 7081.
18/06/25 22:43:56 INFO worker.Worker: Starting Spark worker HKLPADBID09:7081 with 4 cores, 16.0 GB RAM
18/06/25 22:43:56 INFO worker.Worker: Running Spark version 1.4.1-palantir3
18/06/25 22:43:56 INFO worker.Worker: Spark home: /opt/palantir/spark-1.4.1-palantir3-bin-hadoop2.4
18/06/25 22:43:56 INFO server.Server: jetty-8.y.z-SNAPSHOT
18/06/25 22:43:56 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:8084
18/06/25 22:43:56 INFO util.Utils: Successfully started service 'WorkerUI' on port 8084.
18/06/25 22:43:56 INFO ui.WorkerWebUI: Started WorkerWebUI at http://SPARK:8084
18/06/25 22:43:56 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:10 INFO worker.Worker: Retrying connection to master (attempt # 1)
18/06/25 22:44:10 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:24 INFO worker.Worker: Retrying connection to master (attempt # 2)
18/06/25 22:44:24 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:38 INFO worker.Worker: Retrying connection to master (attempt # 3)
18/06/25 22:44:38 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:44:52 INFO worker.Worker: Retrying connection to master (attempt # 4)
18/06/25 22:44:52 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:45:06 INFO worker.Worker: Retrying connection to master (attempt # 5)
18/06/25 22:45:06 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:45:20 INFO worker.Worker: Retrying connection to master (attempt # 6)
18/06/25 22:45:20 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:46:42 INFO worker.Worker: Retrying connection to master (attempt # 7)
18/06/25 22:46:42 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:48:04 INFO worker.Worker: Retrying connection to master (attempt # 8)
18/06/25 22:48:04 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:49:26 INFO worker.Worker: Retrying connection to master (attempt # 9)
18/06/25 22:49:26 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:50:48 INFO worker.Worker: Retrying connection to master (attempt # 10)
18/06/25 22:50:48 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:52:10 INFO worker.Worker: Retrying connection to master (attempt # 11)
18/06/25 22:52:10 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:53:32 INFO worker.Worker: Retrying connection to master (attempt # 12)
18/06/25 22:53:32 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:54:54 INFO worker.Worker: Retrying connection to master (attempt # 13)
18/06/25 22:54:54 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:56:16 INFO worker.Worker: Retrying connection to master (attempt # 14)
18/06/25 22:56:16 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:57:38 INFO worker.Worker: Retrying connection to master (attempt # 15)
18/06/25 22:57:38 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 22:59:00 INFO worker.Worker: Retrying connection to master (attempt # 16)
18/06/25 22:59:00 INFO worker.Worker: Connecting to master akka.tcp://sparkMaster@spark:7077/user/Master...
18/06/25 23:00:22 ERROR worker.Worker: All masters are unresponsive! Giving up.
18/06/25 23:00:22 INFO util.Utils: Shutdown hook called
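
As a first check from a worker host, the master address in the worker log can be split into host and port and probed with a plain TCP connection. This is only a rough sketch: the function names are hypothetical, and a successful connect shows only that the port is reachable, not that the host name the worker uses matches the one the master advertises.

```python
import re
import socket


def parse_master_url(url):
    """Split an akka.tcp URL like the one in the worker log into (host, port)."""
    match = re.match(r"akka\.tcp://[^@]+@([^:/]+):(\d+)", url)
    if not match:
        raise ValueError(f"unrecognized master URL: {url}")
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host, port = parse_master_url("akka.tcp://sparkMaster@spark:7077/user/Master")
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```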
Labels: Apache Spark