Created on 11-30-2016 02:08 PM - edited 09-16-2022 03:50 AM
The DataNode is not starting, and it is not writing any error logs to its log file.
Error output from the Ambari start operation:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 167, in <module>
    DataNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py", line 62, in start
    datanode(action="start")
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_datanode.py", line 72, in datanode
    create_log_dir=True
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 267, in service
    Execute(daemon_cmd, not_if=process_id_exists_command, environment=hadoop_env_exports)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1.
starting datanode, logging to /data/log/hadoop/hdfs/hadoop-hdfs-datanode-hostname-out
In /var/log/hadoop/hdfs/hadoop-hdfs-datanode.log:
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2411)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2345)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2526)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2550)
2016-05-04 17:42:04,139 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-05-04 17:42:04,140 INFO datanode.DataNode (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at FQDN/IP
When I start the DataNode through Ambari, I don't see any new entries in the DataNode log file.
In /data/log/hadoop/hdfs/hadoop-hdfs-datanode-hostname-out I see:
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63785
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63785
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
/data/log/hadoop/hdfs/hadoop-hdfs-datanode-D-9539.out: line 2: syntax error near unexpected token `('
/data/log/hadoop/hdfs/hadoop-hdfs-datanode-D-9539.out: line 2: `core file size (blocks, -c) unlimited'
Any suggestions would be appreciated.
Mohan.V
Created 11-30-2016 07:31 PM
Can you please try to start the DataNode manually (without Ambari) with DEBUG logs?
Here are the commands:
1. Log in to the problematic DataNode host as the 'hdfs' user.
2. Run the commands below:
#Command1
export HADOOP_ROOT_LOGGER=DEBUG,console
#Command2
hdfs datanode
Note: this will print output to the screen while it tries to start your DataNode; please do not press Ctrl+C until you see an ERROR/Exception 🙂
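If you also want to capture that console output to a file, a variant like this should work (the /tmp/datanode-debug.log path is just an example):
export HADOOP_ROOT_LOGGER=DEBUG,console
hdfs datanode 2>&1 | tee /tmp/datanode-debug.log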
Hope this information helps you to troubleshoot your issue! Happy Hadooping 🙂
Created 12-01-2016 06:47 AM
Thanks for the reply, Kuldeep.
I tried what you suggested and got the following output:
16/12/01 11:27:49 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
16/12/01 11:27:49 INFO datanode.DataNode: Starting DataNode with maxLockedMemory = 0
16/12/01 11:27:49 INFO datanode.DataNode: Opened streaming server at /0.0.0.0:50010
16/12/01 11:27:49 INFO datanode.DataNode: Balancing bandwith is 6250000 bytes/s
16/12/01 11:27:49 INFO datanode.DataNode: Number threads for balancing is 5
16/12/01 11:27:49 INFO datanode.DataNode: Shutdown complete.
16/12/01 11:27:49 FATAL datanode.DataNode: Exception in secureMain
java.io.IOException: the path component: '/' is world-writable. Its permissions are 0777. Please fix this or select a different socket path.
at org.apache.hadoop.net.unix.DomainSocket.validateSocketPathSecurity0(Native Method)
at org.apache.hadoop.net.unix.DomainSocket.bindAndListen(DomainSocket.java:189)
at org.apache.hadoop.hdfs.net.DomainPeerServer.<init>(DomainPeerServer.java:40)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getDomainPeerServer(DataNode.java:965)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:931)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1134)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:430)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2411)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2345)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2526)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2550)
16/12/01 11:27:49 INFO util.ExitUtil: Exiting with status 1
16/12/01 11:27:49 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at d-9539.kpit.com/10.10.167.160
When I googled that error, this thread:
http://grokbase.com/t/cloudera/scm-users/143a6q05g6/data-node-failed-to-start
suggested changing the permissions of / (root).
I did that, but the DataNode still did not start; in fact, it now gives the error below.
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call raise Fail(err_msg) resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. /etc/profile: line 45: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied -bash: /dev/null: Permission denied /usr/hdp/current/hadoop-client/conf/hadoop-env.sh: line 100: /dev/null: Permission denied ls: write error: Broken pipe /usr/hdp/2.3.4.7-4/hadoop/libexec/hadoop-config.sh: line 155: /dev/null: Permission denied /usr/hdp/current/hadoop-client/conf/hadoop-env.sh: line 100: /dev/null: Permission denied ls: write error: Broken pipe starting datanode, logging to /data/log/hadoop/hdfs/hadoop-hdfs-datanode-.out /usr/hdp/2.3.4.7-4//hadoop-hdfs/bin/hdfs.distro: line 30: /dev/null: Permission denied /usr/hdp/current/hadoop-client/conf/hadoop-env.sh: line 100: /dev/null: Permission denied ls: write error: Broken pipe /usr/hdp/2.3.4.7-4/hadoop/libexec/hadoop-config.sh: line 155: /dev/null: Permission denied /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh: line 187: /dev/null: Permission denied
Created 12-01-2016 07:48 AM
I changed the permissions of the files above, using another cluster as a reference.
Then I ran the hdfs datanode command again and got the following error in the logs:
16/12/01 13:13:22 INFO datanode.DataNode: Shutdown complete.
16/12/01 13:13:22 FATAL datanode.DataNode: Exception in secureMain
java.io.IOException: the path component: '/var/lib/hadoop-hdfs' is owned by a user who is not root and not you. Your effective user id is 0; the path is owned by user id 1005, and its permissions are 0751. Please fix this or select a different socket path.
at org.apache.hadoop.net.unix.DomainSocket.validateSocketPathSecurity0(Native Method)
at org.apache.hadoop.net.unix.DomainSocket.bindAndListen(DomainSocket.java:189)
at org.apache.hadoop.hdfs.net.DomainPeerServer.<init>(DomainPeerServer.java:40)
at org.apache.hadoop.hdfs.server.datanode.DataNode.getDomainPeerServer(DataNode.java:965)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:931)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1134)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:430)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2411)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2345)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2526)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2550)
16/12/01 13:13:22 INFO util.ExitUtil: Exiting with status 1
16/12/01 13:13:22 INFO datanode.DataNode: SHUTDOWN_MSG:
I changed the owner of hadoop-hdfs to root, but I am still getting the same issue.
Any suggestions?
Created 12-01-2016 08:00 AM
A few strange things in the .out file:
1). Your "/data/log/hadoop/hdfs/hadoop-hdfs-datanode-hostname-out" output shows the following:
/data/log/hadoop/hdfs/hadoop-hdfs-datanode-D-9539.out: line 2: syntax error near unexpected token `('
/data/log/hadoop/hdfs/hadoop-hdfs-datanode-D-9539.out: line 2: `core file size (blocks, -c) unlimited'
This indicates some bad syntax/characters present in the Hadoop scripts, especially "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh". Can you please check that script on the host where you were trying to start the DataNode?
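For example, one quick way to check (just a suggestion; assumes bash and GNU grep are available on the host):
# Syntax-check the script without executing it
bash -n /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh
# Look for stray non-ASCII characters that an editor or copy/paste may have introduced
grep -nP '[^\x00-\x7F]' /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh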
2). The permissions for "/var/lib/hadoop-hdfs" should ideally be as follows, with ownership "hdfs:hadoop":
drwxr-x--x. 3 hdfs hadoop 4.0K Dec 1 07:44 hadoop-hdfs
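If your host differs, a sketch like this (run as root) should restore the expected ownership and mode; 751 matches the drwxr-x--x listing above:
# Restore the expected owner and permissions on the domain socket directory
chown hdfs:hadoop /var/lib/hadoop-hdfs
chmod 751 /var/lib/hadoop-hdfs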
3). Regarding the error you mentioned in your recent comment, "java.io.IOException: the path component: '/' is world-writable. Its permissions are 0777. Please fix this or select a different socket path.", the description of that check is as follows:
It validates that the path chosen for a UNIX domain socket is secure. A socket path is secure if it doesn't allow unprivileged users to perform a man-in-the-middle attack against it. For example, one way to perform a man-in-the-middle attack would be for a malicious user to move the server socket out of the way and create their own socket in the same place.
More info: https://wiki.apache.org/hadoop/SocketPathSecurity
What is needed? Setting the correct permissions on / is what is needed here; a sketch follows the link below.
Please see: http://stackoverflow.com/questions/22300487/filed-to-start-data-node-in-hadoop-cluster
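A minimal sketch of the fix, assuming / was accidentally opened up to 0777 and that /dev/null was clobbered along the way (the repeated "/dev/null: Permission denied" messages in your output point to that); please verify on your host before running, as root:
# The root directory should normally be mode 0755 and owned by root
chmod 755 /
chown root:root /
# If ls shows /dev/null as a regular file instead of a character device ("crw-rw-rw-"),
# recreate it; character device major 1, minor 3 is the standard Linux null device
ls -l /dev/null
rm -f /dev/null
mknod -m 666 /dev/null c 1 3
Alternatively, the dfs.domain.socket.path property in hdfs-site.xml controls where the DataNode creates its domain socket, so you could point it at a directory with safe permissions instead of changing / (see the SocketPathSecurity wiki above).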
Created 12-01-2016 08:28 AM
Thanks for the reply, jss.
I have already tried everything you suggested, but I am still getting the same issue.
When I start the DataNode through the Ambari UI, the following error occurs:
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call tries=tries, try_sleep=try_sleep) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper result = _call(command, **kwargs_copy) File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call raise Fail(err_msg) resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start datanode'' returned 1. /etc/profile: line 45: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied /etc/profile: line 70: /dev/null: Permission denied -bash: /dev/null: Permission denied /usr/hdp/current/hadoop-client/conf/hadoop-env.sh: line 100: /dev/null: Permission denied ls: write error: Broken pipe /usr/hdp/2.3.4.7-4/hadoop/libexec/hadoop-config.sh: line 155: /dev/null: Permission denied /usr/hdp/current/hadoop-client/conf/hadoop-env.sh: line 100: /dev/null: Permission denied ls: write error: Broken pipe starting datanode, logging to /data/log/hadoop/hdfs/hadoop-hdfs-datanode-.out /usr/hdp/2.3.4.7-4//hadoop-hdfs/bin/hdfs.distro: line 30: /dev/null: Permission denied /usr/hdp/current/hadoop-client/conf/hadoop-env.sh: line 100: /dev/null: Permission denied ls: write error: Broken pipe /usr/hdp/2.3.4.7-4/hadoop/libexec/hadoop-config.sh: line 155: /dev/null: Permission denied /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh: line 187: /dev/null: Permission denied