Created on 09-14-2016 09:29 PM - edited 09-16-2022 03:39 AM
Hi,
I have a problem when I start my HDFS service. I have 3 nodes (1 master, 2 slaves).
The SecondaryNameNode and both DataNodes start successfully, but the NameNode fails to start.
Here are the error messages from stderr:
Can't open /var/run/cloudera-scm-agent/process/117-hdfs-NAMENODE/supervisor.conf: Permission denied.
+ make_scripts_executable
+ find /var/run/cloudera-scm-agent/process/117-hdfs-NAMENODE -regex '.*\.\(py\|sh\)$' -exec chmod u+x '{}' ';'
+ '[' DATANODE_MAX_LOCKED_MEMORY '!=' '' ']'
+ ulimit -l
+ export HADOOP_IDENT_STRING=hdfs
+ HADOOP_IDENT_STRING=hdfs
+ '[' -n '' ']'
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = namenode ']'
+ '[' file-operation = namenode ']'
+ '[' bootstrap = namenode ']'
+ '[' failover = namenode ']'
+ '[' transition-to-active = namenode ']'
+ '[' initializeSharedEdits = namenode ']'
+ '[' initialize-znode = namenode ']'
+ '[' format-namenode = namenode ']'
+ '[' monitor-decommission = namenode ']'
+ '[' jnSyncWait = namenode ']'
+ '[' nnRpcWait = namenode ']'
+ '[' -safemode = '' -a get = '' ']'
+ '[' monitor-upgrade = namenode ']'
+ '[' finalize-upgrade = namenode ']'
+ '[' rolling-upgrade-prepare = namenode ']'
+ '[' rolling-upgrade-finalize = namenode ']'
+ '[' nnDnLiveWait = namenode ']'
+ '[' refresh-datanode = namenode ']'
+ '[' mkdir = namenode ']'
+ '[' nfs3 = namenode ']'
+ '[' namenode = namenode -o secondarynamenode = namenode -o datanode = namenode ']'
+ HADOOP_OPTS='-Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
+ export 'HADOOP_OPTS=-Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
+ HADOOP_OPTS='-Dhdfs.audit.logger=INFO,RFAAUDIT -Dsecurity.audit.logger=INFO,RFAS -Djava.net.preferIPv4Stack=true '
+ '[' namenode = namenode -a rollingUpgrade = '' ']'
+ exec /usr/lib/hadoop-hdfs/bin/hdfs --config /var/run/cloudera-scm-agent/process/117-hdfs-NAMENODE namenode
and here is the message from stdout.
Can anybody tell me why this happens?
Thank you.
Created 09-15-2016 05:17 AM
As per the log
Can't open /var/run/cloudera-scm-agent/process/117-hdfs-NAMENODE/supervisor.conf: Permission denied.
Check directory permissions
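For example, to see the ownership and modes (paths taken from your stderr; the numeric prefix of the process directory changes on every role restart):

ls -ld /var/run/cloudera-scm-agent/process/117-hdfs-NAMENODE
ls -l /var/run/cloudera-scm-agent/process/117-hdfs-NAMENODE/supervisor.conf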
Created 09-15-2016 11:54 PM
What should the permissions be?
I already changed the permissions, but it still happens.
-rw------- 1 root root 2955 Sep 16 13:42 supervisor.con
Created 09-21-2016 09:51 AM
Hello,
If you see stderr output, then supervisor.conf was already read. The permissions error is probably not relevant, as the supervisor runs as root and has permission to access the file. The fact that you see stderr output means supervisor.conf was read successfully and the process started, as we can see from the "exec" line.
Check your NameNode log (usually in /var/log/hadoop-hdfs) for details about the failure.
From what you showed us, it appears the agent/supervisor started the NameNode but then it failed to stay running for more than a few seconds at most.
Let us know what you see in the log.
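For example, on the NameNode host (the exact file name depends on your host name; this assumes Cloudera Manager's usual naming for a host called master1):

tail -n 100 /var/log/hadoop-hdfs/hadoop-cmf-hdfs-NAMENODE-master1.log.out

Look for the first WARN or ERROR after the startup banner.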
Created 09-22-2016 08:33 PM
2016-09-23 10:33:38,220 INFO org.apache.hadoop.util.GSet: Computing capacity for map BlocksMap
2016-09-23 10:33:38,221 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-09-23 10:33:38,224 INFO org.apache.hadoop.util.GSet: 2.0% max memory 3.9 GB = 80.6 MB
2016-09-23 10:33:38,225 INFO org.apache.hadoop.util.GSet: capacity = 2^23 = 8388608 entries
2016-09-23 10:33:38,554 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
2016-09-23 10:33:38,558 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication = 3
2016-09-23 10:33:38,558 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication = 512
2016-09-23 10:33:38,558 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication = 1
2016-09-23 10:33:38,558 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams = 20
2016-09-23 10:33:38,558 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
2016-09-23 10:33:38,558 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer = false
2016-09-23 10:33:38,559 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2016-09-23 10:33:38,572 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner = hdfs (auth:SIMPLE)
2016-09-23 10:33:38,573 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup = supergroup
2016-09-23 10:33:38,573 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
2016-09-23 10:33:38,574 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2016-09-23 10:33:38,579 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2016-09-23 10:33:38,961 INFO org.apache.hadoop.util.GSet: Computing capacity for map INodeMap
2016-09-23 10:33:38,961 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-09-23 10:33:38,962 INFO org.apache.hadoop.util.GSet: 1.0% max memory 3.9 GB = 40.3 MB
2016-09-23 10:33:38,962 INFO org.apache.hadoop.util.GSet: capacity = 2^22 = 4194304 entries
2016-09-23 10:33:38,975 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2016-09-23 10:33:38,988 INFO org.apache.hadoop.util.GSet: Computing capacity for map cachedBlocks
2016-09-23 10:33:38,989 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-09-23 10:33:38,989 INFO org.apache.hadoop.util.GSet: 0.25% max memory 3.9 GB = 10.1 MB
2016-09-23 10:33:38,989 INFO org.apache.hadoop.util.GSet: capacity = 2^20 = 1048576 entries
2016-09-23 10:33:39,000 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2016-09-23 10:33:39,000 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
2016-09-23 10:33:39,001 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
2016-09-23 10:33:39,007 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2016-09-23 10:33:39,007 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2016-09-23 10:33:39,007 INFO org.apache.hadoop.hdfs.server.namenode.top.metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2016-09-23 10:33:39,010 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache on namenode is enabled
2016-09-23 10:33:39,011 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2016-09-23 10:33:39,015 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
2016-09-23 10:33:39,015 INFO org.apache.hadoop.util.GSet: VM type = 64-bit
2016-09-23 10:33:39,016 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 3.9 GB = 1.2 MB
2016-09-23 10:33:39,016 INFO org.apache.hadoop.util.GSet: capacity = 2^17 = 131072 entries
2016-09-23 10:33:39,025 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: ACLs enabled? false
2016-09-23 10:33:39,026 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: XAttrs enabled? true
2016-09-23 10:33:39,026 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
2016-09-23 10:33:39,093 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /dfs/nn/in_use.lock acquired by nodename 11727@master1
2016-09-23 10:33:39,098 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: /dfs/nn/current/VERSION (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at org.apache.hadoop.hdfs.server.common.StorageInfo.readPropertiesFile(StorageInfo.java:245)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.readProperties(NNStorage.java:627)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:337)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1080)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:777)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:613)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:675)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:843)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:822)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
2016-09-23 10:33:39,146 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@master1:50070
2016-09-23 10:33:39,250 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2016-09-23 10:33:39,251 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2016-09-23 10:33:39,251 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2016-09-23 10:33:39,251 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.FileNotFoundException: /dfs/nn/current/VERSION (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at org.apache.hadoop.hdfs.server.common.StorageInfo.readPropertiesFile(StorageInfo.java:245)
at org.apache.hadoop.hdfs.server.namenode.NNStorage.readProperties(NNStorage.java:627)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:337)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1080)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:777)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:613)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:675)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:843)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:822)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611)
2016-09-23 10:33:39,256 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-09-23 10:33:39,259 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master1/10.5.1.160
************************************************************/
That is from the log file /var/log/hadoop-hdfs/hadoop-cmf-hdfs-NAMENODE-master1.log.out.
Created 12-16-2016 08:47 AM
Since the exception is:
2016-09-23 10:33:39,098 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.FileNotFoundException: /dfs/nn/current/VERSION (Permission denied)
at java.io.RandomAccessFile.open(Native Method)
the NameNode cannot start because it cannot load the fsimage, and the fsimage cannot be loaded because the hdfs user cannot read the VERSION file (Permission denied).
I would check the permissions on your HDFS local disk directories on the NameNode. To resolve the issue shown in the exception, make sure the VERSION file is owned by the "hdfs" user, like this:
-rw-r--r-- 1 hdfs hdfs 172 Nov 7 14:37 /dfs/nn/current/VERSION
I hope that is the only issue; fixing this may still surface other permission problems if something else changed the ownership.
If the owner of the file is shown as a number, that would indicate the OS cannot resolve the file's owner ID to a user name.
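A minimal sketch of the fix, assuming the NameNode data directory is /dfs/nn (as in your log) and the role runs as the hdfs user; run as root on the NameNode host:

ls -l /dfs/nn/current                 # check the current ownership first
chown -R hdfs:hdfs /dfs/nn            # give the NameNode metadata directory back to hdfs
ls -l /dfs/nn/current/VERSION         # verify it now matches the example above

Then restart the NameNode role from Cloudera Manager.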
Created 01-17-2017 08:47 PM
I seem to be having the same problem, but I am not able to change the file permissions.
drwxr-xr-x 2 root root 32768 Jan 18 04:35 .
drwxr-xr-x 3 root root 32768 Jan 16 22:53 ..
-rwxr-xr-x 1 root root   321 Jan 16 22:53 fsimage_0000000000000000000
-rwxr-xr-x 1 root root    62 Jan 16 22:53 fsimage_0000000000000000000.md5
-rwxr-xr-x 1 root root     2 Jan 16 22:53 seen_txid
-rwxr-xr-x 1 root root     0 Jan 18 04:35 test
-rwxr-xr-x 1 root root   203 Jan 16 22:53 VERSION
chown: changing ownership of ‘fsimage_0000000000000000000’: Operation not permitted
chown: changing ownership of ‘fsimage_0000000000000000000.md5’: Operation not permitted
chown: changing ownership of ‘seen_txid’: Operation not permitted
chown: changing ownership of ‘test’: Operation not permitted
chown: changing ownership of ‘VERSION’: Operation not permitted
[root@hd-master-1 current]#
Note that I am running this as root. This is a default install of Cloudera Manager 5.9.1 with the basic Hadoop package, running on CentOS. No luck.
Created 08-07-2017 03:17 AM
Hi
Check whether High Availability is configured properly.
Regards,
Shafi
Created 12-16-2016 07:52 AM
Hi,
Did you find any solution? I am facing the same problem.
Thanks