Running in non-interactive mode, and data appears to exist in QJM to [IP:8485]. Not formatting

Explorer

After installing Kerberos and enabling it on the cluster, I restarted namenode1 and namenode2 and got this error:

 

17/09/13 11:39:39 INFO namenode.FSNamesystem: fsOwner             = hdfs/namenode1.xx@MOPH.COM (auth:KERBEROS)
17/09/13 11:39:39 INFO namenode.FSNamesystem: supergroup          = root
17/09/13 11:39:39 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/09/13 11:39:39 INFO namenode.FSNamesystem: Determined nameservice ID: nameservice1
17/09/13 11:39:39 INFO namenode.FSNamesystem: HA Enabled: true
17/09/13 11:39:39 INFO namenode.FSNamesystem: Append Enabled: true
17/09/13 11:39:39 INFO util.GSet: Computing capacity for map INodeMap
17/09/13 11:39:39 INFO util.GSet: VM type       = 64-bit
17/09/13 11:39:39 INFO util.GSet: 1.0% max memory 3.9 GB = 39.6 MB
17/09/13 11:39:39 INFO util.GSet: capacity      = 2^22 = 4194304 entries
17/09/13 11:39:39 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? false
17/09/13 11:39:39 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/09/13 11:39:39 INFO util.GSet: Computing capacity for map cachedBlocks
17/09/13 11:39:39 INFO util.GSet: VM type       = 64-bit
17/09/13 11:39:39 INFO util.GSet: 0.25% max memory 3.9 GB = 9.9 MB
17/09/13 11:39:39 INFO util.GSet: capacity      = 2^20 = 1048576 entries
17/09/13 11:39:39 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/09/13 11:39:39 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 1
17/09/13 11:39:39 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
17/09/13 11:39:39 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
17/09/13 11:39:39 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
17/09/13 11:39:39 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
17/09/13 11:39:39 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/09/13 11:39:39 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/09/13 11:39:39 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/09/13 11:39:39 INFO util.GSet: VM type       = 64-bit
17/09/13 11:39:39 INFO util.GSet: 0.029999999329447746% max memory 3.9 GB = 1.2 MB
17/09/13 11:39:39 INFO util.GSet: capacity      = 2^17 = 131072 entries
17/09/13 11:39:39 INFO namenode.FSNamesystem: ACLs enabled? true
17/09/13 11:39:39 INFO namenode.FSNamesystem: XAttrs enabled? true
17/09/13 11:39:39 INFO namenode.FSNamesystem: Maximum size of an xattr: 16384
Running in non-interactive mode, and data appears to exist in QJM to [172.16.120.31:8485, 172.16.120.32:8485, 172.16.120.46:8485]. Not formatting.
17/09/13 11:39:39 INFO util.ExitUtil: Exiting with status 1
17/09/13 11:39:39 INFO namenode.NameNode: SHUTDOWN_MSG: 

When I try to run the format manually with this command:

 

 exec /usr/lib64/cmf/service/hdfs/hdfs.sh format-namenode cluster14

 

it fails with:

 

Wed Sep 13 13:20:59 ICT 2017
Wed Sep 13 13:20:59 ICT 2017
+ source_parcel_environment
+ '[' '!' -z '' ']'
+ locate_cdh_java_home
+ '[' -z /usr/java/jdk1.7.0_67-cloudera ']'
+ verify_java_home
+ '[' -z /usr/java/jdk1.7.0_67-cloudera ']'
+ echo JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
+ . /usr/lib64/cmf/service/common/cdh-default-hadoop
++ [[ -z 5 ]]
++ '[' 5 = 3 ']'
++ '[' 5 = -3 ']'
++ '[' 5 -ge 4 ']'
++ export HADOOP_HOME_WARN_SUPPRESS=true
++ HADOOP_HOME_WARN_SUPPRESS=true
++ export HADOOP_PREFIX=
++ HADOOP_PREFIX=
++ export HADOOP_LIBEXEC_DIR=/libexec
++ HADOOP_LIBEXEC_DIR=/libexec
++ export HADOOP_CONF_DIR=/var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format
++ HADOOP_CONF_DIR=/var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format
++ export HADOOP_COMMON_HOME=
++ HADOOP_COMMON_HOME=
++ export HADOOP_HDFS_HOME=
++ HADOOP_HDFS_HOME=
++ export HADOOP_MAPRED_HOME=
++ HADOOP_MAPRED_HOME=
++ '[' 5 = 4 ']'
++ '[' 5 = 5 ']'
++ export HADOOP_YARN_HOME=
++ HADOOP_YARN_HOME=
++ replace_pid
++ echo
++ sed 's#{{PID}}#35248#g'
+ export HADOOP_NAMENODE_OPTS=
+ HADOOP_NAMENODE_OPTS=
++ replace_pid
++ echo
++ sed 's#{{PID}}#35248#g'
+ export HADOOP_DATANODE_OPTS=
+ HADOOP_DATANODE_OPTS=
++ replace_pid
++ echo
++ sed 's#{{PID}}#35248#g'
+ export HADOOP_SECONDARYNAMENODE_OPTS=
+ HADOOP_SECONDARYNAMENODE_OPTS=
++ replace_pid
++ echo
++ sed 's#{{PID}}#35248#g'
+ export HADOOP_NFS3_OPTS=
+ HADOOP_NFS3_OPTS=
++ replace_pid
++ echo
++ sed 's#{{PID}}#35248#g'
+ export HADOOP_JOURNALNODE_OPTS=
+ HADOOP_JOURNALNODE_OPTS=
+ '[' 5 -ge 4 ']'
+ HDFS_BIN=/bin/hdfs
+ export 'HADOOP_OPTS=-Djava.net.preferIPv4Stack=true '
+ HADOOP_OPTS='-Djava.net.preferIPv4Stack=true '
+ echo 'using /usr/java/jdk1.7.0_67-cloudera as JAVA_HOME'
using /usr/java/jdk1.7.0_67-cloudera as JAVA_HOME
+ echo 'using 5 as CDH_VERSION'
using 5 as CDH_VERSION
+ echo 'using /var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format as CONF_DIR'
using /var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format as CONF_DIR
+ echo 'using hdfs as SECURE_USER'
using hdfs as SECURE_USER
+ echo 'using hadoop as SECURE_GROUP'
using hadoop as SECURE_GROUP
+ set_hadoop_classpath
+ set_classpath_in_var HADOOP_CLASSPATH
+ '[' -z HADOOP_CLASSPATH ']'
+ [[ -n /usr/share/cmf ]]
++ find /usr/share/cmf/lib/plugins -maxdepth 1 -name '*.jar'
++ tr '\n' :
+ ADD_TO_CP=/usr/share/cmf/lib/plugins/event-publish-5.10.0-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.10.0.jar:
+ [[ -n '' ]]
+ eval 'OLD_VALUE=$HADOOP_CLASSPATH'
++ OLD_VALUE=
+ NEW_VALUE=/usr/share/cmf/lib/plugins/event-publish-5.10.0-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.10.0.jar:
+ export HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-5.10.0-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.10.0.jar
+ HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-5.10.0-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.10.0.jar
+ set -x
+ replace_conf_dir
+ echo CONF_DIR=/var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format
CONF_DIR=/var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format
+ echo CMF_CONF_DIR=/etc/cloudera-scm-agent
CMF_CONF_DIR=/etc/cloudera-scm-agent
+ EXCLUDE_CMF_FILES=('cloudera-config.sh' 'httpfs.sh' 'hue.sh' 'impala.sh' 'sqoop.sh' 'supervisor.conf' '*.log' '*.keytab' '*jceks')
++ printf '! -name %s ' cloudera-config.sh httpfs.sh hue.sh impala.sh sqoop.sh supervisor.conf '*.log' hdfs.keytab '*jceks'
+ find /var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format -type f '!' -path '/var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format/logs/*' '!' -name cloudera-config.sh '!' -name httpfs.sh '!' -name hue.sh '!' -name impala.sh '!' -name sqoop.sh '!' -name supervisor.conf '!' -name '*.log' '!' -name hdfs.keytab '!' -name '*jceks' -exec perl -pi -e 's#{{CMF_CONF_DIR}}#/var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format#g' '{}' ';'
+ make_scripts_executable
+ find /var/run/cloudera-scm-agent/process/5020-hdfs-NAMENODE-format -regex '.*\.\(py\|sh\)$' -exec chmod u+x '{}' ';'
+ '[' DATANODE_MAX_LOCKED_MEMORY '!=' '' ']'
+ ulimit -l
64
+ export HADOOP_IDENT_STRING=hdfs
+ HADOOP_IDENT_STRING=hdfs
+ '[' -n '' ']'
+ '[' mkdir '!=' format-namenode ']'
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = format-namenode ']'
+ '[' file-operation = format-namenode ']'
+ '[' bootstrap = format-namenode ']'
+ '[' failover = format-namenode ']'
+ '[' transition-to-active = format-namenode ']'
+ '[' initializeSharedEdits = format-namenode ']'
+ '[' initialize-znode = format-namenode ']'
+ '[' format-namenode = format-namenode ']'
+ '[' -z '' ']'
+ echo 'No storage dirs specified.'
No storage dirs specified.

The hdfs-site.xml configuration looks like this:

 

  <property>
    <name>dfs.ha.namenodes.nameservice1</name>
    <value>namenode108,namenode123</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir.nameservice1.namenode108</name>
    <value>file:///mnt/disk1/dfs/nn,file:///mnt/disk2/dfs/nn</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir.nameservice1.namenode108</name>
    <value>qjournal://namenode1.xx:8485;namenode2.xx:8485;service.moph.com:8485/nameservice1</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservice1.namenode108</name>
    <value>namenode1.xx:8020</value>
  </property>
  <property>
    <name>dfs.namenode.servicerpc-address.nameservice1.namenode108</name>
    <value>namenode1.xx:8022</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.nameservice1.namenode108</name>
    <value>namenode1.xx:50070</value>
  </property>
  <property>
    <name>dfs.namenode.https-address.nameservice1.namenode108</name>
    <value>namenode1.xx:50470</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir.nameservice1.namenode123</name>
    <value>file:///mnt/disk1/dfs/nn,file:///mnt/disk2/dfs/nn</value>
  </property>
1 ACCEPTED SOLUTION

Champion
I think 'No storage dirs specified.' is referencing your dfs.data.dirs. Also, it is possible that the env vars like HADOOP_CONF_DIR are not set correctly for the session you are running that command in.
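To rule out the environment question, a quick check along these lines may help. This is only a sketch: `check_hadoop_conf_dir` is a made-up helper name, and it only confirms that `HADOOP_CONF_DIR` is set and contains an `hdfs-site.xml`, nothing more.

```shell
# Hypothetical helper (not part of Hadoop or Cloudera Manager):
# report whether HADOOP_CONF_DIR points at a directory with an hdfs-site.xml.
check_hadoop_conf_dir() {
  local conf_dir="${HADOOP_CONF_DIR:-}"
  if [ -z "$conf_dir" ]; then
    echo "HADOOP_CONF_DIR is not set"
    return 1
  fi
  if [ ! -f "$conf_dir/hdfs-site.xml" ]; then
    echo "no hdfs-site.xml in $conf_dir"
    return 1
  fi
  echo "using $conf_dir/hdfs-site.xml"
}
```

If the check passes, something like `hdfs getconf -confKey dfs.namenode.name.dir.nameservice1.namenode108` (key name taken from the hdfs-site.xml posted above) should show which NameNode storage dirs that session actually resolves.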

As for the JN error, it seems the NN is being formatted while data already exists in the JN edits directory. Was NN HA working prior to Kerberos being enabled? If you are cool with formatting the NN, then you are likely fine with manually removing the data in the JN edits directory. I would back it up just in case, then remove it and see whether the NN can come online.

Also, did you have NN HA enabled and then disabled it? This is the only time I have seen data already in place in the JN edits directory. Rolling back NN HA in CM does not clear out this data.
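For the back-up-then-remove step, a rough sketch of what I mean is below. `backup_jn_edits` is a hypothetical helper, and it assumes the usual layout where each JournalNode keeps its data under `<dfs.journalnode.edits.dir>/<nameservice>`. It moves the data aside rather than deleting it, so it can be put back.

```shell
# Hypothetical helper: move one nameservice's JournalNode data into a
# timestamped backup directory instead of deleting it.
# Arguments: <jn_edits_dir> <nameservice> <backup_root>
backup_jn_edits() {
  local edits_dir="$1" nameservice="$2" backup_root="$3"
  local src="$edits_dir/$nameservice"
  local dest="$backup_root/$nameservice.$(date +%Y%m%d%H%M%S)"

  if [ ! -d "$src" ]; then
    echo "nothing to back up: $src does not exist" >&2
    return 1
  fi
  mkdir -p "$backup_root" || return 1
  mv "$src" "$dest" || return 1
  # Print where the data went so it can be restored later.
  echo "$dest"
}
```

You would stop the JournalNode role first and run it on each JN host, for example `backup_jn_edits /dfs/jn nameservice1 /root/jn-backup`. Both paths there are assumptions; use whatever `dfs.journalnode.edits.dir` is set to in your cluster.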

Explorer
Thank you very much. You're right. In the end I had to rebuild the whole cluster from scratch.