
Unable to Start DataNode


Hi Guys,

I'm unable to start the DataNode in Ambari 2.1.2. I have tried all the solutions suggested in the community and on the Internet, without success.

Here are the relevant error files from

/var/lib/ambari-agent/data/
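
The newest command output and error files can be pulled from that directory like this (a minimal sketch, assuming the default Ambari agent layout, where each command is logged as numbered output-N.txt and errors-N.txt files):

$ ls -t /var/lib/ambari-agent/data/errors-*.txt | head -1            # most recent error file
$ tail -n 50 $(ls -t /var/lib/ambari-agent/data/errors-*.txt | head -1)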

5 REPLIES

Master Mentor

@Florin Miron

Yeah, that's unfortunate; the files you have attached are not sufficient to diagnose this. Can you upload the files below?

/var/log/hadoop/hdfs/hadoop-hdfs-datanode-node.bigdata.com.log 
/var/log/ambari-server/ambari-server.log

These could help with the diagnosis.
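
If the full files are too large to attach, the tail of each is usually enough (a sketch; adjust the DataNode log file name to match your host):

$ tail -n 200 /var/log/hadoop/hdfs/hadoop-hdfs-datanode-node.bigdata.com.log > /tmp/datanode-tail.log
$ tail -n 200 /var/log/ambari-server/ambari-server.log > /tmp/ambari-server-tail.log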


These are the files:

Master Mentor

@Florin Miron

Run the command below to check whether JCE is unlimited. To avoid several Kerberos issues, the Java Cryptography Extension (JCE) should be unlimited. If JCE is unlimited, 'local_policy.jar' will contain the keyword 'CryptoAllPermission':

$ zipgrep CryptoAllPermission $JAVA_HOME/jre/lib/security/local_policy.jar 

In my case the output is as below, which is the expected result:

$ zipgrep CryptoAllPermission /usr/jdk64/jdk1.8.0_112/jre/lib/security/local_policy.jar 
default_local.policy: permission javax.crypto.CryptoAllPermission;
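
As a cross-check (a sketch, using the same JDK path as above), you can also ask the JVM directly for the maximum allowed AES key length; with the unlimited policy installed it reports 2147483647, with the default restricted policy it reports 128:

$ /usr/jdk64/jdk1.8.0_112/bin/jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'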

How to Install JCE

  1. Go to the Oracle Java SE downloads page: http://www.oracle.com/technetwork/java/javase/downloads/index.html
  2. Scroll down to the "Additional Resources" section, where you will find "Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files".
  3. Download the version that matches your installed JVM, e.g. UnlimitedJCEPolicyJDK7.zip.
  4. Unzip the downloaded archive.
  5. Copy local_policy.jar and US_export_policy.jar into $JAVA_HOME/jre/lib/security (note: these jars are already there, so you have to overwrite them).
  6. Restart your application to get rid of this exception.

Then try to restart the datanode.
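
Steps 4-5 condensed into commands (a sketch only; the archive name, the extracted directory name, and JAVA_HOME below are assumptions, so adjust them to your JDK and your download):

$ export JAVA_HOME=/usr/jdk64/jdk1.8.0_112                      # adjust to your JDK
$ cd /tmp && unzip UnlimitedJCEPolicyJDK7.zip                   # typically extracts into UnlimitedJCEPolicy/
$ cp $JAVA_HOME/jre/lib/security/local_policy.jar{,.bak}        # keep backups of the restricted jars
$ cp $JAVA_HOME/jre/lib/security/US_export_policy.jar{,.bak}
$ cp UnlimitedJCEPolicy/*.jar $JAVA_HOME/jre/lib/security/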


Thanks, that worked and the DataNode now starts, but the NameNode is unable to start. Here is the error from

/var/log/hadoop/hdfs/hadoop-hdfs-namenode-node.bigdata.com.log 

2017-10-30 12:38:08,034 INFO  namenode.NameNode (LogAdapter.java:info(47)) - registered UNIX signal handlers for [TERM, HUP, INT]
2017-10-30 12:38:08,040 INFO  namenode.NameNode (NameNode.java:createNameNode(1559)) - createNameNode []
2017-10-30 12:38:12,036 INFO  impl.MetricsConfig (MetricsConfig.java:loadFirst(112)) - loaded properties from hadoop-metrics2.properties
2017-10-30 12:38:12,433 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(63)) - Initializing Timeline metrics sink.
2017-10-30 12:38:12,435 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(81)) - Identified hostname = node.bigdata.com, serviceName = namenode
2017-10-30 12:38:12,605 INFO  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:init(93)) - Collector Uri: http://node.bigdata.com:6188/ws/v1/timeline/metrics
2017-10-30 12:38:12,630 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:start(206)) - Sink timeline started
2017-10-30 12:38:12,894 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(375)) - Scheduled snapshot period at 60 second(s).
2017-10-30 12:38:12,894 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:start(192)) - NameNode metrics system started
2017-10-30 12:38:12,899 INFO  namenode.NameNode (NameNode.java:setClientNamenodeAddress(424)) - fs.defaultFS is hdfs://node.bigdata.com:8020
2017-10-30 12:38:12,899 INFO  namenode.NameNode (NameNode.java:setClientNamenodeAddress(444)) - Clients are to use node.bigdata.com:8020 to access this namenode/service.
2017-10-30 12:38:16,007 INFO  util.JvmPauseMonitor (JvmPauseMonitor.java:run(179)) - Starting JVM pause monitor
2017-10-30 12:38:16,132 INFO  hdfs.DFSUtil (DFSUtil.java:httpServerTemplateForNNAndJN(1762)) - Starting Web-server for hdfs at: http://node.bigdata.com:50070
2017-10-30 12:38:16,478 INFO  mortbay.log (Slf4jLog.java:info(67)) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2017-10-30 12:38:16,542 INFO  server.AuthenticationFilter (AuthenticationFilter.java:constructSecretProvider(294)) - Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2017-10-30 12:38:16,591 INFO  http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.namenode is not defined
2017-10-30 12:38:16,597 INFO  http.HttpServer2 (HttpServer2.java:addGlobalFilter(710)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2017-10-30 12:38:16,599 INFO  http.HttpServer2 (HttpServer2.java:addFilter(685)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context hdfs
2017-10-30 12:38:16,600 INFO  http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2017-10-30 12:38:16,600 INFO  http.HttpServer2 (HttpServer2.java:addFilter(693)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2017-10-30 12:38:16,600 INFO  security.HttpCrossOriginFilterInitializer (HttpCrossOriginFilterInitializer.java:initFilter(49)) - CORS filter not enabled. Please set hadoop.http.cross-origin.enabled to 'true' to enable it
2017-10-30 12:38:16,768 INFO  http.HttpServer2 (NameNodeHttpServer.java:initWebHdfs(89)) - Added filter 'org.apache.hadoop.hdfs.web.AuthFilter' (class=org.apache.hadoop.hdfs.web.AuthFilter)
2017-10-30 12:38:16,771 INFO  http.HttpServer2 (HttpServer2.java:addJerseyResourcePackage(609)) - addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.namenode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2017-10-30 12:38:16,947 INFO  http.HttpServer2 (HttpServer2.java:openListeners(915)) - Jetty bound to port 50070
2017-10-30 12:38:16,947 INFO  mortbay.log (Slf4jLog.java:info(67)) - jetty-6.1.26.hwx
2017-10-30 12:38:20,069 WARN  mortbay.log (Slf4jLog.java:warn(76)) - Can't reuse /tmp/Jetty_node_bigdata_com_50070_hdfs____s7vqly, using /tmp/Jetty_node_bigdata_com_50070_hdfs____s7vqly_645589693448351738
2017-10-30 12:38:20,789 INFO  mortbay.log (Slf4jLog.java:info(67)) - Started HttpServer2$SelectChannelConnectorWithSafeStartup@node.bigdata.com:50070
2017-10-30 12:38:20,901 WARN  common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2017-10-30 12:38:20,902 WARN  common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2017-10-30 12:38:20,902 WARN  namenode.FSNamesystem (FSNamesystem.java:checkConfiguration(654)) - Only one image storage directory (dfs.namenode.name.dir) configured. Beware of data loss due to lack of redundant storage directories!
2017-10-30 12:38:20,902 WARN  namenode.FSNamesystem (FSNamesystem.java:checkConfiguration(659)) - Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of data loss due to lack of redundant storage directories!
2017-10-30 12:38:20,917 WARN  common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2017-10-30 12:38:20,918 WARN  common.Util (Util.java:stringAsURI(56)) - Path /hadoop/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
2017-10-30 12:38:23,756 WARN  common.Storage (NNStorage.java:setRestoreFailedStorage(210)) - set restore failed storage to true
2017-10-30 12:38:23,915 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(723)) - No KeyProvider found.
2017-10-30 12:38:23,916 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(729)) - Enabling async auditlog
2017-10-30 12:38:23,918 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(733)) - fsLock is fair:false
2017-10-30 12:38:24,302 INFO  blockmanagement.HeartbeatManager (HeartbeatManager.java:<init>(90)) - Setting heartbeat recheck interval to 30000 since dfs.namenode.stale.datanode.interval is less than dfs.namenode.heartbeat.recheck-interval
2017-10-30 12:38:24,332 INFO  blockmanagement.DatanodeManager (DatanodeManager.java:<init>(239)) - dfs.block.invalidate.limit=1000
2017-10-30 12:38:24,332 INFO  blockmanagement.DatanodeManager (DatanodeManager.java:<init>(245)) - dfs.namenode.datanode.registration.ip-hostname-check=true
2017-10-30 12:38:24,334 INFO  blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(71)) - dfs.namenode.startup.delay.block.deletion.sec is set to 000:01:00:00.000
2017-10-30 12:38:24,335 INFO  blockmanagement.BlockManager (InvalidateBlocks.java:printBlockDeletionTime(76)) - The block deletion will start around 2017 Oct 30 13:38:24
2017-10-30 12:38:24,337 INFO  util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map BlocksMap
2017-10-30 12:38:24,337 INFO  util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type       = 64-bit
2017-10-30 12:38:24,355 INFO  util.GSet (LightWeightGSet.java:computeCapacity(356)) - 2.0% max memory 1011.3 MB = 20.2 MB
2017-10-30 12:38:24,355 INFO  util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity      = 2^21 = 2097152 entries
2017-10-30 12:38:24,367 INFO  blockmanagement.BlockManager (BlockManager.java:createBlockTokenSecretManager(388)) - dfs.block.access.token.enable=true
2017-10-30 12:38:24,367 INFO  blockmanagement.BlockManager (BlockManager.java:createBlockTokenSecretManager(409)) - dfs.block.access.key.update.interval=600 min(s), dfs.block.access.token.lifetime=600 min(s), dfs.encrypt.data.transfer.algorithm=null
2017-10-30 12:38:27,858 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(374)) - defaultReplication         = 3
2017-10-30 12:38:27,858 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(375)) - maxReplication             = 50
2017-10-30 12:38:27,858 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(376)) - minReplication             = 1
2017-10-30 12:38:27,858 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(377)) - maxReplicationStreams      = 2
2017-10-30 12:38:27,858 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(378)) - replicationRecheckInterval = 3000
2017-10-30 12:38:27,859 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(379)) - encryptDataTransfer        = false
2017-10-30 12:38:27,859 INFO  blockmanagement.BlockManager (BlockManager.java:<init>(380)) - maxNumBlocksToLog          = 1000
2017-10-30 12:38:27,865 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(753)) - fsOwner             = hdfs (auth:SIMPLE)
2017-10-30 12:38:27,865 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(754)) - supergroup          = hdfs
2017-10-30 12:38:27,865 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(755)) - isPermissionEnabled = true
2017-10-30 12:38:27,866 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(766)) - HA Enabled: false
2017-10-30 12:38:27,868 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(803)) - Append Enabled: true
2017-10-30 12:38:27,903 INFO  util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map INodeMap
2017-10-30 12:38:27,904 INFO  util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type       = 64-bit
2017-10-30 12:38:27,904 INFO  util.GSet (LightWeightGSet.java:computeCapacity(356)) - 1.0% max memory 1011.3 MB = 10.1 MB
2017-10-30 12:38:27,904 INFO  util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity      = 2^20 = 1048576 entries
2017-10-30 12:38:27,905 INFO  namenode.FSDirectory (FSDirectory.java:<init>(247)) - ACLs enabled? false
2017-10-30 12:38:27,906 INFO  namenode.FSDirectory (FSDirectory.java:<init>(251)) - XAttrs enabled? true
2017-10-30 12:38:27,906 INFO  namenode.FSDirectory (FSDirectory.java:<init>(259)) - Maximum size of an xattr: 16384
2017-10-30 12:38:27,906 INFO  namenode.NameNode (FSDirectory.java:<init>(312)) - Caching file names occuring more than 10 times
2017-10-30 12:38:27,914 INFO  util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map cachedBlocks
2017-10-30 12:38:27,914 INFO  util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type       = 64-bit
2017-10-30 12:38:27,914 INFO  util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.25% max memory 1011.3 MB = 2.5 MB
2017-10-30 12:38:27,914 INFO  util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity      = 2^18 = 262144 entries
2017-10-30 12:38:27,916 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(5306)) - dfs.namenode.safemode.threshold-pct = 1.0
2017-10-30 12:38:27,917 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(5307)) - dfs.namenode.safemode.min.datanodes = 0
2017-10-30 12:38:27,917 INFO  namenode.FSNamesystem (FSNamesystem.java:<init>(5308)) - dfs.namenode.safemode.extension     = 30000
2017-10-30 12:38:27,921 INFO  metrics.TopMetrics (TopMetrics.java:logConf(65)) - NNTop conf: dfs.namenode.top.window.num.buckets = 10
2017-10-30 12:38:27,921 INFO  metrics.TopMetrics (TopMetrics.java:logConf(67)) - NNTop conf: dfs.namenode.top.num.users = 10
2017-10-30 12:38:27,921 INFO  metrics.TopMetrics (TopMetrics.java:logConf(69)) - NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2017-10-30 12:38:27,923 INFO  namenode.FSNamesystem (FSNamesystem.java:initRetryCache(913)) - Retry cache on namenode is enabled
2017-10-30 12:38:27,924 INFO  namenode.FSNamesystem (FSNamesystem.java:initRetryCache(921)) - Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2017-10-30 12:38:27,965 INFO  util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map NameNodeRetryCache
2017-10-30 12:38:27,965 INFO  util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type       = 64-bit
2017-10-30 12:38:27,966 INFO  util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.029999999329447746% max memory 1011.3 MB = 310.7 KB
2017-10-30 12:38:27,966 INFO  util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity      = 2^15 = 32768 entries
2017-10-30 12:38:28,002 WARN  namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(690)) - Encountered exception loading fsimage
java.io.FileNotFoundException: /hadoop/hdfs/namenode/in_use.lock (Permission denied)
    at java.io.RandomAccessFile.open0(Native Method)
    at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:706)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:678)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:499)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:322)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
2017-10-30 12:38:28,009 INFO  mortbay.log (Slf4jLog.java:info(67)) - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@node.bigdata.com:50070
2017-10-30 12:38:28,109 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping NameNode metrics system...
2017-10-30 12:38:28,110 INFO  impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2017-10-30 12:38:28,110 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - NameNode metrics system stopped.
2017-10-30 12:38:28,111 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(605)) - NameNode metrics system shutdown complete.
2017-10-30 12:38:28,111 ERROR namenode.NameNode (NameNode.java:main(1712)) - Failed to start namenode.
java.io.FileNotFoundException: /hadoop/hdfs/namenode/in_use.lock (Permission denied)
    at java.io.RandomAccessFile.open0(Native Method)
    at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:706)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:678)
    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:499)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:322)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:215)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
2017-10-30 12:38:28,112 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2017-10-30 12:38:28,192 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node.bigdata.com/192.1

Master Mentor

@Florin Miron

This error arises when the NameNode is not started as the hdfs user. You are trying to run it as root, but the file /hadoop/hdfs/namenode/in_use.lock is owned by the hdfs user. Do not use sudo to start Hadoop processes; start the NameNode as the hdfs user instead:

su -l hdfs -c "/usr/hdp/current/hadoop-hdfs-namenode/../hadoop/sbin/hadoop-daemon.sh start namenode"

Can you try that and report back?
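
If it still fails, it is worth confirming the ownership first (a sketch; hdfs:hadoop is the usual owner/group for this directory on HDP, so verify what your cluster expects before changing anything):

$ ls -ld /hadoop/hdfs/namenode                    # directory owner should be hdfs
$ ls -l  /hadoop/hdfs/namenode/in_use.lock        # shows who created the stale lock file
$ chown -R hdfs:hadoop /hadoop/hdfs/namenode      # hand any root-owned files back to hdfs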