Member since 01-25-2017

- Posts: 119
- Kudos Received: 7
- Solutions: 2

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 16857 | 04-11-2017 12:36 PM |
|  | 6173 | 01-18-2017 10:36 AM |

Posted 04-11-2017 12:36 PM

Apparently my previous answer was just a workaround. After the format everything looked fine, but an Ambari alert definition that checks for checkpoints fired. The alert description said that no checkpoint had been made for a long time (it triggers after 6 hours by default, but it had been more than 60 hours in my case). So what happened was: I formatted the namenodes, but the QJM was never told about it. The txid on the QJM was still the old value while the namenodes had a different one. As a result the QJM did not accept new edits from the active namenode, and the standby namenode couldn't get any edits from the QJM. After some more searching I found this very helpful article:

https://nicholasmaillard.wordpress.com/2015/07/20/formatting-hdfs/

The article describes the right steps to format HDFS on an HA-enabled cluster:

"The initial steps are very close

- Stop the HDFS service
- Start only the journal nodes (as they will need to be made aware of the formatting)
- On the first namenode (as user hdfs):
  - hadoop namenode -format
  - hdfs namenode -initializeSharedEdits -force (for the journal nodes)
  - hdfs zkfc -formatZK -force (to force ZooKeeper to reinitialise)
  - restart that first namenode
- On the second namenode:
  - hdfs namenode -bootstrapStandby -force (force sync with the first namenode)
- On every datanode, clear the data directory
- Restart the HDFS service"

These steps helped me overcome the issue.
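For reference, the same procedure condensed into a shell sketch. The commands are the ones quoted above; the datanode data path is only an example and depends on dfs.datanode.data.dir:

```bash
# Run as the hdfs user, with the HDFS service stopped and only the JournalNodes running.

# On the first namenode: format, then re-initialise the shared edits (QJM) and the ZKFC state.
hadoop namenode -format
hdfs namenode -initializeSharedEdits -force   # make the JournalNodes aware of the new namespace
hdfs zkfc -formatZK -force                    # force the ZooKeeper failover state to be reinitialised
# ...then start this namenode.

# On the second namenode: pull the freshly formatted namespace from the first one.
hdfs namenode -bootstrapStandby -force

# On every datanode: clear the data directories (path is an example; check dfs.datanode.data.dir).
rm -rf /hadoop/hdfs/data/*

# Finally, restart the HDFS service.
```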
						
					
Posted 04-07-2017 12:42 PM

Now I am able to start all services. I checked the namenode files under the namenode data directory (<dfs.namenode.name.dir> in hdfs-site.xml; it can also be found on the Configs tab if you are using Ambari). I compared the file sizes and modification dates of the fsimage and edit log files on both namenode hosts. As far as I know, the hdfs command runs over the HDFS client and works the same whether or not it is executed on a namenode host; still, I had run it on the host where one of the namenodes is installed. After checking the namenode files, I realised that the format had not affected the other namenode. So I ran the command on that node as well and then started the services. This way both namenodes started.
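A quick way to do that comparison is to look directly at the namenode metadata directory. A minimal sketch, assuming the metadata lives under /hadoop/hdfs/namenode (substitute your own dfs.namenode.name.dir value):

```bash
# On each namenode host: list the most recent fsimage and edits files with sizes and timestamps.
ls -lh /hadoop/hdfs/namenode/current/ | grep -E 'fsimage|edits' | tail -20

# The VERSION file records the namespaceID/clusterID the metadata was formatted with;
# after a successful format it should match on both namenodes.
cat /hadoop/hdfs/namenode/current/VERSION
```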
						
					
Posted 04-07-2017 11:14 AM

It seems like all the numbers are messed up now. While I am not sure, I think the epoch numbers are:

nn1: 21
nn2: 13
QJM: 0

I think I executed the format command at a very critical moment, and now there are multiple txids in play.

The JournalNodes say:

Client trying to move committed txid backward from 1865764 to 1

One of the namenodes says:

2017-04-07 13:54:05,483 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [10.0.109.20:8485, 10.0.109.21:8485, 10.0.109.22:8485], stream=null))
java.lang.AssertionError: Decided to synchronize log to startTxId: 1
endTxId: 1
isInProgress: true
 but logger 10.0.109:8485 had seen txid 1865764 committed

I need your help. Since the cluster does not hold any data yet, any solution is fine, including formatting any data and/or resetting any component.
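For anyone hitting the same mismatch, one way to cross-check those epoch and txid values is to read the JournalNode state files directly. This is only a sketch and assumes a default Ambari-style layout with the journal data under /hadoop/hdfs/journal/<nameservice>/current; check dfs.journalnode.edits.dir and your nameservice ID first:

```bash
# Run on each JournalNode host. Path and nameservice ID are assumptions.
JN_DIR=/hadoop/hdfs/journal/mycluster/current

cat "$JN_DIR/last-promised-epoch"   # highest epoch this JournalNode has promised
cat "$JN_DIR/last-writer-epoch"     # epoch of the last namenode that wrote here
cat "$JN_DIR/committed-txid"        # highest committed txid; a freshly formatted namenode
                                    # restarting from txid 1 triggers the error quoted above
```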
						
					
Posted 04-06-2017 08:54 AM

Hi,

In my HA cluster, both of my namenodes fail to start.

- CentOS 7.3
- Ambari 2.4.2
- HDP 2.5.3

Ambari stderr:

2017-04-06 10:49:49,039 - Getting jmx metrics from NN failed. URL: http://master02.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 38, in get_value_from_jmx
    _, data, _ = get_user_call_output(cmd, user=run_user, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
ExecutionFailed: Execution of 'curl -s 'http://master02.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' 1>/tmp/tmp0CNZmD 2>/tmp/tmpRAZgwz' returned 7. 
2017-04-06 10:49:51,041 - Getting jmx metrics from NN failed. URL: http://master03.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 38, in get_value_from_jmx
    _, data, _ = get_user_call_output(cmd, user=run_user, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
ExecutionFailed: Execution of 'curl -s 'http://master03.mydomain.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' 1>/tmp/tmp_hLNY7 2>/tmp/tmpoCOTt8' returned 7. 
...
(tries several times and then)
...
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 420, in <module>
    NameNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 101, in start
    upgrade_suspended=params.upgrade_suspended, env=env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 184, in namenode
    if is_this_namenode_active() is False:
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 55, in wrapper
    return function(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 562, in is_this_namenode_active
    raise Fail(format("The NameNode {namenode_id} is not listed as Active or Standby, waiting..."))
resource_management.core.exceptions.Fail: The NameNode nn1 is not listed as Active or Standby, waiting...

Ambari stdout:

2017-04-06 10:53:20,521 - call returned (255, '17/04/06 10:53:20 INFO ipc.Client: Retrying connect to server: master03.mydomain.local/10.0.109.21:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\n17/04/06 10:53:20 WARN ipc.Client: Failed to connect to server: master03.mydomain.local/10.0.109.21:8020: retries get failed due to exceeded maximum allowed retries number: 1
2017-04-06 10:53:20,522 - No active NameNode was found after 5 retries. Will return current NameNode HA states

Namenode log:

2017-04-06 10:11:43,561 FATAL Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [10.0.109.20:8485, 10.0.109.21:8485, 10.0.109.22:8485], stream=null)) java.lang.AssertionError: Decided to synchronize log to startTxId: 1 endTxId: 1 isInProgress: true but logger 10.0.109.20:8485 had seen txid 1865764 committed at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnclosedSegment(QuorumJournalManager.java:336) at (some class at some other class at ...)

Some more logs from the namenode(s):

2017-04-05 17:13:42,894 WARN  client.QuorumJournalManager (IPCLoggerChannel.java:call(388)) - Remote journal 10.0.109.22:8485 failed to write txns 1865765-1865765. Will try to write to this JN again after the next log roll.
org.apache.hadoop.ipc.RemoteException(java.io.IOException): IPC's epoch 13 is not the current writer epoch  0
        at org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:459)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:351)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
2017-04-06 10:11:42,380 INFO  ipc.Server (Server.java:logException(2401)) - IPC Server handler 72 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.sendHeartbeat from 9.1.10.14:37173 Call#2322 Retry#0
org.apache.hadoop.ipc.RetriableException: NameNode still not started
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkNNStartup(NameNodeRpcServer.java:2057)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java:1414)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.sendHeartbeat(DatanodeProtocolServerSideTranslatorPB.java:118)
        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:29064)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2017-04-06 10:11:42,390 INFO  namenode.NameNode (NameNode.java:startCommonServices(876)) - NameNode RPC up at: bigm02.etstur.local/9.1.10.21:8020
2017-04-06 10:11:42,391 INFO  namenode.FSNamesystem (FSNamesystem.java:startStandbyServices(1286)) - Starting services required for standby state
2017-04-06 10:11:42,393 INFO  ha.EditLogTailer (EditLogTailer.java:<init>(117)) - Will roll logs on active node at bigm03.etstur.local/9.1.10.22:8020 every 120 seconds.
2017-04-06 10:11:42,397 INFO  ha.StandbyCheckpointer (StandbyCheckpointer.java:start(129)) - Starting standby checkpoint thread...
Checkpointing active NN at http://bigm03.etstur.local:50070
Serving checkpoints at http://bigm02.etstur.local:50070
2017-04-06 10:11:43,371 INFO  namenode.FSNamesystem (FSNamesystem.java:stopStandbyServices(1329)) - Stopping services started for standby state
2017-04-06 10:11:43,372 WARN  ha.EditLogTailer (EditLogTailer.java:doWork(349)) - Edit log tailer interrupted
java.lang.InterruptedException: sleep interrupted
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:347)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:476)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
2017-04-06 10:11:43,475 INFO  namenode.FSNamesystem (FSNamesystem.java:startActiveServices(1130)) - Starting services required for active state
2017-04-06 10:11:43,485 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(435)) - Starting recovery process for unclosed journal segments...
2017-04-06 10:11:43,534 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnfinalizedSegments(437)) - Successfully started new epoch 17
2017-04-06 10:11:43,535 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(263)) - Beginning recovery of unclosed segment starting at txid 1
2017-04-06 10:11:43,557 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(272)) - Recovery prepare phase complete. Responses:
9.1.10.20:8485: segmentState { startTxId: 1 endTxId: 1 isInProgress: true } lastWriterEpoch: 14 lastCommittedTxId: 1865764
9.1.10.21:8485: segmentState { startTxId: 1 endTxId: 1 isInProgress: true } lastWriterEpoch: 14 lastCommittedTxId: 1865764
2017-04-06 10:11:43,560 INFO  client.QuorumJournalManager (QuorumJournalManager.java:recoverUnclosedSegment(296)) - Using longest log: 9.1.10.20:8485=segmentState {
  startTxId: 1
  endTxId: 1
  isInProgress: true
}
lastWriterEpoch: 14
lastCommittedTxId: 1865764
2017-04-06 10:11:43,561 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [9.1.10.20:8485, 9.1.10.21:8485, 9.1.10.22:8485], stream=null))
java.lang.AssertionError: Decided to synchronize log to startTxId: 1
endTxId: 1
isInProgress: true
 but logger 9.1.10.20:8485 had seen txid 1865764 committed
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnclosedSegment(QuorumJournalManager.java:336)
        at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:455)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1459)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1139)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1915)
        at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
        at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1783)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1631)
        at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
        at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2017-04-06 10:11:43,562 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2017-04-06 10:11:43,563 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bigm02.etstur.local/9.1.10.21
************************************************************/
And although the JournalNodes started successfully, they log the following error, which also looks suspicious:

2017-04-05 17:15:05,653 ERROR RECEIVED SIGNAL 15: SIGTERM

The funny thing is the order of execution. There are the ZKFailoverController processes, the JournalNodes and the NameNodes. I always thought the NameNode was the last one to start, and that which of them becomes active had already been decided by the ZKFailoverController. But as I follow the logs, both namenodes check each other. What is that for? Even the ZKFailoverController wants to reach the namenodes before it starts itself.

And in case you want to know, the background of this error is as follows. Yesterday I noticed that one of the datanodes had failed and stopped. There were the following errors in its logs:

2017-04-05 15:50:11,168 ERROR datanode.DataNode (BPServiceActor.java:run(752)) - Initialization failed for Block pool <registering> (Datanode Uuid be2286f5-00d7-4758-b89a-45e2304cabe3) service to master02.mydomain.local/10.0.109.23:8020. Exiting. java.io.IOException: All specified directories are failed to load. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:596) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1483) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1448) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:319) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:267) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:740) at java.lang.Thread.run(Thread.java:745)

2017-04-05 15:50:11,168 ERROR datanode.DataNode (BPServiceActor.java:run(752)) - Initialization failed for Block pool <registering> (Datanode Uuid be2286f5-00d7-4758-b89a-45e2304cabe3) service to master02.mydomain.local/10.0.109.23:8020. Exiting. org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 13, volumes configured: 14, volumes failed: 1, volume failures tolerated: 0

2017-04-05 17:15:36,968 INFO  common.Storage (Storage.java:tryLock(774)) - Lock on /grid/13/hadoop/hdfs/data/in_use.lock acquired by nodename 31353@data02.mydomain.local

Despite the volume errors I was able to browse /grid/13/, so I wanted to try the answers in this Stack Overflow question:

http://stackoverflow.com/questions/22316187/datanode-not-starts-correctly

First I deleted the data folder under /grid/13/hadoop/hdfs (/grid/13/hadoop/hdfs/data) and tried to start the datanode. It failed again with the same errors, so I went for a namenode format. My cluster is new and empty, so I am fine with any solution, including formats:

./hdfs namenode -format -clusterId <myClusterId>

After this format one of the namenodes failed; when I tried to restart all HDFS services, both namenodes failed.

Any comments appreciated.
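For context, the value passed to -clusterId can be read from the VERSION file of the existing metadata before formatting. A minimal check, assuming the namenode directory is /hadoop/hdfs/namenode (use your own dfs.namenode.name.dir value):

```bash
# Shows clusterID, namespaceID and blockpoolID of the existing metadata.
# Path is an example; substitute the directory from dfs.namenode.name.dir.
grep -E 'clusterID|namespaceID|blockpoolID' /hadoop/hdfs/namenode/current/VERSION
```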
						
					
Labels: Apache Hadoop

Posted 04-04-2017 07:43 AM

The answer appeared in the second comment. Thanks @Arpit Agarwal!

By the way, the port number is 1022. In the /etc/hadoop/conf/hdfs-site.xml file:

  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:1022</value>
  </property>

Yes, my environment is Kerberos-enabled. The port was already shown on the NameNode UI > Datanodes tab > "Http Address" column.
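If you want to check that page from the command line on a Kerberos-enabled cluster, something along these lines should work (the host name is a placeholder, and curl's --negotiate needs a valid ticket from kinit):

```bash
# Get a ticket first, e.g. kinit <your-principal>, then query the datanode web endpoint
# on the secured HTTP port (1022 in this setup). /jmx returns the metrics as JSON.
curl --negotiate -u : "http://data01.mydomain.local:1022/jmx"
```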
						
					
Posted 04-03-2017 06:45 PM

Hi @Arpit Agarwal,

Thank you for your answer. Actually, you are right; I checked on a friend's installation and the page is there. Unfortunately that URL is not available in my environment:

This site can't be reached
10.0.109.23 refused to connect.
ERR_CONNECTION_REFUSED

I noticed an abnormal situation in my environment: I had two datanode instances running (-Dproc_datanode), one started by root and the other by hdfs. It should be hdfs, so I killed the one started by root. I expected to see the page, but it still wasn't there. I checked the service in Ambari, and it seems Ambari manages the process that was started by root: it was shown as stopped. I need to investigate more to find out what is going on 😕
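A quick way to spot duplicate datanode JVMs and see which user owns them (a simple sketch, not specific to HDP):

```bash
# List running datanode JVMs together with the users that started them.
ps -eo user,pid,cmd | grep -- -Dproc_datanode | grep -v grep
```

Note that on a Kerberos-enabled cluster using a privileged port such as 1022, the datanode is normally launched through jsvc, so a root-owned launcher process next to the hdfs-owned worker process can be expected.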
						
					
Posted 04-03-2017 01:44 PM

Hi,

Are the JMX info pages of the datanodes deprecated? I remember there used to be pages at http://<datanode>:50070/, and I also see similar tips for the datanode port (50075) on this page:

https://www.datadoghq.com/blog/collecting-hadoop-metrics/

On HDP 2.5.3 I can't see them. Do I remember wrong? Is this something I can enable or disable?
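For reference, on a non-secured HDP 2.x cluster the datanode metrics should still be reachable over plain HTTP, for example (the host is a placeholder):

```bash
# The default datanode web port is 50075; /jmx returns all exposed metrics as JSON.
curl -s "http://data01.mydomain.local:50075/jmx"
```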
						
					
Labels: Apache Hadoop

Posted 03-20-2017 06:46 AM

Worked! It was the upper case! Thank you @Robert Levas!
						
					
Posted 03-16-2017 01:28 PM

Follow-up question...

Changing the AD realm name seems to be a long and risky process. Can anybody confirm, @Robert Levas? Does anybody have comments on this? The thing is, I didn't see this mentioned in any guide; I see it here for the first time. I even read in a Hortonworks blog that host names and domain names should be in lower case:

https://hortonworks.com/blog/enabling-kerberos-hdp-active-directory-integration/

"DOMAIN_REALM provides a translation from a domain name or hostname to a Kerberos realm name. The tag name can be a host name, or a domain name, where domain names are indicated by a prefix of a period (.). Host names and domain names should be in lower case."
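To make the quoted paragraph concrete, here is a hypothetical [domain_realm] section from krb5.conf: the domain and host names stay lower case, while the realm they map to is conventionally upper case (all names below are placeholders):

```
[domain_realm]
  .mydomain.local = MYDOMAIN.LOCAL
  mydomain.local = MYDOMAIN.LOCAL
```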
						
					
Posted 03-15-2017 08:02 PM

@Robert Levas thanks for the answer. I'll try that tomorrow morning (in 12 hours).

There was a setting before starting kerberization, under the Advanced kerberos-env section: "Enable case insensitive username rules". I had checked this one, so I thought it would be enough. Isn't it? If not, does it mean we cannot connect an HDP cluster to an Active Directory whose realm was defined with lower-case letters?
						
					