Member since 03-11-2016

36 Posts
1 Kudos Received
1 Solution

        My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
|  | 2066 | 10-06-2016 03:26 PM |

09-16-2020 03:27 PM

I believe this will fail if you stop your job today and run it tomorrow: "now" will resolve to a different day, and you will miss the data in between.
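A common way around that (an editorial sketch, not from the thread; the script name and its --date flag are illustrative) is to pin the target date once when the run is scheduled, instead of evaluating "now" at execution time:

    # capture the intended business date at schedule time (GNU date)
    RUN_DATE=$(date -d "yesterday" +%F)
    # pass it explicitly, so a delayed run or a re-run still processes the same day
    ./ingest_job.sh --date "$RUN_DATE"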
						
					
			
    
	
		
		
04-09-2018 08:57 AM

@Geoffrey Shelton Okot Thanks for the update. It worked. I also want to add one thing: one of my NameNode ports was occupied by a previously running instance (java.net.BindException: Port in use: 0.0.0.0:50070), and Ambari was not showing any message for that, so I checked my NameNode logs on the server itself. Killing the old PID and restarting did the trick.
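For anyone hitting the same BindException, a minimal sketch of that check (an editorial aside; 50070 is the NameNode web UI port from the error message, and the PID placeholder is illustrative):

    # find the stale process still bound to the NameNode web UI port
    netstat -tlnp | grep :50070    # or: lsof -i :50070
    # kill it using the PID reported above, then restart the NameNode via Ambari
    kill -9 <PID>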
						
					
			
    
	
		
		
04-07-2018 08:40 PM

I have installed MIT Kerberos on one Linux server, and through Ambari's automated wizard we tried to Kerberize our dev cluster. Ambari created all the principals for each node (3 DataNodes, 2 NameNodes and one edge node), and I can see them in the KDC. While starting all the services in the last step, it failed; the NameNode services are not coming up. Before attempting this on our dev cluster I did the same steps on the Sandbox and it worked. But on the cluster there is a slight difference: it is an HA cluster, and each node has two IPs, one external on which we can ssh and log in, and one internal for node-to-node communication over InfiniBand.

NameNode error message:

2018-04-01 16:19:26,580 - call['hdfs haadmin -ns ABCHADOOP01 -getServiceState nn2'] {'logoutput': True, 'user': 'hdfs'}
18/04/01 16:19:28 INFO ipc.Client: Retrying connect to server: c1master02-nn.abc.corp/29.6.6.17:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From c1master01-nn.abc.corp/29.6.6.16 to c1master02-nn.abc.corp:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
2018-04-01 16:19:28,783 - call returned (255, '18/04/01 16:19:28 INFO ipc.Client: Retrying connect to server: c1master02-nn.abc.corp/29.6.6.16:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\nOperation failed: Call From c1master01-nn.abc.corp/29.6.6.16 to c1master02-nn.abc.corp:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused')
2018-04-01 16:19:28,783 - NameNode HA states: active_namenodes = [], standby_namenodes = [], unknown_namenodes = [('nn1', 'c1master01-nn.abc.corp:50070'), ('nn2', 'c1master02-nn.abc.corp:50070')]
2018-04-01 16:19:28,783 - Will retry 2 time(s), caught exception: No active NameNode was found.. Sleeping for 5 sec(s)
2018-04-01 16:19:33,787 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s '"'"'http://c1master01-nn.abc.corp:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpKVcTXy 2>/tmp/tmpy6hgoj''] {'quiet': False}
2018-04-01 16:19:33,837 - call returned (7, '')
2018-04-01 16:19:33,837 - Getting jmx metrics from NN failed. URL: http://c1master01-nn.abc.corp:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/jmx.py", line 38, in get_value_from_jmx
    _, data, _ = get_user_call_output(cmd, user=run_user, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
ExecutionFailed: Execution of 'curl --negotiate -u : -s 'http://c1master01-nn.abc.corp:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' 1>/tmp/tmpKVcTXy 2>/tmp/tmpy6hgoj' returned 7. 
2018-04-01 16:19:33,837 - call['hdfs haadmin -ns ABCHADOOP01 -getServiceState nn1'] {'logoutput': True, 'user': 'hdfs'}
Command failed after 1 tries
- From each node I am able to run kadmin and add/list principals.
- I did ssh to the NameNode and tried to obtain a ticket; that also worked:
 abc># kinit  -kt /etc/security/keytabs/nn.service.keytab nn/c1master01-nn.abc.corp@ABCHDP.COM
abc># klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: nn/c1master01-nn.abc.corp@ABCHDP.COM
Valid starting     Expires            Service principal
04/01/18 16:03:42  04/02/18 16:03:42  krbtgt/ABCHDP.COM@ABCHDP.COM
        renew until 04/01/18 16:03:42

Since the cluster is empty, I also tried hadoop namenode -format, but got the issue below:

java.io.IOException: Login failure for nn/c1master01-nn.abc.corp@ABCHDP.COM from keytab /etc/security/keytabs/nn.service.keytab: javax.security.auth.login.LoginException: Receive timed out
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1098)
        at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:307)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1160)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1631)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1769)
Caused by: javax.security.auth.login.LoginException: Receive timed out
        at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:808)
        at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:1089)
        ... 4 more
Caused by: java.net.SocketTimeoutException: Receive timed out
        at java.net.PlainDatagramSocketImpl.receive0(Native Method)
        at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
        at java.net.DatagramSocket.receive(DatagramSocket.java:812)
        at sun.security.krb5.internal.UDPClient.receive(NetClient.java:206)
        at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:411)
        at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:364)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.security.krb5.KdcComm.send(KdcComm.java:348)
        at sun.security.krb5.KdcComm.sendIfPossible(KdcComm.java:253)
        at sun.security.krb5.KdcComm.send(KdcComm.java:229)
        at sun.security.krb5.KdcComm.send(KdcComm.java:200)
        at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
        at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
        at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:776)
        ... 17 more
18/04/01 15:45:03 INFO util.ExitUtil: Exiting with status 1
18/04/01 15:45:03 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at c1master01-nn.abc.corp/29.6.6.17

(This 29.6.6.17 is the internal IP.)

Can anybody tell me what the issue is? Do I need to manually add entries for the internal IPs in the KDC? If so, why didn't Ambari add them to the KDC the way it did for the external IPs? And if they are required, given that every machine has only one hostname, why do we need two entries?
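Two editorial asides that may help localize this (not from the thread). First, "call returned (7, '')" is curl exit code 7, "failed to connect to host", which suggests the NameNode process was not listening at all rather than rejecting Kerberos. Second, the keytab itself records which host principals Ambari generated, so you can check directly whether the internal or the external name is present (keytab path as quoted above):

    # list every principal in the NameNode keytab; the host part of each
    # nn/<host>@REALM entry must match the hostname the NameNode resolves to
    klist -kt /etc/security/keytabs/nn.service.keytab

Also note the root cause in the format attempt is a java.net.SocketTimeoutException from a DatagramSocket, i.e. the JVM's Kerberos client timing out over UDP; a commonly cited workaround is to force TCP by setting udp_preference_limit = 1 under [libdefaults] in /etc/krb5.conf on the cluster nodes.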
						
					
		
			
				
						
Labels: Apache Ambari

    
	
		
		
03-05-2018 07:53 PM

@Geoffrey Shelton Okot Can I use OpenLDAP instead of AD? I mean, create users and groups in OpenLDAP and use it as the backend for Kerberos. Is that good practice?
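For context, MIT Kerberos does support OpenLDAP as its principal database through the kldap module; a minimal krb5.conf sketch (an editorial aside; the DNs, server and password file below are illustrative, not from this thread):

    [dbmodules]
        openldap_ldapconf = {
            db_library = kldap
            ldap_servers = ldaps://ldap.example.com
            ldap_kerberos_container_dn = cn=krbContainer,dc=example,dc=com
            ldap_kdc_dn = cn=kdc-service,dc=example,dc=com
            ldap_kadmind_dn = cn=adm-service,dc=example,dc=com
            ldap_service_password_file = /etc/krb5.d/service.keyfile
        }

The realm entry in [realms] then references it with database_module = openldap_ldapconf, and the Kerberos container is initialized with kdb5_ldap_util create.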
						
					
    
	
		
		
09-20-2017 06:58 AM

@Rajesh Thanks, it is working for beeline. Since it is a bug in Knox, can we upgrade from Knox 0.9 to Knox 0.12 on HDP 2.5.3? Is there any document for that? I was not able to find any doc for upgrading Knox.
						
					
    
	
		
		
09-19-2017 08:37 PM

We are querying HS2 via Knox through beeline and also another JDBC tool, and we are getting frequent disconnections. Below is the URL for connecting through beeline:

jdbc:hive2://c3master03-nn.abc.org:8445/;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive

After connecting, if I do not run a query for a minute, I get the error below (the same happens with the SQuirreL JDBC client):

Getting log thread is interrupted, since query is done!
Error: org.apache.thrift.transport.TTransportException: org.apache.http.NoHttpResponseException: c3master03-nn.abc.org:8445 failed to respond (state=08S01,code=0)
java.sql.SQLException: org.apache.thrift.transport.TTransportException: org.apache.http.NoHttpResponseException: c3master03-nn.abc.org:8445 failed to respond
        at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:305)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:238)
        at org.apache.hive.beeline.Commands.execute(Commands.java:863)
        at org.apache.hive.beeline.Commands.sql(Commands.java:728)
        at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:993)
        at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:833)
        at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:791)
        at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:491)
        at org.apache.hive.beeline.BeeLine.main(BeeLine.java:474)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: org.apache.thrift.transport.TTransportException: org.apache.http.NoHttpResponseException: c3master03-nn.abc.org:8445 failed to respond
        at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:297)
        at org.apache.thrift.transport.THttpClient.flush(THttpClient.java:313)
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:73)
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.send_ExecuteStatement(TCLIService.java:223)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.ExecuteStatement(TCLIService.java:215)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1363)
        at com.sun.proxy.$Proxy0.ExecuteStatement(Unknown Source)
        at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:296)
        ... 14 more
Caused by: org.apache.http.NoHttpResponseException: c3master03-nn.abc.org:8445 failed to respond
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:84)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:117)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at org.apache.thrift.transport.THttpClient.flushUsingHttpClient(THttpClient.java:251)
        ... 26 more

Even after this exception, if I rerun the query in the same beeline window, it executes and shows the result. Then, after waiting a minute, if I execute the same query or any other query, I get the same exception once, and on rerun the result is there. What is this weird behavior? Even the properties below have sufficient values:

- hive.server2.session.check.interval
- hive.server2.idle.operation.timeout
- hive.server2.idle.session.timeout

Can someone help with what the issue is or what configuration changes are required?
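As a usage note, the connect string has to be quoted on the command line so the shell does not split the ;-separated parameters (a sketch reusing the URL quoted above):

    beeline -u "jdbc:hive2://c3master03-nn.abc.org:8445/;ssl=true?hive.server2.transport.mode=http;hive.server2.thrift.http.path=gateway/default/hive"

An editorial observation: a NoHttpResponseException after roughly one idle minute, with success on immediate retry, is consistent with an intermediary (here the Knox gateway) silently closing the pooled HTTP connection, rather than an HS2 session timeout, which would explain why the hive.server2.idle.* values appear irrelevant.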
						
					
		
			
				
						
Labels: Apache Hive, Apache Knox

    
	
		
		
02-22-2017 07:49 PM

@spolavarapu I did a little bit of googling and fixed it. But in Ranger, while creating a policy, I selected one LDAP group, so ideally only the users of that group should appear in the 'Select User' tab; instead I can see all users there.
						
					
			
    
	
		
		
02-22-2017 02:12 AM

@Vipin Rathor The above search was fine. I believe that in HDP 2.3.2 the group sync filters were false by default, which was the issue.
						
					
			
    
	
		
		
02-22-2017 02:10 AM

@spolavarapu Thanks. When I changed the filters from false to true, it picked up the groups, but all of them are internal. I also downloaded the 2.5 sandbox; there I was able to get the groups, since those filters were already enabled, but in HDP 2.5 I am not able to log in using the passwords. It says invalid username/password. Can you give quick pointers on what to check?
						
					
			
    
	
		
		
02-20-2017 02:39 AM

I have HDP 2.3.2, Ranger 0.5 and OpenLDAP, and I am integrating LDAP with Ranger. I have configured Ranger and am able to see the users in the Users tab, but my groups are not visible in the Ranger UI. Below is the LDIF from OpenLDAP.

Sample LDIF:
# bigdatdomain.com
dn: dc=bigdatdomain,dc=com
objectClass: organization
objectClass: dcObject
o: Hadoop
dc: bigdatdomain
# users, bigdatdomain.com
dn: ou=users,dc=bigdatdomain,dc=com
objectClass: organizationalUnit
ou: users
# student1, users, bigdatdomain.com
dn: uid=student1,ou=users,dc=bigdatdomain,dc=com
uid: student1
cn: student1
sn: 1
objectClass: top
objectClass: posixAccount
objectClass: inetOrgPerson
loginShell: /bin/bash
homeDirectory: /home/student1
uidNumber: 15000
gidNumber: 10000
userPassword:: e1NTSEF9Q1FHNUtIYzZiMWlpK3FvcGFWQ3NOYTE0djkrcjE0cjU=
mail: student1@bigdatdomain.com
gecos: Student1 User
# student2, users, bigdatdomain.com
dn: uid=student2,ou=users,dc=bigdatdomain,dc=com
uid: student2
cn: student2
sn: 2
objectClass: top
objectClass: posixAccount
objectClass: inetOrgPerson
loginShell: /bin/bash
homeDirectory: /home/student2
uidNumber: 15001
gidNumber: 10000
userPassword:: e1NTSEF9Q1FHNUtIYzZiMWlpK3FvcGFWQ3NOYTE0djkrcjE0cjU=
mail: student2@bigdatdomain.com
gecos: Student2 User
# groups, bigdatdomain.com
dn: ou=groups,dc=bigdatdomain,dc=com
objectClass: top
objectClass: organizationalUnit
ou: groups
description: stc groups
# itpeople, groups, bigdatdomain.com
dn: cn=itpeople,ou=groups,dc=bigdatdomain,dc=com
objectClass: groupOfNames
member: uid=student2,ou=users,dc=bigdatdomain,dc=com
member: uid=student1,ou=users,dc=bigdatdomain,dc=com
cn: itpeople
description: IT security group

Usersync log:

20 Feb 2017 00:00:55 INFO LdapUserGroupBuilder [UnixUserSyncThread] - LdapUserGroupBuilder initialization completed with --
  ldapUrl: ldap://xyz:389,
  ldapBindDn: cn=Manager,dc=bigdatdomain,dc=com,
  ldapBindPassword: *****,
  ldapAuthenticationMechanism: simple,
  searchBase: dc=bigdatdomain,dc=com,
  userSearchBase: ou=users,dc=bigdatdomain,dc=com,
  userSearchScope: 2,
  userObjectClass: person,
  userSearchFilter: uid=*,
  extendedUserSearchFilter: (&(objectclass=person)(uid=*)),
  userNameAttribute: uid,
  userSearchAttributes: [uid, ismemberof, memberof],
  userGroupNameAttributeSet: [ismemberof, memberof],
  pagedResultsEnabled: true,
  pagedResultsSize: 500,
  groupSearchEnabled: false,
  groupSearchBase: dc=bigdatdomain,dc=com,
  groupSearchScope: 2,
  groupObjectClass: groupofnames,
  groupSearchFilter: ,
  extendedGroupSearchFilter: (&(objectclass=groupofnames)(member={0})),
  extendedAllGroupsSearchFilter: (&(objectclass=groupofnames)),
  groupMemberAttributeName: member,
  groupNameAttribute: cn,
  groupUserMapSyncEnabled: false,
  ldapReferral: ignore

Can someone point out whether there is an error in my Ranger conf?
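Note that the usersync dump above shows groupSearchEnabled: false, which by itself would explain why no groups appear. The corresponding ranger-ugsync-site properties (property names as in Ranger 0.5; an editorial sketch with values matching the LDIF above):

    ranger.usersync.group.searchenabled = true
    ranger.usersync.group.usermapsyncenabled = true
    ranger.usersync.group.searchbase = ou=groups,dc=bigdatdomain,dc=com
    ranger.usersync.group.objectclass = groupofnames
    ranger.usersync.group.memberattributename = member
    ranger.usersync.group.nameattribute = cn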
						
					
		
			
				
						
Labels: Apache Ranger