Created on 02-17-2020 11:08 PM
Hello Team,
I have enabled MIT Kerberos, integrated it with my cluster, and initialized the principals for hdfs, hbase, and yarn.
I am able to access HDFS and HBase tables.
But when I try to run a sample MapReduce job, it fails. Find the error logs below.
==> yarn jar /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/hadoop-examples.jar teragen 500000000 /tmp/teragen2
Logs:
WARN security.UserGroupInformation: PriviledgedActionException as:HTTP/hostname.org@FQDN.COM (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=HTTP, access=WRITE, inode="/user":mcaf:supergroup:drwxr-xr-x
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=HTTP, access=WRITE, inode="/user":mcaf:supergroup:drwxr-xr-x
hostname.org:~:HADOOP QA]$ klist
Ticket cache: FILE:/tmp/krb5cc_251473
Default principal: HTTP/hostname.org@FQDN.COM
Valid starting Expires Service principal
02/18/20 01:55:32 02/19/20 01:55:32 krbtgt/FQDN.COM@FQDN.COM
renew until 02/23/20 01:55:32
Can someone please look into this issue and help us?
Thanks & Regards,
Vinod
Created 03-26-2020 08:46 AM
Hi @venkatsambath @Shelton,
As you suggested, I deleted the cache directories and tried again, but I am still facing the same issue. Find the error logs and directory structure below.
Error Logs:
JOB: yarn jar /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/hadoop-examples.jar teragen 500000000 /tmp/teragen22
20/03/26 11:36:53 INFO mapreduce.Job: Job job_1585236349590_0001 failed with state FAILED due to: Application application_1585236349590_0001 failed 2 times due to AM Container for appattempt_1585236349590_0001_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://hostname:8088/proxy/application_1585236349590_0001/Then, click on links to logs of each attempt.
Diagnostics: Application application_1585236349590_0001 initialization failed (exitCode=20) with output: main : command provided 0
main : user is mcaf
main : requested yarn user is mcaf
Failed to create directory /disk1/yarn/nm/usercache/mcaf - No such file or directory
Failed to create directory /disk2/yarn/nm/usercache/mcaf - No such file or directory
Failed to create directory /disk3/yarn/nm/usercache/mcaf - No such file or directory
Failed to create directory /disk4/yarn/nm/usercache/mcaf - No such file or directory
Failed to create directory /disk5/yarn/nm/usercache/mcaf - No such file or directory
Failing this attempt. Failing the application.
20/03/26 11:36:53 INFO mapreduce.Job: Counters: 0
I gave our application job a try as well, and it also failed. Below are the logs:
ERROR 2020Mar26 11:36:00,863 main com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain: org.apache.hadoop.hbase.client.RetriesExhaustedException thrown: Can't get the location
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888) ~[DMXSLoader-0.0.31.jar:0.0.31]
at com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain.hasStagingData(DMXSLoaderMain.java:304) [DMXSLoader-0.0.31.jar:0.0.31]
at com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain.main(DMXSLoaderMain.java:375) [DMXSLoader-0.0.31.jar:0.0.31]
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[?:1.7.0_67]
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[?:1.7.0_67]
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[?:1.7.0_67]
at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[?:1.7.0_67]
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) ~[?:1.7.0_67]
at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63) ~[hadoop-common-2.6.0-cdh5.4.7.jar:?]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) ~[hadoop-common-2.6.0-cdh5.4.7.jar:?]
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159) ~[hadoop-common-2.6.0-cdh5.4.7.jar:?]
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117) ~[hadoop-common-2.6.0-cdh5.4.7.jar:?]
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.7.0_67]
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.7.0_67]
at java.io.DataOutputStream.flush(DataOutputStream.java:123) ~[?:1.7.0_67]
at org.apache.hadoop.hbase.ipc.IPCUtil.write(IPCUtil.java:246) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.ipc.IPCUtil.write(IPCUtil.java:234) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:895) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:850) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1184) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1580) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1294) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299) ~[DMXSLoader-0.0.31.jar:0.0.31]
... 10 more
Directory structure:
hostname]$ sudo ls -ld /disk1/yarn/nm
drwxr-xr-x 4 yarn hadoop 4096 Mar 26 11:37 /disk1/yarn/nm
hostname]$ sudo ls -ld /disk1/yarn/
drwxr-xr-x 3 root root 4096 Sep 28 2015 /disk1/yarn/
hostname]$ sudo ls -ld /disk1
drwxr-xr-x 9 root root 4096 Sep 28 2015 /disk1
Please get back to me.
Best Regards,
Vinod
Created 03-26-2020 09:00 PM
These are two separate issues.
ERROR1:
Did you delete down to /disk{1,2,3,4,5}/yarn/nm/usercache/mcaf, or down to /disk{1,2,3,4,5}/yarn/nm/usercache/?
If you deleted down to /disk{1,2,3,4,5}/yarn/nm/usercache/, please restart all the NodeManagers.
If not, can you let me know how many NodeManagers you have in this cluster, and run
namei -l /disk{1,2,3,4,5}/yarn/nm/usercache/
across all those machines? A sketch for running this in one pass follows below.
Please paste your results using the "Insert or code sample" option in the portal so that they are easier to read.
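To avoid going host by host, something like this works - a rough sketch, assuming passwordless SSH and that nm_hosts.txt (a hypothetical file) lists your NodeManager hostnames one per line:
# Rough sketch - nm_hosts.txt is a hypothetical list of your NodeManager
# hostnames, one per line; adjust to your environment.
while read -r host; do
  echo "=== $host ==="
  ssh "$host" 'namei -l /disk{1,2,3,4,5}/yarn/nm/usercache/'
done < nm_hosts.txt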
ERROR2:
Mar26 11:36:00,863 main com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain: org.apache.hadoop.hbase.client.RetriesExhaustedException thrown: Can't get the location
a. Does the machine from which you are submitting this job have an HBase gateway installed? If not, can you run it from a machine that has an HBase gateway?
b. Also, since you said this job worked for the hbase user but not for mcaf - have you granted mcaf permission on the table you are trying to access? https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_sg_hbase_authorization.html#top... has the steps.
c. What error do you see in the HMaster logs at the exact timestamp this error appears in the job?
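For (c), a quick way to pull the relevant window - a rough sketch, assuming a CM-managed cluster where the HMaster log lands under /var/log/hbase (the path and file pattern are typical CDH defaults, not confirmed from your cluster; adjust the timestamp too):
# Run on the HMaster host; prints five lines of context around the failure.
sudo grep -B5 -A5 '2020-03-26 11:36' /var/log/hbase/*MASTER*.log.out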
Created 03-26-2020 10:54 PM
Thanks for your response!
Yes, I deleted down to usercache and restarted the NodeManagers. We have 10 NodeManagers in this cluster; find the output you asked for below.
hostname1.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname002.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname003.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname55.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname001.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname003.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname028.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname029.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
hostname054.enterprisenet.org sudo namei -l /disk1/yarn/nm/usercache
f: /disk1/yarn/nm/usercache
dr-xr-xr-x root root /
drwxr-xr-x root root disk1
drwxr-xr-x root root yarn
drwxr-xr-x yarn hadoop nm
drwxr-xr-x yarn yarn usercache
a. Yes, the HBase gateway is available on the same server.
b. Actually, it did not work with the hbase user either.
I have granted the permissions to mcaf through the HBase shell as below:
hbase(main):004:0> grant 'mcaf', 'RWXC' , 'TABLE1'
0 row(s) in 0.6570 seconds
hbase(main):005:0> user_permission 'TABLE1'
User Namespace,Table,Family,Qualifier:Permission
mcaf default,TABLE1,,: [Permission: actions=READ,WRITE,EXEC,CREATE]
1 row(s) in 0.3960 seconds
hbase(main):006:0> grant 'mcaf', 'RWXC' , 'TABLE2'
0 row(s) in 0.5780 seconds
hbase(main):007:0> user_permission 'TABLE2'
User Namespace,Table,Family,Qualifier:Permission
mcaf default,TABLE2,,: [Permission: actions=READ,WRITE,EXEC,CREATE]
1 row(s) in 0.4060 seconds
c. I do not find any ERROR or WARN messages in the HMaster logs while I am querying.
After granting the permissions, I tested my application job and I am seeing the same error messages as above.
Please get back to me.
Best Regards,
Vinod
Created 03-26-2020 11:56 PM
One problem is resolved - I mean to say I am now able to run the sample YARN job.
I found that one of the servers had a permission issue, so I deleted the /disk{1,2,3,4,5}/yarn/nm directories, restarted the NodeManager, and reran the YARN job (the cleanup is sketched below).
It worked.
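For reference, the cleanup on the affected node was essentially the following - a rough sketch of what I ran; the NodeManager recreates these directories with the correct ownership on restart:
# Rough sketch of the cleanup on the affected NodeManager host.
for d in /disk{1,2,3,4,5}/yarn/nm; do
  sudo rm -rf "$d"
done
# Then restart the NodeManager role on this host.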
I then tried our application job, and this time I am getting a different error.
I am running this job on hostname001, and I see a connection failure to the hostname003 server:
ERROR 2020Mar27 02:35:46,850 main com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain: org.apache.hadoop.hbase.client.RetriesExhaustedException thrown: Can't get the location
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135) ~[DMXSLoader-0.0.31.jar:0.0.31]
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888) ~[DMXSLoader-0.0.31.jar:0.0.31]
at com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain.hasStagingData(DMXSLoaderMain.java:304) [DMXSLoader-0.0.31.jar:0.0.31]
at com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain.main(DMXSLoaderMain.java:375) [DMXSLoader-0.0.31.jar:0.0.31]
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to hostname003.enterprisenet.org/10.7.54.13:60020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to hostname003.enterprisenet.org/10.7.54.13:60020 is closing. Call id=2, waitTime=12
Regards,
Vinod
Created 03-27-2020 02:15 AM
Can you attach the full exception or the error log? It is unclear what the actual error is from the snippet you pasted in your last response.
Created 03-27-2020 05:41 AM
Hi @venkatsambath,
Find below one of our application logs:
INFO 2020Mar27 01:13:07,422 main com.class.engineering.portfolio.finalresolution.main.MrFinalResolver: Sleeping for >> 300000 ms
ERROR 2020Mar27 01:18:16,485 main com.class.engineering.portfolio.finalresolution.main.MrFinalResolver: Exception occurred while checking for isReadyToRun flag >> Can't get the location
org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)
at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)
at com.class.engineering.portfolio.creditinghadoop.config.ConfigTbl.readValue(ConfigTbl.java:115)
at com.class.engineering.portfolio.creditinghadoop.config.ConfigDao.read(ConfigDao.java:63)
at com.class.engineering.portfolio.finalresolution.main.MrFinalResolver.isReadyToRun(MrFinalResolver.java:344)
at com.class.engineering.portfolio.finalresolution.main.MrFinalResolver.main(MrFinalResolver.java:116)
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to hostname003.enterprisenet.org/1.1.1.1:60020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to hostname003.enterprisenet.org/1.1.1.1:60020 is closing. Call id=4045, waitTime=2
at org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1243)
at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1214)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:216)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:300)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:31865)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1580)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1294)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1126)
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)
... 12 more
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to hostname003.enterprisenet.org/1.1.1.1:60020 is closing. Call id=4045, waitTime=2
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.cleanupCalls(RpcClientImpl.java:1033)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.close(RpcClientImpl.java:840)
at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:568)
ERROR 2020Mar27 01:18:25,554 main com.class.engineering.portfolio.resolution.util.HLogUtil: Exception occured while eriting log message >> Failed 1 action: IOException: 1 time,
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IOException: 1 time,
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:227)
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1700(AsyncProcess.java:207)
at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1658)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1496)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1107)
at com.class.engineering.portfolio.creditinghadoop.log.HLogTbl.write(HLogTbl.java:102)
at com.class.engineering.portfolio.creditinghadoop.log.HLogDao.write(HLogDao.java:50)
at com.class.engineering.portfolio.creditinghadoop.log.HLog.writeMsg(HLog.java:20)
at com.class.engineering.portfolio.resolution.util.HLogUtil.writeMsg(HLogUtil.java:18)
at com.class.engineering.portfolio.finalresolution.main.MrFinalResolver.isReadyToRun(MrFinalResolver.java:358)
at com.class.engineering.portfolio.finalresolution.main.MrFinalResolver.main(MrFinalResolver.java:116)
INFO 2020Mar27 01:18:25,555 main com.class.engineering.portfolio.finalresolution.main.MrFinalResolver: isReady false. Elapsed time: 18130 ms.
INFO 2020Mar27 01:18:25,555 main com.class.engineering.portfolio.finalresolution.main.MrFinalResolver: Sleeping for >> 300000 ms
Please get back to me.
Best Regards,
Vinod
Created 03-27-2020 09:13 AM
Hello @venkatsambath @Shelton
Could someone please get back to me? That would be great.
Best Regards,
Vinod
Created 03-27-2020 09:37 AM
The call to this RegionServer (1.1.1.1:60020) is being closed instantly:
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException:
Call to hostname003.enterprisenet.org/1.1.1.1:60020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to hostname003.enterprisenet.org/1.1.1.1:60020 is closing. Call id=4045, waitTime=2
1. Is there an hbase-site.xml bundled with your application jar? (A quick check is sketched below.)
2. If yes, can you rebuild the jar with the latest hbase-site.xml from /etc/hbase/conf/?
3. I am not sure whether the server is printing any ERROR, but it is worth checking what exactly is happening in the RegionServer logs on node hostname003.enterprisenet.org at 2020 Mar 27 01:18:16 (i.e., when the connection from the client is closed).
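For (1) and (2), a rough sketch, using the jar name from your stack trace (the resources path is hypothetical - adjust to your build layout):
# Check whether an hbase-site.xml is baked into the application jar.
jar tf DMXSLoader-0.0.31.jar | grep -i hbase-site.xml
# If it is listed, replace it with the current cluster client config
# before repackaging (path/to/your/resources/ is hypothetical):
cp /etc/hbase/conf/hbase-site.xml path/to/your/resources/
# ...then rebuild the jar and rerun the job.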
Created 03-27-2020 10:13 AM
Hello @Shelton,
There are no logs generated in that time interval.
But I will check on your questions with my team and get back to you.
Thanks for responding.
Best Regards,
Vinod
Created 05-29-2020 06:26 AM
Hello @Shelton @venkatsambath
Hope you are all doing well.
None of the configuration files are bundled with our jars; they use only the latest updated config files on the cluster.
What am I missing here?
What could be the reason for the failure after enabling Kerberos?
Please do the needful.
Best Regards,
Vinod