Member since
05-30-2019
86
Posts
1
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1666 | 11-21-2019 10:59 AM |
12-18-2021
02:10 PM
@Sheltoni have tried to do the first step (hdfs dfsadmin -safemode enter) but i am getting the following error: safemode: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
... View more
12-18-2021
01:58 PM
HI @Shelton The thing is from the ambari UI console we have the following view: As you can see both namenode are down is there a way to start them back?
... View more
12-17-2021
08:00 AM
Hi @mike_bronson7 , were you able to solve the issue with those steps? Thank you
... View more
12-17-2021
07:47 AM
Hi, I am trying to restart one of the namenode (nn2) but i get the following error in the logs: 2021-12-17 10:23:53,676 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-17 10:23:53,678 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
2021-12-17 10:23:53,681 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX
************************************************************/ i tryied to do the following steps in order to solve the issue: i copied from nn01 to the NameNode directories of nn02 the following logs edits_0000000000274487928-0000000000274488048 edits_0000000000274488049-0000000000274488109 So far the nn02 is still not starting and i get the same error. Can you please help?
... View more
Labels:
- Labels:
-
HDFS
-
Hortonworks Data Platform (HDP)
12-16-2021
08:09 PM
According to the log namenode2 is not able to start because of that: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109 Any idea on how to fix this?
... View more
12-16-2021
08:05 PM
Could you please give me the steps that need to be follow in case the fs image and edit log are corrup in HA setup (2 namenodes)? I am still note able to start both namenode. Thank you
... View more
12-16-2021
02:34 PM
HI We are currently trying to restart the HDFS name nodes in our cluster (HA). It seems that we are not able to start them. The process of restarting the node takes a long time until it finally fail with the following output on AMBARI: Operation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-12-16 17:02:46,616 - call returned (255, '21/12/16 17:02:46 INFO ipc.Client: Retrying connect to server: XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\nOperation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused')
2021-12-16 17:02:46,616 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s -k '"'"'https://XX-XXX-XX-XXXX.XXXXX.XX:50470/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpUzuIj9 2>/tmp/tmp8V3Uai''] {'quiet': False}
2021-12-16 17:02:46,684 - call returned (7, '')
2021-12-16 17:02:46,684 - call['hdfs haadmin -ns metrodev -getServiceState nn2'] {'logoutput': True, 'user': 'hdfs'}
21/12/16 17:02:48 INFO ipc.Client: Retrying connect to server: XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-12-16 17:02:48,615 - call returned (255, '21/12/16 17:02:48 INFO ipc.Client: Retrying connect to server: XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\nOperation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused')
2021-12-16 17:02:48,615 - NameNode HA states: active_namenodes = [], standby_namenodes = [], unknown_namenodes = [(u'nn1', 'XX-XXX-XX-XXXX.XXXXX.XX:50470'), (u'nn2', 'XX-XXX-XX-XXXX.XXXXX.XX:50470')]
2021-12-16 17:02:48,615 - Will retry 3 time(s), caught exception: No active NameNode was found.. Sleeping for 5 sec(s) On the log i get the following message on the log XXXX-XXXX-XXXX-XX-XXX-XX-XXXXX.XXXX.XX.log 2021-12-16 17:03:57,212 ERROR namenode.EditLogInputStream (EditLogFileInputStream.java:nextOpImpl(192)) - caught exception initializing https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:216)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
at org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:219)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:426)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:420)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:515)
at org.apache.hadoop.security.SecurityUtil.doAsCurrentUser(SecurityUtil.java:509)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.getInputStream(EditLogFileInputStream.java:419)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.init(EditLogFileInputStream.java:139)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:190)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:248)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:08 EST 2021
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:316)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:965)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:167)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:189)
... 33 more
Caused by: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:08 EST 2021
at sun.security.x509.CertificateValidity.valid(CertificateValidity.java:274)
at sun.security.x509.X509CertImpl.checkValidity(X509CertImpl.java:629)
at sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:201)
at sun.security.validator.Validator.validate(Validator.java:262)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:330)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:237)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:113)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.checkServerTrusted(ReloadingX509TrustManager.java:135)
at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1099)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1621)
... 44 more
2021-12-16 17:03:57,215 ERROR namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(222)) - Got error reading edit log input stream https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true; failing over to edit log https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
2021-12-16 17:03:57,216 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream 'https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true' to transaction ID 274488049
2021-12-16 17:03:57,223 ERROR namenode.EditLogInputStream (EditLogFileInputStream.java:nextOpImpl(192)) - caught exception initializing https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:216)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
at org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:219)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:426)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:420)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:515)
at org.apache.hadoop.security.SecurityUtil.doAsCurrentUser(SecurityUtil.java:509)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.getInputStream(EditLogFileInputStream.java:419)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.init(EditLogFileInputStream.java:139)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:190)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:248)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:09 EST 2021
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:316)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:965)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:167)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:189)
... 33 more
Caused by: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:09 EST 2021
at sun.security.x509.CertificateValidity.valid(CertificateValidity.java:274)
at sun.security.x509.X509CertImpl.checkValidity(X509CertImpl.java:629)
at sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:201)
at sun.security.validator.Validator.validate(Validator.java:262)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:330)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:237)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:113)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.checkServerTrusted(ReloadingX509TrustManager.java:135)
at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1099)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1621)
... 44 more
2021-12-16 17:03:57,224 ERROR namenode.FSImage (FSEditLogLoader.java:loadEditRecords(222)) - Error replaying edit log at offset 0. Expected transaction ID was 274488049
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
2021-12-16 17:03:57,335 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(716)) - Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-16 17:03:57,338 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.w.WebAppContext@44e3a2b2{/,null,UNAVAILABLE}{/hdfs}
2021-12-16 17:03:57,340 INFO server.AbstractConnector (AbstractConnector.java:doStop(318)) - Stopped ServerConnector@2101b44a{SSL,[ssl, http/1.1]}{XX-XXX-XX-XXXX.XXXXX.XX:50470}
2021-12-16 17:03:57,340 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@134d26af{/static,file:///usr/hdp/3.0.1.0-187/hadoop-hdfs/webapps/static/,UNAVAILABLE}
2021-12-16 17:03:57,341 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@421bba99{/logs,file:///var/log/hadoop/hdfs/,UNAVAILABLE}
2021-12-16 17:03:57,342 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping NameNode metrics system...
2021-12-16 17:03:57,343 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2021-12-16 17:03:57,344 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - NameNode metrics system stopped.
2021-12-16 17:03:57,344 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdown complete.
2021-12-16 17:03:57,344 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-16 17:03:57,345 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
2021-12-16 17:03:57,347 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX
************************************************************/ Could you please help? Thank you
... View more
Labels:
11-18-2021
12:07 PM
Hi, Whenever we try to check the audit on ranger we get the following message and the audit does not appear: After a quick log check in solr server nodes Node1 and 2 have this following message: 2021-11-18 09:53:40,685 WARN [ ] org.apache.solr.handler.admin.MetricsHistoryHandler (MetricsHistoryHandler.java:337) - Unknown format of leader id, skipping: 250349913310626188-xx-xxx-xx-xxxx.xxxxx.xx:8886_solr-n_0000000168 Node 3 has the following message: 2021-11-18T14:57:59,723 [MetricsHistoryHandler-8-thread-1] WARN [ ] org.apache.solr.handler.admin.MetricsHistoryHandler (MetricsHistoryHandler.java:322) - Could not obtain overseer's address, skipping.
org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /overseer_elect/leader
at org.apache.zookeeper.KeeperException.create(KeeperException.java:126) ~[zookeeper-3.4.11.jar:3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0]
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[zookeeper-3.4.11.jar:3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0]
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1215) ~[zookeeper-3.4.11.jar:3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0]
at org.apache.solr.common.cloud.SolrZkClient.lambda$getData$5(SolrZkClient.java:341) ~[solr-solrj-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:14]
at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60) ~[solr-solrj-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:14]
at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:341) ~[solr-solrj-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:14]
at org.apache.solr.client.solrj.impl.ZkDistribStateManager.getData(ZkDistribStateManager.java:84) ~[solr-solrj-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:14]
at org.apache.solr.client.solrj.cloud.DistribStateManager.getData(DistribStateManager.java:55) ~[solr-solrj-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:14]
at org.apache.solr.handler.admin.MetricsHistoryHandler.getOverseerLeader(MetricsHistoryHandler.java:316) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.admin.MetricsHistoryHandler.amIOverseerLeader(MetricsHistoryHandler.java:349) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.admin.MetricsHistoryHandler.amIOverseerLeader(MetricsHistoryHandler.java:344) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.admin.MetricsHistoryHandler.collectGlobalMetrics(MetricsHistoryHandler.java:440) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.admin.MetricsHistoryHandler.collectMetrics(MetricsHistoryHandler.java:368) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at org.apache.solr.handler.admin.MetricsHistoryHandler.lambda$new$0(MetricsHistoryHandler.java:230) ~[solr-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:55:13]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_222]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_222]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222] Any idea on what are the steps we need to follow in order to fix this issue? Thank you for your help.
... View more
11-16-2021
12:37 PM
HI, We have a HDFS Capacity Utilization alert on Ambari. The alert says that we have a 81% Disk usage. After check in HDFS we realized that most of the data used comme from the following folders: - /tmp - /user/hive/checkpoints_tmp Could you please give us a clear procedure to clean up those folders without losing any data? Thank you Environnement infos: HDP-3.0.1.0 HDFS 3.1.0 YARN 3.1.0 MapReduce2 3.0.0.3.0 Hive 3.0.0.3.0 HBase 2.0.0.3.0 ZooKeeper 3.4.9.3.0 Ambari Metrics 0.1.0 Atlas 0.7.0.3.0 Kafka 1.0.0.3.0 Knox 0.5.0.3.0 Ranger 1.0.0.3.0 Kerberos 1.10.3-30
... View more
Labels:
10-26-2021
02:15 PM
Hi i have the following Atlas error on ambari UI for some days : and in the atlas logs i can see the following lines: Hi i have the following Atlas error on ambari UI for some days : 2021-10-26 16:47:45,958 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ Initiating re-login for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (KerberosLogin:364)
2021-10-26 16:47:54,277 WARN - [Thread-5:] ~ Not attempting to re-login since the last re-login was attempted less than 60 seconds before. (Login:330)
2021-10-26 16:47:54,277 WARN - [Thread-5:] ~ No TGT found: will try again at Tue Oct 26 16:48:54 EDT 2021 (Login:136)
2021-10-26 16:47:54,277 INFO - [Thread-5:] ~ TGT refresh sleeping until: Tue Oct 26 16:48:54 EDT 2021 (Login:181)
2021-10-26 16:47:55,958 WARN - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. (KerberosLogin:332)
2021-10-26 16:47:55,958 WARN - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: No TGT found: will try again at Tue Oct 26 16:48:55 EDT 2021 (KerberosLogin:143)
2021-10-26 16:47:55,958 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: TGT refresh sleeping until: Tue Oct 26 16:48:55 EDT 2021 (KerberosLogin:188)
2021-10-26 16:48:02,170 WARN - [Thread-11:] ~ Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1635281252168 (UserGroupInformation:1221)
2021-10-26 16:48:02,170 ERROR - [Thread-11:] ~ Error getting policies for serviceName=xxxxxxxxx_xxxxxresponse=null (RangerAdminRESTClient:167)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketException: Too many open files
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:120)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:113)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:123)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:460)
at java.net.Socket.connect(Socket.java:587)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 13 more
2021-10-26 16:48:32,171 ERROR - [Thread-11:] ~ Error renewing TGT and relogin. Ignoring Exception, and continuing with the old TGT (MiscUtil:510)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX javax.security.auth.login.LoginException: Cannot locate KDC
at org.apache.hadoop.security.UserGroupInformation.unprotectedRelogin(UserGroupInformation.java:1191)
at org.apache.hadoop.security.UserGroupInformation.relogin(UserGroupInformation.java:1157)
at org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1126)
at org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1058)
at org.apache.ranger.audit.provider.MiscUtil.getUGILoginUser(MiscUtil.java:508)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:106)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: javax.security.auth.login.LoginException: Cannot locate KDC
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:804)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
at org.apache.hadoop.security.UserGroupInformation.unprotectedRelogin(UserGroupInformation.java:1185)
... 8 more
Caused by: KrbException: Cannot locate KDC
at sun.security.krb5.Config.getKDCList(Config.java:1084)
at sun.security.krb5.KdcComm.send(KdcComm.java:218)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:776)
... 21 more
Caused by: KrbException: Generic error (description in e-text) (60) - Unable to locate KDC for realm XXXX.XXXXX.XX
at sun.security.krb5.Config.getKDCFromDNS(Config.java:1181)
at sun.security.krb5.Config.getKDCList(Config.java:1057)
... 26 more
2021-10-26 16:48:32,173 ERROR - [Thread-11:] ~ Error getting policies for serviceName=xxxxxxxxx_xxxxxresponse=null (RangerAdminRESTClient:167)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketException: Too many open files
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:120)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:113)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:123)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:460)
at java.net.Socket.connect(Socket.java:587)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 13 more
2021-10-26 16:48:54,277 INFO - [Thread-5:] ~ Initiating logout for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (Login:389)
2021-10-26 16:48:54,277 INFO - [Thread-5:] ~ Initiating re-login for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (Login:398)
2021-10-26 16:48:54,279 WARN - [Thread-5:] ~ Could not refresh TGT for principal: atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX. Will retry. (Login:235)
javax.security.auth.login.LoginException: Cannot locate KDC
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:804)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.zookeeper.Login.reLogin(Login.java:399)
at org.apache.zookeeper.Login.access$400(Login.java:43)
at org.apache.zookeeper.Login$1.run(Login.java:230)
at java.lang.Thread.run(Thread.java:748)
Caused by: KrbException: Cannot locate KDC
at sun.security.krb5.Config.getKDCList(Config.java:1084)
at sun.security.krb5.KdcComm.send(KdcComm.java:218)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:776)
... 15 more
Caused by: KrbException: Generic error (description in e-text) (60) - Unable to locate KDC for realm XXXX.XXXXX.XX
at sun.security.krb5.Config.getKDCFromDNS(Config.java:1181)
at sun.security.krb5.Config.getKDCList(Config.java:1057)
... 20 more
2021-10-26 16:48:55,958 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ Initiating logout for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (KerberosLogin:354)
2021-10-26 16:48:55,958 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ Initiating re-login for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (KerberosLogin:364)
2021-10-26 16:49:02,174 WARN - [Thread-11:] ~ Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1635281312170 (UserGroupInformation:1221)
2021-10-26 16:49:02,174 ERROR - [Thread-11:] ~ Error getting policies for serviceName=xxxxxxxxx_xxxxxresponse=null (RangerAdminRESTClient:167)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketException: Too many open files
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:120)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:113)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:123)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:460)
at java.net.Socket.connect(Socket.java:587)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 13 more
2021-10-26 16:49:04,279 WARN - [Thread-5:] ~ Not attempting to re-login since the last re-login was attempted less than 60 seconds before. (Login:330)
2021-10-26 16:49:04,279 WARN - [Thread-5:] ~ No TGT found: will try again at Tue Oct 26 16:50:04 EDT 2021 (Login:136)
2021-10-26 16:49:04,279 INFO - [Thread-5:] ~ TGT refresh sleeping until: Tue Oct 26 16:50:04 EDT 2021 (Login:181)
2021-10-26 16:49:05,959 WARN - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. (KerberosLogin:332)
2021-10-26 16:49:05,959 WARN - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: No TGT found: will try again at Tue Oct 26 16:50:05 EDT 2021 (KerberosLogin:143)
2021-10-26 16:49:05,959 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: TGT refresh sleeping until: Tue Oct 26 16:50:05 EDT 2021 (KerberosLogin:188)
2021-10-26 16:49:32,175 ERROR - [Thread-11:] ~ Error renewing TGT and relogin. Ignoring Exception, and continuing with the old TGT (MiscUtil:510)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX javax.security.auth.login.LoginException: Cannot locate KDC
at org.apache.hadoop.security.UserGroupInformation.unprotectedRelogin(UserGroupInformation.java:1191)
at org.apache.hadoop.security.UserGroupInformation.relogin(UserGroupInformation.java:1157)
at org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1126)
at org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1058)
at org.apache.ranger.audit.provider.MiscUtil.getUGILoginUser(MiscUtil.java:508)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:106)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: javax.security.auth.login.LoginException: Cannot locate KDC
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:804)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
at org.apache.hadoop.security.UserGroupInformation.unprotectedRelogin(UserGroupInformation.java:1185)
... 8 more
Caused by: KrbException: Cannot locate KDC
at sun.security.krb5.Config.getKDCList(Config.java:1084)
at sun.security.krb5.KdcComm.send(KdcComm.java:218)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:776)
... 21 more
Caused by: KrbException: Generic error (description in e-text) (60) - Unable to locate KDC for realm XXXX.XXXXX.XX
at sun.security.krb5.Config.getKDCFromDNS(Config.java:1181)
at sun.security.krb5.Config.getKDCList(Config.java:1057)
... 26 more
2021-10-26 16:49:32,176 ERROR - [Thread-11:] ~ Error getting policies for serviceName=xxxxxxxxx_xxxxxresponse=null (RangerAdminRESTClient:167)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketException: Too many open files
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:120)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:113)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:123)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:460)
at java.net.Socket.connect(Socket.java:587)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 13 more
2021-10-26 16:50:02,176 WARN - [Thread-11:] ~ Not attempting to re-login since the last re-login was attempted less than 60 seconds before. Last Login=1635281372174 (UserGroupInformation:1221)
2021-10-26 16:50:02,177 ERROR - [Thread-11:] ~ Error getting policies for serviceName=xxxxxxxxx_xxxxxresponse=null (RangerAdminRESTClient:167)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketException: Too many open files
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:120)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:113)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:123)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:460)
at java.net.Socket.connect(Socket.java:587)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 13 more
2021-10-26 16:50:04,279 INFO - [Thread-5:] ~ Initiating logout for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (Login:389)
2021-10-26 16:50:04,280 INFO - [Thread-5:] ~ Initiating re-login for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (Login:398)
2021-10-26 16:50:04,281 WARN - [Thread-5:] ~ Could not refresh TGT for principal: atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX. Will retry. (Login:235)
javax.security.auth.login.LoginException: Cannot locate KDC
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:804)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.zookeeper.Login.reLogin(Login.java:399)
at org.apache.zookeeper.Login.access$400(Login.java:43)
at org.apache.zookeeper.Login$1.run(Login.java:230)
at java.lang.Thread.run(Thread.java:748)
Caused by: KrbException: Cannot locate KDC
at sun.security.krb5.Config.getKDCList(Config.java:1084)
at sun.security.krb5.KdcComm.send(KdcComm.java:218)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:776)
... 15 more
Caused by: KrbException: Generic error (description in e-text) (60) - Unable to locate KDC for realm XXXX.XXXXX.XX
at sun.security.krb5.Config.getKDCFromDNS(Config.java:1181)
at sun.security.krb5.Config.getKDCList(Config.java:1057)
... 20 more
2021-10-26 16:50:05,959 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ Initiating logout for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (KerberosLogin:354)
2021-10-26 16:50:05,959 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ Initiating re-login for atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX (KerberosLogin:364)
^_2021-10-26 16:50:14,282 WARN - [Thread-5:] ~ Not attempting to re-login since the last re-login was attempted less than 60 seconds before. (Login:330)
2021-10-26 16:50:14,282 WARN - [Thread-5:] ~ No TGT found: will try again at Tue Oct 26 16:51:14 EDT 2021 (Login:136)
2021-10-26 16:50:14,282 INFO - [Thread-5:] ~ TGT refresh sleeping until: Tue Oct 26 16:51:14 EDT 2021 (Login:181)
2021-10-26 16:50:15,960 WARN - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: Not attempting to re-login since the last re-login was attempted less than 60 seconds before. (KerberosLogin:332)
2021-10-26 16:50:15,960 WARN - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: No TGT found: will try again at Tue Oct 26 16:51:15 EDT 2021 (KerberosLogin:143)
2021-10-26 16:50:15,961 INFO - [kafka-kerberos-refresh-thread-atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX:] ~ [Principal=atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX]: TGT refresh sleeping until: Tue Oct 26 16:51:15 EDT 2021 (KerberosLogin:188)
2021-10-26 16:50:32,177 ERROR - [Thread-11:] ~ Error renewing TGT and relogin. Ignoring Exception, and continuing with the old TGT (MiscUtil:510)
org.apache.hadoop.security.KerberosAuthException: Login failure for user: atlas/xx-xxx-xx-xxxx.xxxxx.xx@XXXX.XXXXX.XX javax.security.auth.login.LoginException: Cannot locate KDC
at org.apache.hadoop.security.UserGroupInformation.unprotectedRelogin(UserGroupInformation.java:1191)
at org.apache.hadoop.security.UserGroupInformation.relogin(UserGroupInformation.java:1157)
at org.apache.hadoop.security.UserGroupInformation.reloginFromKeytab(UserGroupInformation.java:1126)
at org.apache.hadoop.security.UserGroupInformation.checkTGTAndReloginFromKeytab(UserGroupInformation.java:1058)
at org.apache.ranger.audit.provider.MiscUtil.getUGILoginUser(MiscUtil.java:508)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:106)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: javax.security.auth.login.LoginException: Cannot locate KDC
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:804)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
at sun.reflect.GeneratedMethodAccessor223.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
at org.apache.hadoop.security.UserGroupInformation$HadoopLoginContext.login(UserGroupInformation.java:1926)
at org.apache.hadoop.security.UserGroupInformation.unprotectedRelogin(UserGroupInformation.java:1185)
... 8 more
Caused by: KrbException: Cannot locate KDC
at sun.security.krb5.Config.getKDCList(Config.java:1084)
at sun.security.krb5.KdcComm.send(KdcComm.java:218)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbAsReqBuilder.send(KrbAsReqBuilder.java:316)
at sun.security.krb5.KrbAsReqBuilder.action(KrbAsReqBuilder.java:361)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:776)
... 21 more
Caused by: KrbException: Generic error (description in e-text) (60) - Unable to locate KDC for realm XXXX.XXXXX.XX
at sun.security.krb5.Config.getKDCFromDNS(Config.java:1181)
at sun.security.krb5.Config.getKDCList(Config.java:1057)
... 26 more
2021-10-26 16:50:32,178 ERROR - [Thread-11:] ~ Error getting policies for serviceName=xxxxxxxxx_xxxxxresponse=null (RangerAdminRESTClient:167)
com.sun.jersey.api.client.ClientHandlerException: java.net.SocketException: Too many open files
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
at com.sun.jersey.api.client.Client.handle(Client.java:652)
at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
at com.sun.jersey.api.client.WebResource$Builder.get(WebResource.java:509)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:120)
at org.apache.ranger.admin.client.RangerAdminRESTClient$3.run(RangerAdminRESTClient.java:113)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710)
at org.apache.ranger.admin.client.RangerAdminRESTClient.getServicePoliciesIfUpdated(RangerAdminRESTClient.java:123)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicyfromPolicyAdmin(PolicyRefresher.java:264)
at org.apache.ranger.plugin.util.PolicyRefresher.loadPolicy(PolicyRefresher.java:202)
at org.apache.ranger.plugin.util.PolicyRefresher.run(PolicyRefresher.java:171)
Caused by: java.net.SocketException: Too many open files
at java.net.Socket.createImpl(Socket.java:460)
at java.net.Socket.connect(Socket.java:587)
at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253)
at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153)
... 13 more
Please advise. Thanks! Koffi Environnement infos: HDP-3.0.1.0 HDFS 3.1.0 YARN 3.1.0 MapReduce2 3.0.0.3.0 Hive 3.0.0.3.0 HBase 2.0.0.3.0 ZooKeeper 3.4.9.3.0 Ambari Metrics 0.1.0 Atlas 0.7.0.3.0 Kafka 1.0.0.3.0 Knox 0.5.0.3.0 Ranger 1.0.0.3.0 Kerberos 1.10.3-30
... View more
Labels: