Member since: 05-30-2019
Posts: 86
Kudos Received: 1
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2871 | 11-21-2019 10:59 AM |
12-16-2021
02:34 PM
Hi,
We are currently trying to restart the HDFS NameNodes in our HA cluster, but they do not start. The restart takes a long time and finally fails with the following output in Ambari:
Operation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-12-16 17:02:46,616 - call returned (255, '21/12/16 17:02:46 INFO ipc.Client: Retrying connect to server: XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\nOperation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused')
2021-12-16 17:02:46,616 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl --negotiate -u : -s -k '"'"'https://XX-XXX-XX-XXXX.XXXXX.XX:50470/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem'"'"' 1>/tmp/tmpUzuIj9 2>/tmp/tmp8V3Uai''] {'quiet': False}
2021-12-16 17:02:46,684 - call returned (7, '')
2021-12-16 17:02:46,684 - call['hdfs haadmin -ns metrodev -getServiceState nn2'] {'logoutput': True, 'user': 'hdfs'}
21/12/16 17:02:48 INFO ipc.Client: Retrying connect to server: XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
Operation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2021-12-16 17:02:48,615 - call returned (255, '21/12/16 17:02:48 INFO ipc.Client: Retrying connect to server: XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)\nOperation failed: Call From XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX to XX-XXX-XX-XXXX.XXXXX.XX:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused')
2021-12-16 17:02:48,615 - NameNode HA states: active_namenodes = [], standby_namenodes = [], unknown_namenodes = [(u'nn1', 'XX-XXX-XX-XXXX.XXXXX.XX:50470'), (u'nn2', 'XX-XXX-XX-XXXX.XXXXX.XX:50470')]
2021-12-16 17:02:48,615 - Will retry 3 time(s), caught exception: No active NameNode was found.. Sleeping for 5 sec(s)
In the NameNode log (XXXX-XXXX-XXXX-XX-XXX-XX-XXXXX.XXXX.XX.log) I get the following messages:
2021-12-16 17:03:57,212 ERROR namenode.EditLogInputStream (EditLogFileInputStream.java:nextOpImpl(192)) - caught exception initializing https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:216)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
at org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:219)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:426)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:420)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:515)
at org.apache.hadoop.security.SecurityUtil.doAsCurrentUser(SecurityUtil.java:509)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.getInputStream(EditLogFileInputStream.java:419)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.init(EditLogFileInputStream.java:139)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:190)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:248)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:08 EST 2021
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:316)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:965)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:167)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:189)
... 33 more
Caused by: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:08 EST 2021
at sun.security.x509.CertificateValidity.valid(CertificateValidity.java:274)
at sun.security.x509.X509CertImpl.checkValidity(X509CertImpl.java:629)
at sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:201)
at sun.security.validator.Validator.validate(Validator.java:262)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:330)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:237)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:113)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.checkServerTrusted(ReloadingX509TrustManager.java:135)
at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1099)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1621)
... 44 more
2021-12-16 17:03:57,215 ERROR namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(222)) - Got error reading edit log input stream https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true; failing over to edit log https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
2021-12-16 17:03:57,216 INFO namenode.RedundantEditLogInputStream (RedundantEditLogInputStream.java:nextOp(177)) - Fast-forwarding stream 'https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true' to transaction ID 274488049
2021-12-16 17:03:57,223 ERROR namenode.EditLogInputStream (EditLogFileInputStream.java:nextOpImpl(192)) - caught exception initializing https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: https://XX-XXX-XX-XXXX.XXXXX.XX:8481/getJournal?jid=metrodev&segmentTxId=274488049&storageInfo=-64%3A1482798275%3A1538749182266%3ACID-4128f9aa-86b4-4add-9c9a-38c3b06c7384&inProgressOk=true
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:216)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:348)
at org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:219)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:426)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog$1.run(EditLogFileInputStream.java:420)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:515)
at org.apache.hadoop.security.SecurityUtil.doAsCurrentUser(SecurityUtil.java:509)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream$URLLog.getInputStream(EditLogFileInputStream.java:419)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.init(EditLogFileInputStream.java:139)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOpImpl(EditLogFileInputStream.java:190)
at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.nextOp(EditLogFileInputStream.java:248)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:09 EST 2021
at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1946)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:316)
at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:310)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1639)
at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:223)
at sun.security.ssl.Handshaker.processLoop(Handshaker.java:1037)
at sun.security.ssl.Handshaker.process_record(Handshaker.java:965)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1064)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1395)
at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1379)
at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:167)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:189)
... 33 more
Caused by: java.security.cert.CertificateExpiredException: NotAfter: Thu Dec 16 15:58:09 EST 2021
at sun.security.x509.CertificateValidity.valid(CertificateValidity.java:274)
at sun.security.x509.X509CertImpl.checkValidity(X509CertImpl.java:629)
at sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:201)
at sun.security.validator.Validator.validate(Validator.java:262)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:330)
at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:237)
at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:113)
at org.apache.hadoop.security.ssl.ReloadingX509TrustManager.checkServerTrusted(ReloadingX509TrustManager.java:135)
at sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1099)
at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1621)
... 44 more
2021-12-16 17:03:57,224 ERROR namenode.FSImage (FSEditLogLoader.java:loadEditRecords(222)) - Error replaying edit log at offset 0. Expected transaction ID was 274488049
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
2021-12-16 17:03:57,335 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(716)) - Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-16 17:03:57,338 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.w.WebAppContext@44e3a2b2{/,null,UNAVAILABLE}{/hdfs}
2021-12-16 17:03:57,340 INFO server.AbstractConnector (AbstractConnector.java:doStop(318)) - Stopped ServerConnector@2101b44a{SSL,[ssl, http/1.1]}{XX-XXX-XX-XXXX.XXXXX.XX:50470}
2021-12-16 17:03:57,340 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@134d26af{/static,file:///usr/hdp/3.0.1.0-187/hadoop-hdfs/webapps/static/,UNAVAILABLE}
2021-12-16 17:03:57,341 INFO handler.ContextHandler (ContextHandler.java:doStop(910)) - Stopped o.e.j.s.ServletContextHandler@421bba99{/logs,file:///var/log/hadoop/hdfs/,UNAVAILABLE}
2021-12-16 17:03:57,342 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping NameNode metrics system...
2021-12-16 17:03:57,343 INFO impl.MetricsSinkAdapter (MetricsSinkAdapter.java:publishMetricsFromQueue(141)) - timeline thread interrupted.
2021-12-16 17:03:57,344 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - NameNode metrics system stopped.
2021-12-16 17:03:57,344 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - NameNode metrics system shutdown complete.
2021-12-16 17:03:57,344 ERROR namenode.NameNode (NameNode.java:main(1715)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:160)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:890)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 274488048; expected file to go up to 274488109
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 12 more
2021-12-16 17:03:57,345 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0. Expected transaction ID was 274488049
2021-12-16 17:03:57,347 INFO namenode.NameNode (LogAdapter.java:info(51)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at XX-XXX-XX-XXXX.XXXXX.XX/XX.X.XX.XX
************************************************************/
Could you please help? Thank you.
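The innermost cause in the trace above is a `CertificateExpiredException` carrying a `NotAfter` date. The following is a minimal sketch, not part of any Hadoop tooling, that pulls that date out of a trace line and compares it with the timestamp of the first ERROR line. The function names are made up for illustration, and the sketch assumes the log timestamps are written in the same time zone as the one printed in the exception (EST).

```python
import re
from datetime import datetime

# Matches the JDK message seen in the trace, e.g.
# "CertificateExpiredException: NotAfter: Thu Dec 16 15:58:08 EST 2021"
NOT_AFTER = re.compile(
    r"CertificateExpiredException: NotAfter: "
    r"(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \w+ (\d{4})"
)

def cert_not_after(line):
    """Return the certificate expiry as a naive datetime, or None if absent."""
    match = NOT_AFTER.search(line)
    if match is None:
        return None
    stamp, year = match.groups()
    # The zone abbreviation (EST) is dropped. ASSUMPTION: the log
    # timestamps are written in that same zone.
    return datetime.strptime(f"{stamp} {year}", "%a %b %d %H:%M:%S %Y")

def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' stamp of a Hadoop log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")

# Both lines are copied from the log in this post.
trace_line = ("Caused by: java.security.cert.CertificateExpiredException: "
              "NotAfter: Thu Dec 16 15:58:08 EST 2021")
error_line = "2021-12-16 17:03:57,212 ERROR namenode.EditLogInputStream"

expired_before_error = cert_not_after(trace_line) < log_time(error_line)
print(expired_before_error)
```

If the comparison holds, the certificate presented on port 8481 had already expired when the NameNode tried to read the edit log over HTTPS, which would be consistent with the `SSLHandshakeException` and the `PrematureEOFException` that follows it.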
11-16-2021
12:37 PM
Hi,
We have an HDFS Capacity Utilization alert in Ambari. The alert reports 81% disk usage. After checking HDFS, we found that most of the used space comes from the following folders:
- /tmp
- /user/hive/checkpoints_tmp
Could you please give us a clear procedure to clean up those folders without losing any data? Thank you.
Environment info:
- HDP-3.0.1.0
- HDFS 3.1.0
- YARN 3.1.0
- MapReduce2 3.0.0.3.0
- Hive 3.0.0.3.0
- HBase 2.0.0.3.0
- ZooKeeper 3.4.9.3.0
- Ambari Metrics 0.1.0
- Atlas 0.7.0.3.0
- Kafka 1.0.0.3.0
- Knox 0.5.0.3.0
- Ranger 1.0.0.3.0
- Kerberos 1.10.3-30
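One cautious first step for the two folders named in this post is to list candidates by age without deleting anything. The sketch below parses text in the format printed by `hdfs dfs -ls -R <path>` and returns plain files older than a cutoff. The function name and the sample listing are hypothetical, and whether an old file is actually safe to remove still has to be decided by whoever owns the job that wrote it.

```python
from datetime import datetime, timedelta

def older_than(ls_output, now, days):
    """Yield (path, size_bytes) for plain files last modified more than `days` ago."""
    cutoff = now - timedelta(days=days)
    for line in ls_output.splitlines():
        # Columns: permissions, replication, owner, group, size, date, time, path
        parts = line.split(None, 7)
        # Skip the "Found N items" header and directories (permissions start with "d").
        if len(parts) != 8 or parts[0].startswith("d"):
            continue
        modified = datetime.strptime(f"{parts[5]} {parts[6]}", "%Y-%m-%d %H:%M")
        if modified < cutoff:
            yield parts[7], int(parts[4])

# Hypothetical listing, for illustration only (not taken from the cluster).
SAMPLE = """Found 3 items
drwxr-xr-x   - hive hdfs          0 2021-11-15 08:00 /tmp/hive
-rw-r--r--   3 hive hdfs    1048576 2021-06-01 09:30 /tmp/hive/old_part_0
-rw-r--r--   3 hive hdfs       2048 2021-11-15 08:00 /tmp/hive/new_part_0
"""

candidates = list(older_than(SAMPLE, datetime(2021, 11, 16), days=30))
print(candidates)
```

The sketch only reports candidates. Files under /tmp may belong to jobs that are still running, so the age threshold is a judgment call rather than a guarantee that nothing is lost.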
09-04-2020
08:24 AM
Using:
- HDP 3.0.1
- HDFS 3.1.0
- NameNode heap: 84.8% (3.3 GB / 4.0 GB)
- Disk usage (DFS used): 71.20% (63.9 TB / 89.7 TB)
- Disk usage (non-DFS used): 0.74% (676.2 GB / 89.7 TB)
- Disk remaining: 28.06% (25.2 TB / 89.7 TB)
- Block size: 128 MB
Any idea how to reduce the NameNode heap usage? Thank you.
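The NameNode keeps the namespace (files, directories and blocks) in its heap, so heap usage follows the number of objects rather than the number of bytes stored. As a rough worked example, the figures above give a lower bound on the block count, namely the count the cluster would have if every block were completely full. The replication factor of 3 and the binary units (1 TB = 1024^4 bytes) are assumptions, since neither is stated in the post.

```python
TIB = 1024 ** 4
MIB = 1024 ** 2

dfs_used_bytes = 63.9 * TIB     # "DFS USED" from the post
block_size_bytes = 128 * MIB    # "Block Size" from the post
replication = 3                 # ASSUMPTION: HDFS default, not stated in the post

def min_unique_blocks(used, block_size, repl):
    """Lower bound on block count: every block full and stored `repl` times."""
    return used / repl / block_size

floor_blocks = min_unique_blocks(dfs_used_bytes, block_size_bytes, replication)
print(f"at least {floor_blocks:,.0f} blocks if every block were full")
```

Comparing this floor with the block and file counts shown on the NameNode web UI gives a feel for how far the real namespace is from the ideal. A count many times higher would point toward lots of small files as a likely contributor to heap pressure, though that is only an inference from the numbers and not something the post confirms.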
08-19-2020
08:05 AM
Hi,
I am currently facing an issue with NiFi 1.7.0 (2 nodes). My flows had been running fine for months. A couple of weeks ago we downgraded the NiFi machines from 16 CPUs and 128G of RAM to 4 CPUs and 32G of RAM. After the downgrade we changed the parameters so that NiFi takes the new machine size into account.
Initial values:
- Initial memory allocation: 80G
- Max memory allocation: 112g
New values:
- Initial memory allocation: 20G
- Max memory allocation: 26G
Within NiFi we also changed the General settings from:
- Maximum timer driven thread count: 64
- Maximum event driven thread count: 16
to:
- Maximum timer driven thread count: 2
- Maximum event driven thread count: 4
Now, sometimes when we try to load, we get an unexpected error with:
2020-08-18 10:57:51,942 ERROR [NiFi logging handler] org.apache.nifi.StdErr OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x0000000780000000, 218103808, 0) failed; error='Cannot allocate memory' (errno=12)
2020-08-18 10:57:51,956 INFO [NiFi logging handler] org.apache.nifi.StdOut # There is insufficient memory for the Java Runtime Environment to continue.
2020-08-18 10:57:51,957 INFO [NiFi logging handler] org.apache.nifi.StdOut # Native memory allocation (mmap) failed to map 218103808 byt
Could you please help me with this issue?
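To put the numbers from this post side by side, here is a small arithmetic sketch. It converts the size of the failed allocation into MiB and computes how much RAM remains outside the maximum Java heap on the downsized machine. It treats the "G" values as GiB, which is an assumption, and the variable names are only for illustration.

```python
MIB = 1024 ** 2

failed_request_bytes = 218103808  # from the os::commit_memory message
ram_gib = 32                      # machine size after the downgrade
max_heap_gib = 26                 # "Max memory allocation" after the downgrade

failed_request_mib = failed_request_bytes / MIB
non_heap_headroom_gib = ram_gib - max_heap_gib

print(f"failed request: {failed_request_mib:.0f} MiB")
print(f"RAM left outside the Java heap: {non_heap_headroom_gib} GiB")
```

That remaining headroom has to cover the operating system, the JVM's own native memory (thread stacks, metaspace, direct buffers) and any other process on the host, so a failed request of this size suggests the host had very little free memory at that moment. What else was consuming memory is not visible from the log excerpt.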
Labels:
- Apache Ambari
- Apache Hadoop
- Apache NiFi
03-17-2020
11:57 AM
Hi,
I have a Hadoop environment with 2 NiFi nodes running. I have not been able to connect to the NiFi UI since yesterday, and I tried to restart one of the nodes without any success.
In nifi-app.log, this is the only ERROR-level message I see:
2020-03-17 13:42:37,974 ERROR [main] o.a.n.c.c.node.NodeClusterCoordinator Event Reported for ***-nf02.****.ca:9091 -- Node disconnected from cluster due to org.apache.nifi.controller.UninheritableFlowException: Failed to connect node to cluster because local flow is different than cluster flow.
Could you please help?
Labels:
- Apache NiFi
12-30-2019
02:55 PM
Hi, I have a Hadoop cluster (3.0.1) with 3 JournalNodes, 1 NFS Gateway node and 6 worker nodes. I connected to the worker nodes by SSH today and noticed, by running "df -h", that one local disk (/data/4) is around 94% used on every worker node, whereas the other disks are between 50% and 65%. The HDFS status, on the other hand, is the following:
- Disk Usage (DFS Used): 44.77% (28.1 TB / 62.8 TB)
- Disk Usage (Non DFS Used): 14.97% (9.4 TB / 62.8 TB)
- Disk Remaining: 40.26% (25.3 TB / 62.8 TB)
What should I check to make sure that a full local disk won't cause any issues?
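The "df -h" check described above can be repeated across mounts with a few lines of Python. This is a minimal sketch using only the standard library. The helper names and the 90% threshold are illustrative choices, and of the DataNode mounts only /data/4 is named in this post.

```python
import shutil

def percent_used(path):
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def over_threshold(paths, threshold=90.0):
    """Return {path: percent} for mounts at or above `threshold` percent."""
    result = {}
    for path in paths:
        pct = percent_used(path)
        if pct >= threshold:
            result[path] = round(pct, 1)
    return result

# "/" is used so the sketch runs anywhere; on a worker node, pass the
# DataNode mount points instead (for example "/data/4").
print(over_threshold(["/"], threshold=0.0))
```

The percentage can differ slightly from the one "df" prints, because "df" accounts for blocks reserved for root. The sketch only reports which mounts are close to full and says nothing about why a single disk fills faster than the others.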
11-21-2019
10:59 AM
Hi @Shelton, when you said authorizations.xml, were you talking about authorizers.xml? The Hadoop environment uses Ranger for security and is also connected to an LDAP server for users and groups. I don't see any users.xml in the conf directory.
11-21-2019
09:08 AM
Hi, We are currently using HDP 3.0 with Ambari and we installed 2 NiFi nodes. We made some config changes on NiFi node01 without restarting both nodes (I only restarted node01, not node02). The changes were not working properly, so we decided to roll back to the previous configs, but whenever I try to start node01 I get the following error: Failed to connect node to cluster because local flow is different than cluster flow. My guess is that the two nodes are out of sync. How can we fix this issue? Thank you for your help.
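One way to test the "out of sync" guess is to compare the flow definition stored on each node. NiFi 1.x keeps it in conf/flow.xml.gz by default. The sketch below hashes the decompressed content of two such files, so differences in the gzip header do not matter. The function names are made up for illustration, and both files are assumed to have been copied to one machine first.

```python
import gzip
import hashlib

def flow_digest(path):
    """SHA-256 of the decompressed flow, so gzip header timestamps do not matter."""
    digest = hashlib.sha256()
    with gzip.open(path, "rb") as handle:
        # Read in 1 MiB chunks to keep memory use flat for large flows.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_flow(path_a, path_b):
    """True when both files decompress to byte-identical content."""
    return flow_digest(path_a) == flow_digest(path_b)
```

With hypothetical file names, `same_flow("node01_flow.xml.gz", "node02_flow.xml.gz")` returning False would support the guess. NiFi computes its own fingerprint of the flow when a node joins, so identical bytes imply matching flows, but differing bytes do not by themselves prove that NiFi would reject the node.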
09-03-2019
01:28 PM
Hi @nshawa, I am getting the following error on the PutHiveStreaming processor after running the template you provided: Any idea how to fix this?
06-27-2019
12:51 AM
Hi,
I exported a table from a Hadoop environment using the following command:
export table department to 'hdfs_exports_location/department';
I tried to import the same table into another Hadoop environment using the command:
import from 'hdfs_exports_location/department';
I get the following error:
Error: Error while compiling statement: FAILED: SemanticException [Error 10027]: Invalid path (state=42000,code=10027)
I then tried:
import table imported_dept from 'hdfs_exports_location/department';
I get the following error:
Error: Error while compiling statement: FAILED: SemanticException [Error 10324]: Import Semantic Analyzer Error (state=42000,code=10324)
Any idea what could be the issue? I am using Hive 3.1.0.3.0.1.0-187. Thank you.
Labels:
- Apache Hive