Member since: 07-21-2023
Posts: 5
Kudos Received: 2
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 585 | 05-15-2024 07:08 AM |
05-15-2024
07:08 AM
1 Kudo
Hello @Scharan, I resolved this by setting a number of Oozie jobs to a status of "FAILED" directly in the Oozie database. They were rogue jobs stuck at a status of RUNNING in the database. Thanks for your input and replies! Ian.
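For anyone who lands here with the same symptom, a rough sketch of the kind of update I mean, assuming the default Oozie metastore schema (workflow jobs in a WF_JOBS table with a status column). The JDBC URL, credentials and job id below are placeholders; I would stop the Oozie server and back up the database before touching anything:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Sketch only: marks one rogue workflow job as FAILED directly in the Oozie
// database. Assumes the default schema (WF_JOBS table with a status column);
// connection details and the job id are placeholders, not real values.
public class MarkRogueOozieJobFailed {
    public static void main(String[] args) throws Exception {
        String url  = "jdbc:mysql://oozie-db-host:3306/oozie"; // placeholder
        String user = "oozie";                                  // placeholder
        String pass = "changeme";                               // placeholder
        String rogueJobId = "0000000-000000000000000-oozie-oozi-W"; // placeholder

        try (Connection conn = DriverManager.getConnection(url, user, pass);
             PreparedStatement ps = conn.prepareStatement(
                     "UPDATE WF_JOBS SET status = 'FAILED' "
                   + "WHERE id = ? AND status = 'RUNNING'")) {
            ps.setString(1, rogueJobId);
            int updated = ps.executeUpdate();
            System.out.println("Rows updated: " + updated);
        }
    }
}
```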
04-23-2024
08:54 AM
Hello @Scharan, thanks again for the lead. I'll look into this and report back. Thanks, Ian.
04-22-2024
03:46 AM
1 Kudo
Hello @Scharan, thanks for the advice. I've tried setting that property in the job.properties file. After the service restart I'm no longer getting the E0606 error. However, the PurgeXCommand error remains and the service still fails to stay up. I'm beginning to suspect that I have an incorrect component version installed, as it looks like PurgeXCommand is the issue. The error is being raised by openjpa-2.2.2-r422266, and I'm wondering if this version is incompatible with Cloudera Express 5.6.0.
04-17-2024
05:31 AM
Hi All,

After starting up a cluster that has been shut down for some time, I find that one of the Oozie servers will not stay up. It initially starts, then goes down. There are two errors reported in the log file. I assume that one of them is responsible, but I'm not sure which.

First error:
Source: PurgeXCommand
Message: <openjpa-2.2.2-r422266:1468616 fatal general error> org.apache.openjpa.persistence.PersistenceException: The column index is out of range: 3, number of columns: 2. FailedObject: select w.id, w.parentId from WorkflowJobBean w where w.endTimestamp < :endTime and w.parentId like '%C@%' [java.lang.String]

Second error:
Source: CoordStatusTransitXCommand
Message: SERVER[<redacted>] USER[<redacted>] GROUP[-] TOKEN[] APP[<redacted>] JOB[0000068-201217031325641-oozie-oozi-C] ACTION[-] XException, org.apache.oozie.command.CommandException: E0606: Could not get lock [coord_status_transit_b60ff85c-fe55-4bf5-8003-ee73935ca076], timed out [0]ms

Any guidance would be appreciated.

Best regards, Ian.
Labels:
- Apache Oozie
- Cloudera Manager
07-21-2023
04:49 AM
Hello All,

I'm troubleshooting the following issue with our Cloudera Nutch cluster and would appreciate any help the community can offer. We have two NameNode roles and three JournalNode roles running; however, both NameNode roles are failing to start and are reporting the error below (IP addresses obfuscated). This occurred following a restart of the underlying hosts. Any recommendations for a recovery path from this error would be greatly appreciated.

Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [x.x.x.95:8485, x.x.x.86:8485, x.x.x.130:8485], stream=null))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 1 successful responses:
x.x.x.130:8485: null [success]
2 exceptions thrown:
x.x.x.95:8485: tried to access method com.google.common.collect.Range.<init>(Lcom/google/common/collect/Cut;Lcom/google/common/collect/Cut;)V from class com.google.common.collect.Ranges
at com.google.common.collect.Ranges.create(Ranges.java:76)
at com.google.common.collect.Ranges.closed(Ranges.java:98)
at org.apache.hadoop.hdfs.qjournal.server.Journal.txnRange(Journal.java:872)
at org.apache.hadoop.hdfs.qjournal.server.Journal.acceptRecovery(Journal.java:806)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.acceptRecovery(JournalNodeRpcServer.java:206)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.acceptRecovery(QJournalProtocolServerSideTranslatorPB.java:261)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25435)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
x.x.x.86:8485: tried to access method com.google.common.collect.Range.<init>(Lcom/google/common/collect/Cut;Lcom/google/common/collect/Cut;)V from class com.google.common.collect.Ranges
at com.google.common.collect.Ranges.create(Ranges.java:76)
at com.google.common.collect.Ranges.closed(Ranges.java:98)
at org.apache.hadoop.hdfs.qjournal.server.Journal.txnRange(Journal.java:872)
at org.apache.hadoop.hdfs.qjournal.server.Journal.acceptRecovery(Journal.java:806)
at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.acceptRecovery(JournalNodeRpcServer.java:206)
at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.acceptRecovery(QJournalProtocolServerSideTranslatorPB.java:261)
at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25435)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnclosedSegment(QuorumJournalManager.java:345)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:455)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1408)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1201)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1717)
at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1590)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1351)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
Labels:
- Cloudera Manager
- HDFS