Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How to rerun failed tpch data generation

The tpch-setup.sh script errored out in the middle of data generation. On investigation, the cause was the NameNode going down.

The NameNode shut down because it could not reach a quorum of JournalNodes.

2016-07-26 08:51:14,250 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [, stream=QuorumOutputStream starting at txid 102237))

org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:

Should we rerun tpch-setup.sh? If so, how? I see that a couple of tables have already been created under the tpch_flat_orc_1000 database. Should we drop them before rerunning?

1 ACCEPTED SOLUTION

To rerun the failed tpch data generation, drop the tpch database together with its tables, and clean the HDFS /tmp/tpch_generator directory.

Once these steps are complete, you can restart the tpch data generator.
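
The steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the database name tpch_flat_orc_1000 comes from the question, the HDFS path is the one named in this answer, the scale factor 1000 is inferred from the database name, and `run` and `cleanup_tpch` are hypothetical helper names. By default it only prints the commands; set DRY_RUN=0 to execute them once you have confirmed the names match your setup and the NameNode is healthy again.

```shell
#!/bin/sh
# Sketch: clean up after a failed tpch-setup.sh run, then rerun it.
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: $1 = Hive database to drop, $2 = HDFS directory to remove.
cleanup_tpch() {
  # CASCADE drops the tables inside the database along with the database.
  run hive -e "DROP DATABASE IF EXISTS $1 CASCADE;"
  # -skipTrash deletes immediately instead of moving the data to the HDFS trash.
  run hdfs dfs -rm -r -skipTrash "$2"
}

cleanup_tpch tpch_flat_orc_1000 /tmp/tpch_generator

# Assumption: the original run used scale factor 1000.
run ./tpch-setup.sh 1000
```

If other databases were created by the same run, they can be dropped the same way by calling `cleanup_tpch` again with the relevant name.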

3 REPLIES

@sgowda

Could you please check whether there are any errors in the JournalNode logs?
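
One quick way to do that check is to filter the log for fatal errors and exceptions. A minimal sketch, assuming you pass the path to a JournalNode log file yourself (its location depends on the installation); `journal_errors` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: list lines in a JournalNode log that mention a fatal error,
# an error, or an exception, prefixed with their line numbers.
# $1 = path to the log file (the location varies by installation).
journal_errors() {
  grep -nE 'FATAL|ERROR|Exception' "$1"
}
```

Call it as `journal_errors <path-to-journalnode-log>` on each JournalNode host and compare the timestamps with the time of the NameNode failure.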

I see this error in the JournalNode log:

2016-07-26 08:50:59,086 INFO ipc.Server (Server.java:logException(2401)) - IPC Server handler 2 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.heartbeat from XXXX:56078 Call#193039 Retry#0
java.io.IOException: IPC's epoch 8 is less than the last promised epoch 9
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.heartbeat(Journal.java:417)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2016-07-26 08:51:02,529 INFO ipc.Server (Server.java:logException(2401)) - IPC Server handler 4 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.heartbeat from XXXX:56078 Call#193040 Retry#0
java.io.IOException: IPC's epoch 8 is less than the last promised epoch 9
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.heartbeat(Journal.java:417)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2016-07-26 08:52:05,793 INFO ipc.Server (Server.java:saslProcess(1538)) - Auth successful for nn/cXXXX@XXXX.VIBGYOR.COM (auth:KERBEROS)
2016-07-26 08:52:05,802 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for nn/cXXXX@XXXX.VIBGYOR.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol
2016-07-26 08:52:11,368 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 102668-102669 took 2215ms
2016-07-26 08:52:21,558 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 102670-102677 took 9435ms
2016-07-26 08:52:23,442 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 102678-102714 took 1643ms
2016-07-26 08:53:36,609 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 107956-107956 took 2027ms
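
The WARN entries in this log report edit-log syncs taking between 1221 ms and 9435 ms. To see how often that happens over a whole log, the durations can be pulled out with awk. This is a rough sketch, assuming a log file with one entry per line in the format shown here; `slow_syncs` is a hypothetical helper name and the threshold is whatever you choose:

```shell
#!/bin/sh
# Sketch: print the timestamp and duration of every
# "Sync of transaction range ... took Nms" entry at or above a threshold.
# $1 = path to a JournalNode log file, $2 = threshold in milliseconds.
slow_syncs() {
  awk -v limit="$2" '
    /Sync of transaction range/ {
      ms = $NF              # last field, e.g. "2215ms"
      sub(/ms$/, "", ms)    # strip the unit so the value compares numerically
      if (ms + 0 >= limit) print $1, $2, ms " ms"
    }' "$1"
}
```

For example, `slow_syncs <path-to-journalnode-log> 2000` lists only the syncs that took two seconds or longer. Frequent long syncs may point to slow disk I/O on the JournalNode hosts, though this log alone does not prove it.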
