Created 07-27-2016 05:06 AM
tpch-setup.sh errored out in the middle of data generation. On investigation, the cause was the NameNode going down.
The NameNode shut down because a quorum of JournalNodes was not available.
2016-07-26 08:51:14,250 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [, stream=QuorumOutputStream starting at txid 102237))
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
Should we re-run tpch-setup.sh? If so, how? I see that a couple of tables have already been created under the tpch_flat_orc_1000 database. Should we drop them and re-run?
Created 07-31-2016 07:16 AM
To re-run the failed TPC-H data generation, drop the tpch database along with its tables, and clean up the HDFS directory /tmp/tpch_generator.
Once those steps are complete, you can restart the TPC-H data generator.
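The cleanup steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: the database name tpch_flat_orc_1000 is taken from the question, the directory /tmp/tpch_generator from this answer, and the `hive` and `hdfs` clients are assumed to be on the PATH with a valid Kerberos ticket. The function name `cleanup_commands` and the `--execute` flag are made up for illustration. By default the script only prints the commands so they can be reviewed first.

```python
import shlex
import subprocess
import sys


def cleanup_commands(database="tpch_flat_orc_1000", hdfs_dir="/tmp/tpch_generator"):
    """Return the cleanup commands as argument lists, without running them.

    Both defaults are assumptions taken from this thread; adjust them to
    match the database and staging directory of your own run.
    """
    return [
        # CASCADE drops the tables inside the database together with it.
        ["hive", "-e", f"DROP DATABASE IF EXISTS {database} CASCADE;"],
        # -f keeps the command from failing if the directory is already gone.
        ["hdfs", "dfs", "-rm", "-r", "-f", hdfs_dir],
    ]


def main(argv):
    # Dry run unless --execute (a hypothetical flag for this sketch) is passed.
    execute = "--execute" in argv
    for cmd in cleanup_commands():
        print(shlex.join(cmd))
        if execute:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Run it once without arguments to check the printed commands, then again with `--execute` once the NameNode is healthy.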
Created 07-27-2016 05:10 AM
Could you please check whether there are any errors in the JournalNode logs?
Created 07-27-2016 05:26 AM
I see this error:
2016-07-26 08:50:59,086 INFO ipc.Server (Server.java:logException(2401)) - IPC Server handler 2 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.heartbeat from XXXX:56078 Call#193039 Retry#0
java.io.IOException: IPC's epoch 8 is less than the last promised epoch 9
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.heartbeat(Journal.java:417)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2016-07-26 08:51:02,529 INFO ipc.Server (Server.java:logException(2401)) - IPC Server handler 4 on 8485, call org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol.heartbeat from XXXX:56078 Call#193040 Retry#0
java.io.IOException: IPC's epoch 8 is less than the last promised epoch 9
    at org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:428)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.heartbeat(Journal.java:417)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.heartbeat(JournalNodeRpcServer.java:158)
    at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.heartbeat(QJournalProtocolServerSideTranslatorPB.java:172)
    at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25423)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2016-07-26 08:52:05,793 INFO ipc.Server (Server.java:saslProcess(1538)) - Auth successful for nn/cXXXX@XXXX.VIBGYOR.COM (auth:KERBEROS)
2016-07-26 08:52:05,802 INFO authorize.ServiceAuthorizationManager (ServiceAuthorizationManager.java:authorize(137)) - Authorization successful for nn/cXXXX@XXXX.VIBGYOR.COM (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocol
2016-07-26 08:52:11,368 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 102668-102669 took 2215ms
2016-07-26 08:52:21,558 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 102670-102677 took 9435ms
2016-07-26 08:52:23,442 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 102678-102714 took 1643ms
2016-07-26 08:53:36,609 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 107956-107956 took 1221ms
2016-07-26 08:54:38,931 WARN server.Journal (Journal.java:journal(398)) - Sync of transaction range 112005-112005 took 2027ms
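The WARN entries in this log report edit-log syncs on the JournalNode that took between one and nine seconds. To see how often that happens across a whole log file, something like the sketch below could pull the durations out. It assumes only the message format visible in the lines above; the function name `slow_syncs` and the default threshold are made up for illustration, and the sample text is copied from this log.

```python
import re

# Matches the JournalNode warning format seen in the log above.
SYNC_RE = re.compile(r"Sync of transaction range (\d+)-(\d+) took (\d+)ms")


def slow_syncs(log_text, threshold_ms=1000):
    """Return (first_txid, last_txid, millis) for each sync at or above threshold_ms."""
    results = []
    for match in SYNC_RE.finditer(log_text):
        first, last, millis = (int(group) for group in match.groups())
        if millis >= threshold_ms:
            results.append((first, last, millis))
    return results


# Two entries copied from the JournalNode log in this post.
sample = (
    "2016-07-26 08:52:11,368 WARN server.Journal (Journal.java:journal(398)) - "
    "Sync of transaction range 102668-102669 took 2215ms\n"
    "2016-07-26 08:52:21,558 WARN server.Journal (Journal.java:journal(398)) - "
    "Sync of transaction range 102670-102677 took 9435ms\n"
)

for first, last, millis in slow_syncs(sample, threshold_ms=5000):
    print(f"txids {first}-{last}: {millis} ms")
```

Because the pattern is matched with `finditer` over the whole text, it also works when the log has been pasted as a single long line.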