Explorer
Posts: 7
Registered: ‎08-19-2014

Read ahead of logs error while running large insert query

Hi,

 

We are running a large INSERT IGNORE INTO query that selects data from a Kudu table and, after some modification, reinserts it into another Kudu table.
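The pattern looks roughly like the sketch below; the table and column names are placeholders, not our actual schema:

-- Placeholder schema: read from one Kudu table, modify, and re-insert into
-- another. INSERT IGNORE (Impala_Kudu syntax) skips rows whose primary key
-- already exists in the destination instead of failing the statement.
INSERT IGNORE INTO dest_kudu_table
SELECT id,                          -- primary key
       upper(payload) AS payload,   -- stands in for "some modification"
       created_ts
FROM source_kudu_table;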


While this query is running, our logs fill up with many instances of the following error message:

 

T c0f417f06cf4423cba426b9d4dbf95eb P 7390d2dddae2449db5547c8d4efbd839 [LEADER]: Error trying to read ahead of the log while preparing peer request: Incomplete: Op with index 900 is ahead of the local log (next sequential op: 900). Destination peer: Peer: 801c9bc30b2d4551a3833a5821532c31, Is new: false, Last received: 291.900, Next index: 901, Last known committed idx: 900, Last exchange result: ERROR, Needs tablet copy: false

 

We are unsure where this error is coming from and what the impact is. We are running Kudu 1.0.0-1.kudu1.0.0.p0.6 with Impala_Kudu 2.7.0-1.cdh5.9.0.p0.23 and CDH 5.8.0-1.cdh5.8.0.p0.42. 

 

Can anyone point us in the direction of solving this?

Thank you,


Vincent 

Cloudera Employee
Posts: 6
Registered: ‎10-30-2015

Re: Read ahead of logs error while running large insert query


Hi Vincent,

 

  This seems to be an instance of https://issues.apache.org/jira/browse/KUDU-1078.

  Our previous thinking was that this was a transient error that would eventually sort itself out, though your particular use case of reading and writing at the same time might make things worse.

 

  A few questions:

  Is this actually causing the write to fail? What error do you see on the client side?

  What are the cluster size and rough specs?

  How big are these reads/writes? For example, are you rewriting the whole output table each time, and how big is it? (Rough row counts like the sketch below are fine.)

  How many disks do you have? Are they flash? Did you specify a separate disk for the WAL (i.e., did you pass the --fs_wal_dir flag)?
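
  A minimal sketch for those counts; the table names here are placeholders for your actual source and destination tables:

-- Placeholder table names; substitute the real Kudu tables.
SELECT COUNT(*) FROM source_kudu_table;  -- rows scanned per run
SELECT COUNT(*) FROM dest_kudu_table;    -- current size of the output table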

 

Best

David

  

 

New Contributor
Posts: 1
Registered: ‎09-13-2017

Re: Read ahead of logs error while running large insert query

Hi David,

I have encountered the same issue; the error is as below:

java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: ERROR_STATE, SQL state: 
Kudu error(s) reported, first error: Timed out: Failed to write batch of 4660 ops to tablet 853dfb9cec5140b39fdd07eca39571fa after 1 attempt(s): Failed to write to server: d2ba78ce2bf742fab181e481a1b2a9f3 (): Write RPC to timed out after 179.978s (SENT)
, Query: UPSERT INTO leap.EM_SZL2_V5_ORDERQUEUE_KUDU (unix,securityid,tradedate,datatimestamp,lastpx,side,price,qty,numorders,noorders,orderqty01,orderqty02,orderqty03,orderqty04,orderqty05,orderqty06,orderqty07,orderqty08,orderqty09,orderqty10,orderqty11,orderqty12,orderqty13,orderqty14,orderqty15,orderqty16,orderqty17,orderqty18,orderqty19,orderqty20,orderqty21,orderqty22,orderqty23,orderqty24,orderqty25,orderqty26,orderqty27,orderqty28,orderqty29,orderqty30,orderqty31,orderqty32,orderqty33,orderqty34,orderqty35,orderqty36,orderqty37,orderqty38,orderqty39,orderqty40,orderqty41,orderqty42,orderqty43,orderqty44,orderqty45,orderqty46,orderqty47,orderqty48,orderqty49,orderqty50,endofdaymaker) SELECT unix,securityid,tradedate,datatimestamp,lastpx,side,price,qty,numorders,noorders,orderqty01,orderqty02,orderqty03,orderqty04,orderqty05,orderqty06,orderqty07,orderqty08,orderqty09,orderqty10,orderqty11,orderqty12,orderqty13,orderqty14,orderqty15,orderqty16,orderqty17,orderqty18,orderqty19,orderqty20,orderqty21,orderqty22,orderqty23,orderqty24,orderqty25,orderqty26,orderqty27,orderqty28,orderqty29,orderqty30,orderqty31,orderqty32,orderqty33,orderqty34,orderqty35,orderqty36,orderqty37,orderqty38,orderqty39,orderqty40,orderqty41,orderqty42,orderqty43,orderqty44,orderqty45,orderqty46,orderqty47,orderqty48,orderqty49,orderqty50,endofdaymaker FROM leap.EM_SZL2_V5_ORDERQUEUE_HDFS WHERE datemonth='201607' AND TradeDate >= '20160701' AND TradeDate <= '20160731'.
	at com.cloudera.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
	at com.cloudera.hivecommon.api.HS2Client.executeStatement(Unknown Source)
	at com.cloudera.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeHelper(Unknown Source)
	at com.cloudera.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown Source)
	at com.cloudera.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
	at com.cloudera.jdbc.common.SPreparedStatement.executeUpdate(Unknown Source)
	at com.eastmoney.util.ImpalaManager$.templateUpdate(ImpalaManager.scala:130)
	at com.eastmoney.util.ImpalaManager$.execSimpleSQL(ImpalaManager.scala:20)
	at com.eastmoney.transmit.TransmitProcessor.run(TransmitProcessor.scala:69)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Caused by: com.cloudera.support.exceptions.GeneralException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: ERROR_STATE, SQL state: 
Kudu error(s) reported, first error: Timed out: Failed to write batch of 4660 ops to tablet 853dfb9cec5140b39fdd07eca39571fa after 1 attempt(s): Failed to write to server: d2ba78ce2bf742fab181e481a1b2a9f3 (): Write RPC to  timed out after 179.978s (SENT)

 

I tried to insert more than 20 GB of data into it. During the insertion, the tablet leader kept shutting down and a new leader kept being elected.
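
As a workaround I am considering splitting the month into smaller TradeDate windows so that each statement writes a smaller batch. A rough sketch; the column list is abbreviated here and would really be the full list from the query above:

-- Sketch: one day per statement instead of the whole month, so each UPSERT
-- writes a much smaller batch. Column list abbreviated for readability.
UPSERT INTO leap.EM_SZL2_V5_ORDERQUEUE_KUDU (unix, securityid, tradedate, datatimestamp)
SELECT unix, securityid, tradedate, datatimestamp
FROM leap.EM_SZL2_V5_ORDERQUEUE_HDFS
WHERE datemonth = '201607'
  AND TradeDate = '20160701';  -- then '20160702', and so on through the month

Assuming the rows are spread fairly evenly across the month, each daily statement would write roughly a thirtieth of the 20 GB.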

 

Besides that, I also found that the rowset count of this tablet is more than 600. Which option can be used to configure this?

 

Looking forward to your reply.

 

Thanks,

Tony
