Explorer
Posts: 7
Registered: ‎08-19-2014

Read ahead of logs error while running large insert query

Hi,

 

We are running a large INSERT IGNORE INTO query that selects data from one Kudu table and, after some modification, reinserts it into another Kudu table.
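For context, a minimal sketch of the kind of statement we run (the table and column names here are placeholders, not our real schema):

```sql
-- Read from one Kudu table, transform, and write into another.
-- INSERT IGNORE skips rows whose primary key already exists in the target.
INSERT IGNORE INTO target_kudu_table
SELECT id,
       TRIM(name)          AS name,
       amount * 100        AS amount_cents
FROM source_kudu_table;
```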


While this query is running we see our logs fill up with a lot of the following error messages:

 

T c0f417f06cf4423cba426b9d4dbf95eb P 7390d2dddae2449db5547c8d4efbd839 [LEADER]: Error trying to read ahead of the log while preparing peer request: Incomplete: Op with index 900 is ahead of the local log (next sequential op: 900). Destination peer: Peer: 801c9bc30b2d4551a3833a5821532c31, Is new: false, Last received: 291.900, Next index: 901, Last known committed idx: 900, Last exchange result: ERROR, Needs tablet copy: false

 

We are unsure where this error is coming from and what the impact is. We are running Kudu 1.0.0-1.kudu1.0.0.p0.6 with Impala_Kudu 2.7.0-1.cdh5.9.0.p0.23 and CDH 5.8.0-1.cdh5.8.0.p0.42. 

 

Can anyone point us in the direction of solving this?

Thank you,


Vincent 

Cloudera Employee
Posts: 6
Registered: ‎10-30-2015

Re: Read ahead of logs error while running large insert query

[ Edited ]

Hi Vincent,

 

  This seems to be an instance of https://issues.apache.org/jira/browse/KUDU-1078.

  Our previous thinking was that this was a transient error that would eventually sort itself out, though your particular use case of reading and writing at the same time might make things worse.

 

  A few questions:

  

  Is this actually causing the write to fail? What error do you see on the client side?

  What is the cluster size and rough specs?

  How big are these reads/writes, e.g. are you rewriting the whole output table each time? How big is it?

  How many disks do you have? Are they flash? Did you specify a separate disk for the WAL (i.e., did you pass the --fs_wal_dir flag)?
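  For reference, WAL placement is set with the tablet server's --fs_wal_dir flag alongside --fs_data_dirs. A sketch of what that looks like in a flag file (the file path and mount points below are examples, not your actual layout):

```
# tserver gflag file (path varies by install, e.g. /etc/kudu/conf/)
# Put the WAL on its own device, ideally an SSD, separate from the data dirs:
--fs_wal_dir=/ssd0/kudu/wal
--fs_data_dirs=/data1/kudu,/data2/kudu
```

  Keeping the WAL off the data disks reduces fsync contention during heavy writes.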

 

Best

David

  

 

New Contributor
Posts: 1
Registered: ‎09-13-2017

Re: Read ahead of logs error while running large insert query

Hi David,

I have encountered the same issue; the error is below:

java.sql.SQLException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: ERROR_STATE, SQL state: 
Kudu error(s) reported, first error: Timed out: Failed to write batch of 4660 ops to tablet 853dfb9cec5140b39fdd07eca39571fa after 1 attempt(s): Failed to write to server: d2ba78ce2bf742fab181e481a1b2a9f3 (): Write RPC to timed out after 179.978s (SENT)
, Query: UPSERT INTO leap.EM_SZL2_V5_ORDERQUEUE_KUDU (unix,securityid,tradedate,datatimestamp,lastpx,side,price,qty,numorders,noorders,orderqty01,orderqty02,orderqty03,orderqty04,orderqty05,orderqty06,orderqty07,orderqty08,orderqty09,orderqty10,orderqty11,orderqty12,orderqty13,orderqty14,orderqty15,orderqty16,orderqty17,orderqty18,orderqty19,orderqty20,orderqty21,orderqty22,orderqty23,orderqty24,orderqty25,orderqty26,orderqty27,orderqty28,orderqty29,orderqty30,orderqty31,orderqty32,orderqty33,orderqty34,orderqty35,orderqty36,orderqty37,orderqty38,orderqty39,orderqty40,orderqty41,orderqty42,orderqty43,orderqty44,orderqty45,orderqty46,orderqty47,orderqty48,orderqty49,orderqty50,endofdaymaker) SELECT unix,securityid,tradedate,datatimestamp,lastpx,side,price,qty,numorders,noorders,orderqty01,orderqty02,orderqty03,orderqty04,orderqty05,orderqty06,orderqty07,orderqty08,orderqty09,orderqty10,orderqty11,orderqty12,orderqty13,orderqty14,orderqty15,orderqty16,orderqty17,orderqty18,orderqty19,orderqty20,orderqty21,orderqty22,orderqty23,orderqty24,orderqty25,orderqty26,orderqty27,orderqty28,orderqty29,orderqty30,orderqty31,orderqty32,orderqty33,orderqty34,orderqty35,orderqty36,orderqty37,orderqty38,orderqty39,orderqty40,orderqty41,orderqty42,orderqty43,orderqty44,orderqty45,orderqty46,orderqty47,orderqty48,orderqty49,orderqty50,endofdaymaker FROM leap.EM_SZL2_V5_ORDERQUEUE_HDFS WHERE datemonth='201607' AND TradeDate >= '20160701' AND TradeDate <= '20160731'.
	at com.cloudera.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
	at com.cloudera.hivecommon.api.HS2Client.executeStatement(Unknown Source)
	at com.cloudera.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeHelper(Unknown Source)
	at com.cloudera.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown Source)
	at com.cloudera.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
	at com.cloudera.jdbc.common.SPreparedStatement.executeUpdate(Unknown Source)
	at com.eastmoney.util.ImpalaManager$.templateUpdate(ImpalaManager.scala:130)
	at com.eastmoney.util.ImpalaManager$.execSimpleSQL(ImpalaManager.scala:20)
	at com.eastmoney.transmit.TransmitProcessor.run(TransmitProcessor.scala:69)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
Caused by: com.cloudera.support.exceptions.GeneralException: [Simba][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: ERROR_STATE, SQL state: 
Kudu error(s) reported, first error: Timed out: Failed to write batch of 4660 ops to tablet 853dfb9cec5140b39fdd07eca39571fa after 1 attempt(s): Failed to write to server: d2ba78ce2bf742fab181e481a1b2a9f3 (): Write RPC to  timed out after 179.978s (SENT)

 

I tried to insert more than 20 GB of data. During the insertion, the tablet leader kept shutting down and a new leader kept being elected.

 

Besides that, I also found that the rowset count of this tablet is more than 600. Which option configures this?

 

Looking forward to your reply.

 

Thanks,

Tony
