Problem writing from Spark Streaming to HBase

We have an application that reads messages from specific Kafka topics and processes them. Each time it reads messages from a topic, it writes the offsets to an HBase table.

After running for a while (anywhere from 30 minutes to 15 hours), the application fails. In the driver stderr we see the following log entries:

18/04/17 17:31:15 WARN client.AsyncProcess: #3121, the task was rejected by the pool. This is unexpected. Server is ***hostname masked***,60020,1523949367813
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@3f377224 rejected from java.util.concurrent.ThreadPoolExecutor@639d4dae[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:1013)
at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$000(AsyncProcess.java:600)
at org.apache.hadoop.hbase.client.AsyncProcess.submitMultiActions(AsyncProcess.java:449)
at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:429)
at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:238)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:190)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1495)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1098)
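
For context, the `[Terminated, pool size = 0, ...]` part of the message above says the executor had already been shut down when the flush tried to submit work to it. Below is a minimal sketch that uses only `java.util.concurrent` to reproduce the same exception type. It is not our application code and not the HBase client; the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical stand-alone illustration; no HBase or Spark classes involved.
class RejectedAfterShutdown {

    // Returns true if a task submitted after shutdown() is rejected.
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.submit(() -> { });   // accepted while the pool is still running
        pool.shutdown();          // from here on, new submissions are refused
        try {
            pool.submit(() -> { });   // goes through AbstractExecutorService.submit
            return false;
        } catch (RejectedExecutionException e) {
            // Default AbortPolicy throws, as in the stack trace above.
            System.out.println(e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdownIsRejected());
    }
}
```

If the thread pool behind our HBase table handle were shut down in a similar way while the streaming job was still writing offsets, it would produce this kind of rejection. We have not confirmed that this is what happens in our job.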

 

 

After some time, these errors follow:

18/04/17 17:31:15 ERROR client.AsyncProcess: Cannot get replica 0 location for {"totalColumns":1,"row":"predictor_passport_ru_number_gold","families":{"cf":[{"qualifier":"\\x00\\x00\\x00\\x00","vlen":8,"tag":[],"timestamp":9223372036854775807}]}}
18/04/17 17:31:15 ERROR spark.Utils: Error saving offsets [OffsetRange(topic: 'predictor_passport_ru_number_gold', partition: 0, range: [2536631 -> 2536718])]
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: IOException: 1 time,
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:247)
at org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:227)
at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1766)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:240)
at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:190)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1495)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1098)

In the HBase logs I see a gap in messages during that period:

memstoreflush.PNG

Please help us investigate and solve this issue.
