
DataFrame.collect or show throwing java.lang.IllegalArgumentException while reading data from HBase


Here is my Java code that reads data from an HBase table:

 

SQLContext sqlContext = new SQLContext(jsc);

// Map HBase columns to DataFrame fields: the row key plus three qualifier columns.
HashMap<String, String> colMap = new HashMap<String, String>();
colMap.put("hbase.columns.mapping",
        "KEY_FIELD STRING :key, rowCount DOUBLE p:rowCount, diskSize DOUBLE p:diskSize, refreshTime STRING r:lastRefreshTime");
colMap.put("hbase.table", hbaseTableName);

// Load the table through the hbase-spark connector.
DataFrame df = sqlContext.read().format("org.apache.hadoop.hbase.spark").options(colMap).load();

df.registerTempTable("hbasedata");
DataFrame result = sqlContext.sql("SELECT rowCount FROM hbasedata");
long count = result.count();
System.out.println("Count of results: " + count);
result.show();

 

The count is returned correctly, but when I try to view the data in the result DataFrame, it throws the exception below:

 

java.lang.IllegalArgumentException: offset (79) + length (8) exceed the capacity of the array: 80
    at org.apache.hadoop.hbase.util.Bytes.explainWrongLengthOrOffset(Bytes.java:629)
    at org.apache.hadoop.hbase.util.Bytes.toLong(Bytes.java:603)
    at org.apache.hadoop.hbase.util.Bytes.toDouble(Bytes.java:727)
    at org.apache.hadoop.hbase.types.RawDouble.decode(RawDouble.java:63)
    at org.apache.hadoop.hbase.spark.DefaultSourceStaticUtils$.getValue(DefaultSource.scala:920)
    at org.apache.hadoop.hbase.spark.HBaseRelation$$anonfun$14$$anonfun$apply$2.apply(DefaultSource.scala:279)
    at org.apache.hadoop.hbase.spark.HBaseRelation$$anonfun$14$$anonfun$apply$2.apply(DefaultSource.scala:278)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.hadoop.hbase.spark.HBaseRelation$$anonfun$14.apply(DefaultSource.scala:278)
    at org.apache.hadoop.hbase.spark.HBaseRelation$$anonfun$14.apply(DefaultSource.scala:277)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    at scala.collection.AbstractIterator.to(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
    at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
    at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$5.apply(SparkPlan.scala:212)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1888)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1888)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
17/10/06 11:09:10 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 4, localhost, executor driver): java.lang.IllegalArgumentException: offset (79) + length (8) exceed the capacity of the array: 80
    at org.apache.hadoop.hbase.util.Bytes.explainWrongLengthOrOffset(Bytes.java:629)
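From what I can tell, the exception means the connector tried to decode an 8-byte DOUBLE starting at offset 79 of an 80-byte value buffer, which would only happen if the stored bytes are not a full 8-byte double encoding. As a sanity check on that reading, here is a small plain-Java sketch (the decodeDouble helper is hypothetical, written to mimic the length check in HBase's Bytes.toDouble, not the connector's actual code):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DoubleDecodeCheck {

    // Mimics HBase's Bytes.toDouble contract: exactly 8 bytes must be
    // available at the given offset, otherwise IllegalArgumentException.
    static double decodeDouble(byte[] b, int offset) {
        if (offset + 8 > b.length) {
            throw new IllegalArgumentException(
                "offset (" + offset + ") + length (8) exceed the capacity of the array: " + b.length);
        }
        return ByteBuffer.wrap(b, offset, 8).getDouble();
    }

    public static void main(String[] args) {
        // A value written as a raw 8-byte double decodes fine.
        byte[] binary = ByteBuffer.allocate(8).putDouble(42.5).array();
        System.out.println(decodeDouble(binary, 0)); // 42.5

        // But a number stored as a string is only a few bytes long,
        // so decoding it as DOUBLE fails with the same kind of error.
        byte[] text = "42.5".getBytes(StandardCharsets.UTF_8);
        try {
            decodeDouble(text, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("Decode failed: " + e.getMessage());
        }
    }
}
```

So my suspicion is that rowCount/diskSize may have been written to HBase as strings (or some other non-8-byte encoding) rather than with Bytes.toBytes(double), and the DOUBLE mapping then over-reads the buffer.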

 

Appreciate any help.

 

Thanks,

Ravi Papisetti