loading hbase table in pyspark throws "Expecting at least one region for table " error, while the table has regions

JB0000000000001 — Wed, 11 Aug 2021 14:25:10 GMT

Dear experts,
When I try to load data from an HBase table in PySpark, I get the following error:
java.io.IOException: Expecting at least one region for table : myhbasetable
    at org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase.getSplits(MultiTableInputFormatBase.java:195)
    at org.locationtech.geomesa.hbase.jobs.GeoMesaHBaseInputFormat.getSplits(GeoMesaHBaseInputFormat.scala:43)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:130)

It looks like it is telling me that the table must have at least one region.

This is the relevant piece of code:

https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java

    try (Connection conn = ConnectionFactory.createConnection(context.getConfiguration())) {
      while (iter.hasNext()) {
        Map.Entry<TableName, List<Scan>> entry = (Map.Entry<TableName, List<Scan>>) iter.next();
        TableName tableName = entry.getKey();
        List<Scan> scanList = entry.getValue();
        try (Table table = conn.getTable(tableName);
             RegionLocator regionLocator = conn.getRegionLocator(tableName)) {
          RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(
              regionLocator, conn.getAdmin());
          Pair<byte[][], byte[][]> keys = regionLocator.getStartEndKeys();
          for (Scan scan : scanList) {
            if (keys == null || keys.getFirst() == null || keys.getFirst().length == 0) {
              throw new IOException("Expecting at least one region for table : "
                  + tableName.getNameAsString());
            }
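As a reading aid for the snippet above, here is a minimal stand-alone sketch of what the guard actually tests. It uses no HBase classes: `startKeys` stands in for `keys.getFirst()`, and the class name, the `key` helper, and the split keys "g", "n", "t" are made up for illustration. It assumes, as HBase does, that the first region of a table has the empty byte array as its start key, so even an empty single-region table yields a start-key array of length 1.

```java
import java.nio.charset.StandardCharsets;

// Stand-alone illustration; no HBase classes are used.
// startKeys stands in for keys.getFirst() in the snippet above.
class RegionGuardSketch {

    // Same condition as the guard before the IOException, with the
    // "keys == null" case folded into "startKeys == null".
    static boolean wouldThrow(byte[][] startKeys) {
        return startKeys == null || startKeys.length == 0;
    }

    // Hypothetical helper for building example row keys.
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The region locator returned no regions at all.
        byte[][] noRegionsFound = new byte[0][];
        // One region: its start key is the empty byte array.
        byte[][] oneRegion = { new byte[0] };
        // Four regions, with hypothetical split keys.
        byte[][] fourRegions = { new byte[0], key("g"), key("n"), key("t") };

        System.out.println("no regions found: " + wouldThrow(noRegionsFound));
        System.out.println("one region:       " + wouldThrow(oneRegion));
        System.out.println("four regions:     " + wouldThrow(fourRegions));
    }
}
```

Only the first case trips the guard. So the exception does not depend on how much data the table holds. It means the region locator, using the connection settings the job was given, returned no regions for that table name.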

I can see in the HBase master that this table has data spread out over 4 regions, and in the HBase shell I can scan the data without any error. This is on HBase 2.1. It seems the input format is not finding the regions that exist for this table. I wonder what could cause this.

Has anyone ever encountered this error?

Re: loading hbase table in pyspark throws "Expecting at least one region for table " error, while the table has regions

JB0000000000001 — Thu, 27 Jan 2022 09:11:34 GMT

For future reference:
I am on an HBase cluster and also need access to the Hive metastore. It seems that this behavior can occur when hive-site.xml contains wrong values.
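The reply does not say which values were wrong, so the following is only a sketch of one way to look for candidates: list the HBase- and ZooKeeper-related properties in each *-site.xml on the job's classpath and compare them by eye. The class name and the name filter (any property whose name contains "hbase" or "zookeeper") are assumptions made for illustration. Keys such as hbase.zookeeper.quorum and zookeeper.znode.parent are real HBase settings, but whether they were the culprit here is not known.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: prints HBase/ZooKeeper-related properties from
// Hadoop-style *-site.xml files so they can be compared side by side.
class SiteXmlConnectionKeys {

    // Returns name -> value for properties whose name mentions hbase or zookeeper.
    static Map<String, String> connectionProperties(InputStream siteXml) {
        Map<String, String> found = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(siteXml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                String name = childText(property, "name");
                if (name != null && (name.contains("hbase") || name.contains("zookeeper"))) {
                    found.put(name, childText(property, "value"));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse site XML", e);
        }
        return found;
    }

    // Text of the first child element with the given tag, or null if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Usage: pass the paths of the files to inspect as arguments.
        for (String path : args) {
            System.out.println(path);
            try (InputStream in = new FileInputStream(path)) {
                connectionProperties(in).forEach(
                        (name, value) -> System.out.println("  " + name + " = " + value));
            }
        }
    }
}
```

Run it with the paths to hive-site.xml and hbase-site.xml as arguments. Values in hive-site.xml that differ from those in hbase-site.xml would be the first ones to check.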
