Loading an HBase table in PySpark throws "Expecting at least one region for table" error, although the table has regions


Dear experts,
When I try to load HBase data in PySpark, I get the following error:
java.io.IOException: Expecting at least one region for table : myhbasetable
    at org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase.getSplits(MultiTableInputFormatBase.java:195)
    at org.locationtech.geomesa.hbase.jobs.GeoMesaHBaseInputFormat.getSplits(GeoMesaHBaseInputFormat.scala:43)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:130)

 

 

It looks like it is telling me that the table must have data in at least one region.

This is the relevant piece of code:

https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/ma...

 

 

 

try (Connection conn = ConnectionFactory.createConnection(context.getConfiguration())) {
      while (iter.hasNext()) {
        Map.Entry<TableName, List<Scan>> entry = (Map.Entry<TableName, List<Scan>>) iter.next();
        TableName tableName = entry.getKey();
        List<Scan> scanList = entry.getValue();
        try (Table table = conn.getTable(tableName);
             RegionLocator regionLocator = conn.getRegionLocator(tableName)) {
          RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(
              regionLocator, conn.getAdmin());
          Pair<byte[][], byte[][]> keys = regionLocator.getStartEndKeys();
          for (Scan scan : scanList) {
            if (keys == null || keys.getFirst() == null || keys.getFirst().length == 0) {
              throw new IOException("Expecting at least one region for table : "
                  + tableName.getNameAsString());
            }

I can see in the HBase Master UI that this table has data spread over 4 regions, and in the HBase shell I can scan the data without errors. This is on HBase 2.1. It seems the job is not finding the regions that exist for this table. I wonder what could cause this.
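To make the check in the HBase excerpt above concrete, here is a minimal Python model of its condition. It is only an illustration, not HBase code: `has_no_regions` and the sample key lists are hypothetical names and values, and the tuple stands in for the `Pair<byte[][], byte[][]>` returned by `getStartEndKeys()`. It suggests that the exception depends on the region locator returning no start keys at all, rather than on how much data the table holds, since even an empty, unsplit table has one region with an empty start key.

```python
from typing import List, Optional, Tuple

# Stand-in for Pair<byte[][], byte[][]>: (region start keys, region end keys).
StartEndKeys = Tuple[Optional[List[bytes]], Optional[List[bytes]]]


def has_no_regions(keys: Optional[StartEndKeys]) -> bool:
    """Model of the Java condition:
    keys == null || keys.getFirst() == null || keys.getFirst().length == 0
    """
    return keys is None or keys[0] is None or len(keys[0]) == 0


# Hypothetical sample values for illustration only.
# An empty, unsplit table: one region whose start and end keys are both empty.
empty_table: StartEndKeys = ([b""], [b""])
# A table split into four regions.
four_regions: StartEndKeys = ([b"", b"g", b"n", b"t"], [b"g", b"n", b"t", b""])
# What the client sees when it cannot locate any region for the table.
no_regions_found: StartEndKeys = ([], [])

for label, keys in [
    ("empty table", empty_table),
    ("four regions", four_regions),
    ("no regions found", no_regions_found),
]:
    print(label, "-> would raise" if has_no_regions(keys) else "-> passes the check")
```

Under this reading, a table with 4 visible regions should pass the check, so the error points toward the job resolving regions through a different connection configuration than the HBase shell uses.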


Has anyone ever encountered this error?

1 ACCEPTED SOLUTION


For future reference:
I am on an HBase cluster and also need access to the Hive metastore. It seems that if hive-site.xml contains wrong values, you can get this behavior.
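One way to look for such wrong values is to compare the connection-related properties in hive-site.xml against those in hbase-site.xml. The sketch below is an assumption about what to check, not the confirmed cause: the answer does not say which values were wrong, so the property names chosen here (`hbase.zookeeper.quorum`, `hbase.zookeeper.property.clientPort`, `zookeeper.znode.parent`) are simply the usual HBase client connection settings, and the host names and helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Tuple


def read_properties(xml_text: str) -> Dict[str, str]:
    """Parse a Hadoop-style <configuration> document into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props: Dict[str, str] = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def conflicting(reference: Dict[str, str], other: Dict[str, str],
                keys: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Return the keys set in both configs with different values."""
    return {
        k: (reference[k], other[k])
        for k in keys
        if k in reference and k in other and reference[k] != other[k]
    }


# Assumed list of properties worth comparing (standard HBase client settings).
KEYS_TO_CHECK = [
    "hbase.zookeeper.quorum",
    "hbase.zookeeper.property.clientPort",
    "zookeeper.znode.parent",
]

# Hypothetical file contents; in practice, read the real files from disk.
HBASE_SITE = """<configuration>
  <property><name>hbase.zookeeper.quorum</name><value>zk1.example.com</value></property>
  <property><name>zookeeper.znode.parent</name><value>/hbase</value></property>
</configuration>"""

HIVE_SITE = """<configuration>
  <property><name>hbase.zookeeper.quorum</name><value>old-zk.example.com</value></property>
  <property><name>zookeeper.znode.parent</name><value>/hbase</value></property>
</configuration>"""

diff = conflicting(read_properties(HBASE_SITE), read_properties(HIVE_SITE), KEYS_TO_CHECK)
for key, (hbase_value, hive_value) in diff.items():
    print(f"{key}: hbase-site.xml={hbase_value!r} hive-site.xml={hive_value!r}")
```

To run this against real files, replace the sample strings with the contents of the actual hbase-site.xml and hive-site.xml that the Spark job picks up.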
