Dear experts,
I notice that when I try to load HBase data in PySpark, I get the following error:
java.io.IOException: Expecting at least one region for table : myhbasetable
    at org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase.getSplits(MultiTableInputFormatBase.java:195)
    at org.locationtech.geomesa.hbase.jobs.GeoMesaHBaseInputFormat.getSplits(GeoMesaHBaseInputFormat.scala:43)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:130)
It looks like it is telling me that the table must have at least one region.
This is the relevant piece of code:
https://github.com/apache/hbase/blob/master/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/ma...
try (Connection conn = ConnectionFactory.createConnection(context.getConfiguration())) {
  while (iter.hasNext()) {
    Map.Entry<TableName, List<Scan>> entry = (Map.Entry<TableName, List<Scan>>) iter.next();
    TableName tableName = entry.getKey();
    List<Scan> scanList = entry.getValue();
    try (Table table = conn.getTable(tableName);
         RegionLocator regionLocator = conn.getRegionLocator(tableName)) {
      RegionSizeCalculator sizeCalculator = new RegionSizeCalculator(
          regionLocator, conn.getAdmin());
      Pair<byte[][], byte[][]> keys = regionLocator.getStartEndKeys();
      for (Scan scan : scanList) {
        if (keys == null || keys.getFirst() == null || keys.getFirst().length == 0) {
          throw new IOException("Expecting at least one region for table : "
              + tableName.getNameAsString());
        }
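To convince myself what that condition actually tests, here is a minimal stand-alone sketch of it. It uses no HBase classes: the class and method names (RegionCheckSketch, wouldThrow, check) are my own, the byte[][] argument stands in for keys.getFirst() (with the keys == null case folded into the same null check), and the sample start keys are made up. As far as I understand, even an empty table has one region whose start key is an empty byte array, so the check is about the number of regions the client sees, not about how much data they hold.

```java
import java.io.IOException;

// Stand-alone illustration of the quoted check; no HBase classes involved.
class RegionCheckSketch {

    // True when the array of region start keys is missing or has no entries,
    // which is the situation in which the quoted code throws.
    static boolean wouldThrow(byte[][] startKeys) {
        return startKeys == null || startKeys.length == 0;
    }

    // Same message as in the quoted source, built from a plain table name.
    static void check(String tableName, byte[][] startKeys) throws IOException {
        if (wouldThrow(startKeys)) {
            throw new IOException("Expecting at least one region for table : " + tableName);
        }
    }

    public static void main(String[] args) {
        // No regions visible to the client at all.
        byte[][] noRegions = new byte[0][];
        // One region with an empty start key (assumed shape of an empty table).
        byte[][] oneRegion = { new byte[0] };
        // Four regions with made-up split points.
        byte[][] fourRegions = {
            new byte[0], "g".getBytes(), "n".getBytes(), "t".getBytes()
        };

        System.out.println(wouldThrow(noRegions));   // true
        System.out.println(wouldThrow(oneRegion));   // false
        System.out.println(wouldThrow(fourRegions)); // false
    }
}
```

If this reading is right, the exception means the region locator used by the job returned zero start keys for the table, which would point at what the job's connection sees rather than at the table contents.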
I can see in the HBase Master UI that this table has data spread out over 4 regions, and in the HBase shell I can scan the data with no error. This is on HBase 2.1. It seems the input format is not finding the regions for this table. I wonder what could cause this.
Has anyone ever encountered this error?