We recently upgraded from HDP 2.4.2 to 2.6.1, which took us from Phoenix 4.4.X to 4.7.X.
We have a regular process for bulk-loading data from Hive to Phoenix.
When I upgraded the Phoenix driver on the client machine that runs this process, I ran into an issue:
HTable.getTableDescriptor() throws an exception with a long stack trace, and the process stops.
The error points to a “Table” that was really an index that we dropped a while ago.
So I have table X with index Y, and when I try to bulk import to table X, I get an error that table Y does not exist.
This is true – it does not exist. So why is the bulk-load process concerned with it?
If I revert to the older driver file, I get a similar error, but the import process then proceeds.
I don’t NEED to use the new driver. But I was hoping that I would see some performance improvement.
Is it worth it to wrestle with this?
Can someone show me how to fix it? I’m sure it’s just a simple query against the system table(s)…
For now, I have replaced the new client jar with the old one, so I'll have to wait for some cluster downtime to reproduce the exact issue.
But here's the stack trace I get with the older client jar, and I believe it's pretty much the same.
For context, I'm attempting a bulk load to a table, and ER_V13_SPLIT_511_GZ_V2_METERKEY USED TO BE an index on that table. It no longer exists.
17/07/10 00:30:28 ERROR mapreduce.CsvBulkLoadTool: Import job on table=ER_V13_SPLIT_511_GZ_V2_METERKEY failed due to exception.
org.apache.hadoop.hbase.TableNotFoundException: ER_V13_SPLIT_511_GZ_V2_METERKEY
    at org.apache.hadoop.hbase.client.HTable.getTableDescriptor(HTable.java:597)
    at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat.configureIncrementalLoad(HFileOutputFormat.java:91)
    at org.apache.phoenix.mapreduce.CsvBulkLoadTool$TableLoader.call(CsvBulkLoadTool.java:512)
    at org.apache.phoenix.mapreduce.CsvBulkLoadTool$TableLoader.call(CsvBulkLoadTool.java:473)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask.run(JobManager.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Have you confirmed that "ER_V13_SPLIT_511_GZ_V2_METERKEY" doesn't appear when you run !tables in sqlline? (You can also query SYSTEM.CATALOG to check whether any link to this index still exists.)
As a workaround, you can try creating "ER_V13_SPLIT_511_GZ_V2_METERKEY" from the HBase shell, using the same table descriptor as the data table but with a different table name.
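If it helps, here is a minimal sketch of the SYSTEM.CATALOG check over plain JDBC. It assumes the Phoenix 4.x catalog layout, where both a table's own header row and a data table's link row to an index have a NULL COLUMN_NAME, and where the link row carries the index name in COLUMN_FAMILY. The class name StaleIndexCheck and the classify helper are made up for illustration, and the JDBC URL is whatever you already pass to sqlline.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StaleIndexCheck {

    // Assumption (Phoenix 4.x layout): header rows and parent-to-index link rows
    // both have COLUMN_NAME IS NULL; a link row stores the index name in COLUMN_FAMILY.
    static final String QUERY =
        "SELECT TABLE_SCHEM, TABLE_NAME, COLUMN_FAMILY, TABLE_TYPE, LINK_TYPE "
        + "FROM SYSTEM.CATALOG "
        + "WHERE COLUMN_NAME IS NULL AND (TABLE_NAME = ? OR COLUMN_FAMILY = ?)";

    // Describes how one catalog row relates to the name being searched for.
    static String classify(String name, String tableName, String columnFamily) {
        if (name.equals(columnFamily)) {
            return "link row: table " + tableName + " still references " + name;
        }
        if (name.equals(tableName)) {
            return "header row: " + name + " is still defined in the catalog";
        }
        return "unrelated row";
    }

    public static void main(String[] args) throws SQLException {
        String jdbcUrl = args[0]; // e.g. jdbc:phoenix:<zookeeper-quorum>
        String name = args[1];    // e.g. ER_V13_SPLIT_511_GZ_V2_METERKEY
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setString(1, name);
            ps.setString(2, name);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    String verdict = classify(name,
                        rs.getString("TABLE_NAME"), rs.getString("COLUMN_FAMILY"));
                    System.out.println(verdict
                        + " (TABLE_TYPE=" + rs.getString("TABLE_TYPE")
                        + ", LINK_TYPE=" + rs.getObject("LINK_TYPE") + ")");
                }
            }
        }
    }
}
```

Run it with the Phoenix client jar on the classpath. If it prints a link row for the data table while !tables no longer lists the index, that leftover link would explain why the bulk load still tries to open the dropped index. No output at all would suggest the catalog is clean and the stale reference lives somewhere else, such as a client-side cache.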