We have written a Pig Java UDF that fetches one record at a time from HBase:
Configuration config = HBaseConfiguration.create();
// This instantiates an HTable object that connects you to
HTable table = new HTable(config, hbaseTable);
// Get the HBase row for the corresponding rowkey
Get hbaseRow = new Get(Bytes.toBytes(rowkey));
Result resultRow = table.get(hbaseRow);
However, when we run our UDF in Pig, it cannot find hbase-site.xml and looks for the ZooKeeper quorum on localhost instead of the one specified in the config file.
When we use the PiggyBank HBaseStorage, we don't have any problems. I have tried setting PIG_CLASSPATH, PIG_OPTS, etc., but that doesn't work.
I would appreciate your help!
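One way to narrow this down is to ask the class loader whether hbase-site.xml is visible at all from inside the JVM that runs the UDF. The sketch below uses only the JDK; the class name HBaseSiteLocator is hypothetical, and it assumes the HBase configuration is resolved as a classpath resource through the thread's context class loader, which is my understanding of how the Hadoop Configuration class works.

```java
import java.net.URL;

public class HBaseSiteLocator {

    /**
     * Returns the URL of the named resource as seen by the current
     * classpath, or null if the resource is not visible.
     */
    public static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that loaded this class.
            loader = HBaseSiteLocator.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL url = locate("hbase-site.xml");
        System.out.println(url == null
                ? "hbase-site.xml is NOT on the classpath"
                : "hbase-site.xml found at " + url);
    }
}
```

Calling locate("hbase-site.xml") from inside the UDF and logging the result would show what the task JVM sees, which can differ from the classpath of the machine that launches Pig.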
I had the same problem with my MR job responsible for dumping data from HBase. All you need is to set the hbase.zookeeper.quorum property somewhere. I think you can use job properties to specify it.
Hope it helps a little
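As a rough illustration of the workaround described above, the sketch below pulls hbase.zookeeper.quorum out of a set of job properties and fails fast when it is missing, rather than letting the client silently fall back to its default. It uses only the JDK; the class name QuorumFromProperties and the method requireQuorum are hypothetical, and how the properties reach the UDF is left open.

```java
import java.util.Properties;

public class QuorumFromProperties {

    static final String QUORUM_KEY = "hbase.zookeeper.quorum";

    /**
     * Returns the trimmed quorum string from the given properties.
     * Throws IllegalStateException if the property is missing or blank.
     */
    public static String requireQuorum(Properties jobProps) {
        String quorum = jobProps.getProperty(QUORUM_KEY);
        if (quorum == null || quorum.trim().isEmpty()) {
            throw new IllegalStateException(QUORUM_KEY + " is not set");
        }
        return quorum.trim();
    }
}
```

The returned value could then be applied with config.set("hbase.zookeeper.quorum", quorum) on the Configuration object from the original post, before the HTable is created.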
Thanks for the tip!
This indeed works, but the solution we are looking for is to have hbase-site.xml picked up automatically in our different environments.
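Until the file is picked up automatically, one stopgap might be to read the quorum from an hbase-site.xml at a location you control per environment. The sketch below is only an illustration under that assumption: it parses a Hadoop-style site file with the JDK's XML parser, and the class name HBaseSiteReader is hypothetical.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class HBaseSiteReader {

    /**
     * Returns the value of the named property from a Hadoop-style site file
     * (configuration/property/name+value), or null if it is absent.
     */
    public static String readProperty(InputStream siteXml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(siteXml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && name.equals(names.item(0).getTextContent().trim())) {
                    return values.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not parse site file", e);
        }
    }
}
```

Opening the per-environment file as a stream and calling readProperty(stream, "hbase.zookeeper.quorum") yields a value that can be set on the configuration explicitly. Hadoop's Configuration also has an addResource method that accepts a path to a site file, which may be the simpler route if the file location is known.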
By the way, I submitted a bug to the HBase JIRA:
Let's hope we can fix this quickly.
I got a reply on JIRA that hbase-site.xml is read from HBASE_CLASSPATH. I have set this environment variable in hbase-env.sh, but it has not worked yet.
"hbase-site.xml is read from the HBASE_CLASSPATH.