Below is an example of using the Hive HBaseStorageHandler for HFile generation. This pattern takes data already stored in Hive, exports it as HFiles, and bulk loads the HBase table from those HFiles.
The HFile generation feature was added in HIVE-6473.
It adds the following properties, which the Hive HBaseStorageHandler then uses:
hive.hbase.generatehfiles - set to true to generate HFiles.
hfile.family.path - path in HDFS where the HFiles are written.
Note that for hfile.family.path, the final subdirectory MUST MATCH the column family name.
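As a minimal sketch of how the two properties might be set in a Hive session before running the insert that generates the HFiles: the path /tmp/hbase_hfiles/cf and the column family name cf are hypothetical and should be replaced with values that match your own HBase table.

```sql
-- Sketch only: the HDFS path and the column family name "cf" are assumptions.
-- Tell the HBaseStorageHandler to write HFiles.
SET hive.hbase.generatehfiles=true;

-- HDFS location for the generated HFiles.
-- The final subdirectory (cf) must match the HBase column family name.
SET hfile.family.path=/tmp/hbase_hfiles/cf;
```

With a column family named cf, a path ending in any other directory name would violate the rule above.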
The scripts in the repo called out above can be used with the Hortonworks Sandbox to test and demo this feature.
The following is an example of how to use this feature. The scripts in the repo above implement the steps below.
It is assumed that the user already has data stored in a Hive table. For the sake of this example, the following table was used.