Scenario: Two Hive tables, holding data from entirely different systems, have been created and loaded. We need to merge the two tables into a single table with a full outer join, which will obviously produce null values. The merged table was planned to have a partition column, but now that nulls come into the picture, we are thinking of dropping the partition column.
Now the real question: how do we integrate this merged Hive table with HBase? The table has nulls, HBase doesn't understand null, and it's not possible to retrieve null data (since it's a full join, even the primary key column contains nulls). How can this Hive table be integrated with an HBase table? All columns and rows must be preserved, and rows with null values must not be filtered out. Is there any way to achieve this, or should we be thinking of some other alternative? Any out-of-the-box alternatives are most welcome.
I think it should work just like you would expect a normal full outer join to work. When there is no match, you get all null values for the HBase table's columns, and when there is a match, you get the values returned. See the following link (it just covers a general full outer join). I think I am still missing why you think this will not work just because HBase does not have a concept of null. You are joining on the row key, right?
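To make the point about joining on the row key concrete, here is a minimal Python sketch. It is not Hive or HBase code: the table contents, column names and helper functions are all made up for illustration. It assumes each source table has a non-null key of its own, so every row of the full outer join has a key on at least one side, and the two can be coalesced into a single row key. It also models a null column as simply an absent cell that reads back as null, which is my understanding of how a sparse store like HBase behaves.

```python
def full_outer_join(left: dict, right: dict) -> list:
    """Full outer join of two {key: value} tables on their keys."""
    rows = []
    for key in sorted(set(left) | set(right)):
        rows.append({
            "left_id": key if key in left else None,    # null when only the right side matched
            "left_val": left.get(key),
            "right_id": key if key in right else None,  # null when only the left side matched
            "right_val": right.get(key),
        })
    return rows


def to_hbase_row(row: dict):
    """Return (row_key, cells): coalesce the two ids into one key and skip null cells."""
    row_key = row["left_id"] if row["left_id"] is not None else row["right_id"]
    cells = {col: val for col, val in row.items() if val is not None}
    return row_key, cells


def read_back(cells: dict, columns: list) -> dict:
    """Reading a column that has no stored cell yields None (null)."""
    return {col: cells.get(col) for col in columns}


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two source systems.
    system_a = {"a": "A-only", "b": "A-both"}
    system_b = {"b": "B-both", "c": "B-only"}
    for joined in full_outer_join(system_a, system_b):
        key, cells = to_hbase_row(joined)
        print(key, read_back(cells, list(joined)))
```

In Hive, the analogous step would presumably be a COALESCE over the two key columns in the join query, with the result mapped to the HBase row key. I have not tested that against your tables, so treat it as a suggestion rather than a recipe.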