09-09-2016 08:59 AM
I know that technically speaking updates in HBase do not happen, but is there a way to change the row key of certain rows without modifying the values for that row? I am trying to find the best way to perform a get, modify the row key, and then put the row back into place with the modified row key. It would also be nice if the timestamp could stay the same as the original . . . Does anyone have any examples of how to perform something like this?
09-11-2016 04:06 AM
12-11-2017 11:00 AM
I have a relatively easy solution to this problem. I created a PairRDD of the rows that I wanted to update. Then, for every row, I created a Delete and a Put object, so the old record is deleted and a new one is inserted. The only thing to take care of is that the Put object must include the new row key. Then just call saveAsNewAPIHadoopDataset on the new RDD.
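For a single row, the same delete-and-put pair can be sketched with the plain HBase Java client instead of Spark. This is only a minimal sketch under stated assumptions: the class RowKeyChanger and its two methods are hypothetical names, not part of HBase, and the two writes are not atomic across rows. It copies each cell with its original timestamp, which is what the original question asked for. A plain Get returns only the newest version of each column, so older versions would need to be requested explicitly on the Get.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;

/** Hypothetical helper (not part of HBase): re-keys one row by copy, then delete. */
public final class RowKeyChanger {

    private RowKeyChanger() {}

    /**
     * Builds a Put for newRowKey holding every cell of the given Result.
     * Each cell keeps its original family, qualifier, timestamp, and value.
     */
    public static Put copyToNewKey(Result result, byte[] newRowKey) {
        if (result.isEmpty()) {
            throw new IllegalArgumentException("nothing to copy: source row is empty");
        }
        Put put = new Put(newRowKey);
        for (Cell cell : result.rawCells()) {
            put.addColumn(
                CellUtil.cloneFamily(cell),
                CellUtil.cloneQualifier(cell),
                cell.getTimestamp(),          // preserve the original timestamp
                CellUtil.cloneValue(cell));
        }
        return put;
    }

    /**
     * Reads the old row, writes it under the new key, then deletes the old row.
     * Returns false if the old row does not exist. The put happens before the
     * delete, so a failure in between leaves a duplicate rather than losing data.
     */
    public static boolean changeRowKey(Table table, byte[] oldRowKey, byte[] newRowKey)
            throws IOException {
        Result result = table.get(new Get(oldRowKey)); // newest version of each column only
        if (result.isEmpty()) {
            return false;
        }
        table.put(copyToNewKey(result, newRowKey));
        table.delete(new Delete(oldRowKey));
        return true;
    }
}
```

In the Spark approach, copyToNewKey and new Delete(oldRowKey) would be the two mutations produced per row before calling saveAsNewAPIHadoopDataset.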
06-12-2018 08:37 AM
In our case, it was a matter of updating the rowkey using data that was in another row. So essentially, we grabbed the data from the "good" row, and saved it to a variable. Next, we did an HBase put using that variable like so:
Get get = new Get(Bytes.toBytes(currentRowkey));
Result result = table.get(get);
. . .
Put dataPut = createPut(Bytes.toBytes(correctedRowkey), hbaseColFamilyFile, hbaseColQualifierFile, result);
Status dataStatus = checkAndPut(currentRowkey, correctedRowkey, hbaseColFamilyFile, hbaseColQualifierFile, table, dataPut, "data");
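If you'd like to get fancy, you can also do a checkAndPut like so. Passing null as the expected value means the put is applied only if that column does not already exist in the target row: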
table.checkAndPut(Bytes.toBytes(correctedRowkey), hbasecolfamily, hbasecolqualifier, null, put);