HBase stores base64 data when data is inserted from Hive table through SerDe, Why?


Hello there!

We created a Hive external table that refers to an HBase table:

CREATE EXTERNAL TABLE default.c_COUNTRIES_AP (
    key                      BINARY,
    CF1_COUNTRY_ID_1         BINARY,
    CF1_COUNTRY_NAME_2       BINARY,
    CF1_COUNTRY_REGION_3     BINARY,
    CF1_COUNTRY_SUBREGION_4  BINARY
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
    "hbase.columns.mapping" = ":key,CF1:COUNTRY_ID#s,CF1:COUNTRY_NAME#s,CF1:COUNTRY_REGION#s,CF1:COUNTRY_SUBREGION#s"
)
TBLPROPERTIES ("hbase.table.name" = "COUNTRIES_TGT")

In Hive, we insert a row into the external table:
insert into c_countries_ap values ('ZA', 'ZA', 'South Africa', 'Africa', 'Africa');

Running select * from c_countries_ap in Hive returns the row exactly as inserted.

But when we go to the HBase shell and run scan 'COUNTRIES_TGT', it shows:

hbase:063:0> scan 'COUNTRIES_TGT'
ROW    COLUMN+CELL
WkE=   column=CF1:COUNTRY_ID, timestamp=1716416439777, value=WkE=
WkE=   column=CF1:COUNTRY_NAME, timestamp=1716416439777, value=U291dGggQWZyaWNh
WkE=   column=CF1:COUNTRY_REGION, timestamp=1716416439777, value=QWZyaWNh
WkE=   column=CF1:COUNTRY_SUBREGION, timestamp=1716416439777, value=QWZyaWNh

The data in HBase, both the row key and every cell value, is Base64-encoded.
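As a quick sanity check that these values really are just the Base64 form of the inserted strings, here is a minimal Python sketch. The encoded values are copied from the scan output above; the dictionary and variable names are only illustrative and are not part of Hive or HBase.

```python
import base64

# Values copied verbatim from the HBase shell scan output above.
# The dictionary keys are illustrative labels, not an HBase API.
scanned = {
    "row key": "WkE=",
    "CF1:COUNTRY_ID": "WkE=",
    "CF1:COUNTRY_NAME": "U291dGggQWZyaWNh",
    "CF1:COUNTRY_REGION": "QWZyaWNh",
    "CF1:COUNTRY_SUBREGION": "QWZyaWNh",
}

# Decode each Base64 string back to the original UTF-8 text.
decoded = {
    name: base64.b64decode(value).decode("utf-8")
    for name, value in scanned.items()
}

for name, text in decoded.items():
    print(f"{name}: {text}")
```

The decoded values match the row inserted from Hive ('ZA', 'ZA', 'South Africa', 'Africa', 'Africa'), so the data is intact and only its stored representation differs.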

If we load the HBase table with a Sqoop import instead, the data is stored without encoding.

The encoding only happens when data is inserted from the Hive table through the SerDe.

Are there any configuration parameters that need to be changed?

 

1 ACCEPTED SOLUTION


@Marks_08 The HBase SerDe performs this encoding for columns declared as BINARY. If your data does not really require binary storage, could you change the column data types to STRING?

Thanks @smruti 
I changed the data types to STRING, and the data is now stored without encoding, as I wanted.

Is there any documentation that explains why the HBase SerDe applies this encoding to binary data?

regards,
Marks