Why does HBase store Base64-encoded data when data is inserted from a Hive table through the SerDe?
Labels: Apache HBase, Apache Hive
Created 05-28-2024 09:45 AM
Hello there!
We created a Hive external table that refers to an HBase table:
CREATE EXTERNAL TABLE default.c_COUNTRIES_AP
( key BINARY,
CF1_COUNTRY_ID_1 BINARY,
CF1_COUNTRY_NAME_2 BINARY,
CF1_COUNTRY_REGION_3 BINARY,
CF1_COUNTRY_SUBREGION_4 BINARY )
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,CF1:COUNTRY_ID#s,CF1:COUNTRY_NAME#s,CF1:COUNTRY_REGION#s,CF1:COUNTRY_SUBREGION#s"
)
TBLPROPERTIES ("hbase.table.name" = "COUNTRIES_TGT");
In Hive, we inserted some data into the external table:
insert into c_countries_ap values ('ZA', 'ZA', 'South Africa', 'Africa', 'Africa');
Running select * from c_countries_ap in Hive shows that the data was inserted correctly.
But when we go to the HBase shell and run scan 'COUNTRIES_TGT', it shows:
hbase:063:0> scan 'COUNTRIES_TGT'
ROW COLUMN+CELL
WkE= column=CF1:COUNTRY_ID, timestamp=1716416439777, value=WkE=
WkE= column=CF1:COUNTRY_NAME, timestamp=1716416439777, value=U291dGggQWZyaWNh
WkE= column=CF1:COUNTRY_REGION, timestamp=1716416439777, value=QWZyaWNh
WkE= column=CF1:COUNTRY_SUBREGION, timestamp=1716416439777, value=QWZyaWNh
All of the data in HBase, including the row key, is Base64 encoded.
If we do a Sqoop import into the HBase table, the data is not encoded.
The encoding only happens when data is inserted from the Hive table through the SerDe.
Are there any configuration parameters that need to be changed?
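One way to confirm that the stored values are just the Base64 form of the inserted strings is to decode them in Hive. This is a minimal sketch, assuming the built-in unbase64 and decode functions are available in your Hive version; the literals are copied from the scan output above, and the column aliases are only illustrative.

```sql
-- Decode the Base64 values seen in the HBase scan back to text.
-- unbase64() returns BINARY; decode() converts it to a string in the given charset.
SELECT
  decode(unbase64('WkE='), 'UTF-8')             AS country_id,
  decode(unbase64('U291dGggQWZyaWNh'), 'UTF-8') AS country_name,
  decode(unbase64('QWZyaWNh'), 'UTF-8')         AS country_region;
```

The three values decode to ZA, South Africa and Africa, which matches the inserted row.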
Created 05-28-2024 11:06 AM
@Marks_08 The HBase SerDe performs this encoding for binary data. Could you change the column data type to STRING if your data does not actually require binary storage?
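The suggested change could look like the following. This is a sketch, assuming the same Hive table name, HBase table, and column mapping as in the question, with only the Hive column types switched from BINARY to STRING; an existing Hive definition with the same name would have to be dropped first.

```sql
-- Same mapping as in the question; only the Hive column types change.
CREATE EXTERNAL TABLE default.c_COUNTRIES_AP
( key STRING,
  CF1_COUNTRY_ID_1 STRING,
  CF1_COUNTRY_NAME_2 STRING,
  CF1_COUNTRY_REGION_3 STRING,
  CF1_COUNTRY_SUBREGION_4 STRING )
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,CF1:COUNTRY_ID#s,CF1:COUNTRY_NAME#s,CF1:COUNTRY_REGION#s,CF1:COUNTRY_SUBREGION#s"
)
TBLPROPERTIES ("hbase.table.name" = "COUNTRIES_TGT");
```

Rows already written through the BINARY definition stay Base64 encoded in HBase until they are rewritten.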
Created 05-28-2024 12:55 PM
Thanks @smruti
I changed the data type to STRING, and the data is now stored without encoding, as I wanted.
Is there any documentation that explains why the HBase SerDe performs this encoding for binary data?
Regards,
Marks