How to load hbase bytes formatted data into PIG?

Hello everyone,

I have a table in HBase with more than 1000 rows.

I am trying to load the HBase table in Pig, but I found that the row keys are stored as bytes.

When I loaded the table using the HBaseStorage package and stored the output in a CSV file, the file contained a lot of binary data.

How can I convert the HBase binary data so that it can be stored in a CSV file?

Please advise.

Re: How to load hbase bytes formatted data into PIG?

Hi @Mohan V,

If the columns hold non-text values, you need to specify the -caster option with HBaseBinaryConverter (the default is Utf8StorageConverter) and map each column to its actual type, so that Pig casts the values properly before serializing them as text.

a = load 'hbase://TESTTABLE_1' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('TESTCOLUMN_A TESTCOLUMN_B TESTCOLUMN_C ','-loadKey -caster HBaseBinaryConverter') as (rowKey:chararray,col_a:int, col_b:double, col_c:chararray);

Re: How to load hbase bytes formatted data into PIG?

I tried what you suggested, Ankit Singhal, but I am still getting the same issue.

a = load 'hbase://tablename' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cd','-loadKey -caster HBaseBinaryConverter') as (rowKey:chararray,cd:map[]);

Output:

(���,[sicd#Commercial Nonphysical Research,parent_cid#ؗ,parent_cn#Algorithmics (UK) Ltd.,emp#,industry#Corporate Services,sic#8732,equifaxId#,subIndustry#Market Research Services,revenue#,cct#London,street#101 Finsbury Pavement,ic#12,state#England,fax#44 20 7862 4008,ultimate_pcoId# �,parent_ccnt#United Kingdom,zip#ec2a 1rs,cnt#United Kingdom,ultimate_pconame#International Business Machines Corp.,subsidiary#�,ultimate_pcocn#United States,cs#Subsidiary,ct#Private,rc#USD,phone#+44 20 7862 4000,naicsDescription#Marketing Research and Public Opinion Polling,name#Algorithmics Risk Management Limited,naics#541910,fd#2002])

Please advise.
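
A possible explanation, offered as a sketch rather than a confirmed fix: the caster can only convert values whose types are declared in the schema. In the load above, the row key is declared as chararray and the whole column family is loaded as an untyped map, so any values that were written as binary numbers are still rendered as raw bytes. Listing the needed columns individually and giving each one its real type lets HBaseBinaryConverter do the conversion. The qualifiers below (name, sic, parent_cid) are taken from the output in this thread, but the long types for the row key and parent_cid are assumptions, as is the output path; they must match how the data was actually written to HBase.

```pig
-- Sketch only: the types of rowKey and parent_cid are guesses and must match
-- the encoding used when the table was written (e.g. an 8-byte long).
a = load 'hbase://tablename' using org.apache.pig.backend.hadoop.hbase.HBaseStorage('cd:name cd:sic cd:parent_cid', '-loadKey -caster HBaseBinaryConverter') as (rowKey:long, name:chararray, sic:chararray, parent_cid:long);

-- Write comma-separated text; '/tmp/tablename_csv' is a hypothetical output path.
store a into '/tmp/tablename_csv' using PigStorage(',');
```

If a declared type does not match the stored bytes, the value will likely come back null or wrong, so it is worth checking a few rows with dump before storing the full table.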