Support Questions
Find answers, ask questions, and share your expertise

HBase to Hive showing null values for decimal data type

 
7 REPLIES 7

Guru

Could you provide details on how you create the external table (DDL as file attachment would be good) and a few examples of the data values.

@Gayathri Devi

Please share the Hbase table DDL, the hive table DDL and the query/process by which you are feeding data to hbase from hive.

Expert Contributor

@Gayathri Devi

As previously said: providing more tech details would help.

Meanwhile, I believe that the issue you have is related to the way you create Hive table and its columns' types.

Take a look on this page: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration

CREATE TABLE hbase_table_1 (key int, value string, foobar double)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,cf:val#s,cf:foo",
"hbase.table.default.storage.type" = "binary"
);

If your default storage type is string, then:

CREATE TABLE hbase_table_1 (key int, value string, foobar double)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key#b,cf:val,cf:foo#b"
);

"#b" means - the field is binary field, that will help with numbers

@Gayathri Devi Instead of using the org.apache.hadoop.hive.hbase.HBaseStorageHandler, try using org.apache.phoenix.hive.PhoenixStorageHandler.

create external table nrd_app_spt.capa(hbid string,MATRIX_col string,MATRIX_VALUE_col double,COUNTRY_col string)

STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'

TBLPROPERTIES ( "phoenix.table.name" = "nrd_app_spt.capa", "phoenix.zookeeper.quorum" = "localhost", "phoenix.zookeeper.znode.parent" = "/hbase", "phoenix.zookeeper.client.port" = "2181", "phoenix.rowkeys" = "hbid", "phoenix.column.mapping" = "hbid:hbid, MATRIX_col:MATRIX_col, MATRIX_VALUE_col:MATRIX_VALUE_col,COUNTRY_col:COUNTRY_col" );

Please update ZK Quorum, client port as per your cluster.

@Gayathri Devi

You can try by using the NTILE function after some CTE or subquery to suit your requirement https://www.cloudera.com/documentation/enterprise/5-6-x/topics/impala_analytic_functions.html#ntile

Currently there is no direct percentile function as noted in this JIRA - https://issues.apache.org/jira/browse/IMPALA-3602

@Gayathri Devi

As mentioned earlier, you can use the NTILE function.

Pls read the example of using Ntile on this page.

https://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_analytic_functions.html#ntile

To write your own UDF/UDAF for Impala you can refer this blog : https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_udf.html

Expert Contributor

@Gayathri Devi, just to add to Dinesh's comment: you could use Hive UDF for percentile and adopt it for Impala. Hive percentile UDF: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDAFPercentile....

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.