Created 08-28-2017 08:04 AM
Created 08-28-2017 01:54 PM
Could you provide details on how you create the external table (DDL as file attachment would be good) and a few examples of the data values.
Created 08-28-2017 02:45 PM
Please share the Hbase table DDL, the hive table DDL and the query/process by which you are feeding data to hbase from hive.
Created 08-28-2017 04:58 PM
As previously said: providing more tech details would help.
Meanwhile, I believe that the issue you have is related to the way you create Hive table and its columns' types.
Take a look on this page: https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
CREATE TABLE hbase_table_1 (key int, value string, foobar double) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key,cf:val#s,cf:foo", "hbase.table.default.storage.type" = "binary" );
If your default storage type is string, then:
CREATE TABLE hbase_table_1 (key int, value string, foobar double) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ( "hbase.columns.mapping" = ":key#b,cf:val,cf:foo#b" );
"#b" means - the field is binary field, that will help with numbers
Created 08-29-2017 05:07 PM
@Gayathri Devi Instead of using the org.apache.hadoop.hive.hbase.HBaseStorageHandler, try using org.apache.phoenix.hive.PhoenixStorageHandler.
create external table nrd_app_spt.capa(hbid string,MATRIX_col string,MATRIX_VALUE_col double,COUNTRY_col string)
STORED BY 'org.apache.phoenix.hive.PhoenixStorageHandler'
TBLPROPERTIES ( "phoenix.table.name" = "nrd_app_spt.capa", "phoenix.zookeeper.quorum" = "localhost", "phoenix.zookeeper.znode.parent" = "/hbase", "phoenix.zookeeper.client.port" = "2181", "phoenix.rowkeys" = "hbid", "phoenix.column.mapping" = "hbid:hbid, MATRIX_col:MATRIX_col, MATRIX_VALUE_col:MATRIX_VALUE_col,COUNTRY_col:COUNTRY_col" );
Please update ZK Quorum, client port as per your cluster.
Created 08-30-2017 03:19 AM
You can try by using the NTILE function after some CTE or subquery to suit your requirement https://www.cloudera.com/documentation/enterprise/5-6-x/topics/impala_analytic_functions.html#ntile
Currently there is no direct percentile function as noted in this JIRA - https://issues.apache.org/jira/browse/IMPALA-3602
Created 08-30-2017 04:29 AM
As mentioned earlier, you can use the NTILE function.
Pls read the example of using Ntile on this page.
https://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_analytic_functions.html#ntile
To write your own UDF/UDAF for Impala you can refer this blog : https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_udf.html
Created 08-30-2017 01:54 PM
@Gayathri Devi, just to add to Dinesh's comment: you could use Hive UDF for percentile and adopt it for Impala. Hive percentile UDF: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/UDAFPercentile....