Reply
Highlighted
Explorer
Posts: 14
Registered: ‎01-26-2015
Accepted Solution

Impala is not able to query hive table over HBase with binary column.

[ Edited ]

Issue is pretty simple and I am not sure if there is a simple thing that I seem to be missing something pretty basic. CDH5.12 is being used.

 

1. Create Hive Table over HBase by specifying the numerical columns as binary.

2. Ingest data into the table through any means.

3. Query the table through hive and results are fine.

4. Impala gives an error.

 

1. Hive Table Created over HBase with following syntax:

 

hive -e "CREATE TABLE WeatherData(\
                key  STRING, \
                wsid STRING, \
                year INT,\
                month INT,\
                day INT,\
                hour INT,\
                temperature DOUBLE,\
                dewpoint DOUBLE,\
                pressure DOUBLE,\
                windDirection INT,\
                windSpeed DOUBLE,\
                skyCondition INT,\
                skyConditionText STRING, \
                oneHourPrecip DOUBLE, \
                sixHourPrecip DOUBLE) \
        STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\
        WITH SERDEPROPERTIES (\"hbase.columns.mapping\" = \":key, weather:wsid, weather:year#b, weather:month#b, weather:day#b, weather:hour#b, weather:temperature#b, weather:dewpoint#b, weather:pressure#b, weather:windDirection#b, weather:windSpeed#b, weather:skyCondition#b, weather:skyConditionText, weather:oneHourPrecip#b, weather:sixHourPrecip#b \")\
        TBLPROPERTIES (\"hbase.table.name\" = \"WeatherData\", \"hbase.mapred.output.outputtable\" = \"WeatherData\");"

 

2. Ingest data into the HBase table through any preferred means.

 

3. Results of query through Hive are as below:

 

hive> SELECT * FROM default.WeatherData WHERE key="002020ed-8496-4780-8ae2-fdc820d0e4e0";
Query ID = cloudera_20180512144646_baded4e1-8a5c-4fb4-bd47-8293fc910b44

Total MapReduce CPU Time Spent: 8 seconds 10 msec
OK
002020ed-8496-4780-8ae2-fdc820d0e4e0    725030:14732    2008    11    17    12    3.9    -4.4    1016.9    270    5.7    0    0.0    0.0    NULL
Time taken: 24.62 seconds, Fetched: 1 row(s)

 

4. Same query using Impala gives the below issue.

 

[quickstart.cloudera:21000] > SELECT * FROM default.WeatherData WHERE key="002020ed-8496-4780-8ae2-fdc820d0e4e0";
Query: select * FROM default.WeatherData WHERE key="002020ed-8496-4780-8ae2-fdc820d0e4e0"
Query submitted at: 2018-05-12 14:47:47 (Coordinator: http://quickstart.cloudera:25000)
ERROR: AnalysisException: Failed to load metadata for table: 'default.WeatherData'
CAUSED BY: TableLoadingException: Failed to load metadata for HBase table: weatherdata
CAUSED BY: SerDeException: Error: A column storage specification is one of the following: '-', a prefix of 'string', or a prefix of 'binary'. b  is not a valid storage option specification for sixhourprecip

 

 

Looking into the Impala code, it seems that binary data type is not supported as column storage type. See link: https://github.com/cloudera/Impala/blob/0c713cf67959b9633d2fe6f5c21af218a43e4214/fe/src/main/java/or...

 

How are others dealing with this requirement? 

New Contributor
Posts: 4
Registered: ‎01-16-2018

Re: Impala is not able to query hive table over HBase with binary column.

Hi,

 

I suppose the issue is that you have an additional space character in your query - right after "weather:sixHourPrecip#b"

Explorer
Posts: 14
Registered: ‎01-26-2015

Re: Impala is not able to query hive table over HBase with binary column.

Vilyam

 

You were spot ON. This issue has been fixed. Excellent pair of eyes. I am getting my eyes inspected again :).

 

Thanks,

Kabeer.

Announcements