05-12-2018 08:30 AM - last edited on 05-13-2018 06:34 AM by cjervis
Issue is pretty simple and I am not sure if there is a simple thing that I seem to be missing something pretty basic. CDH5.12 is being used.
1. Create Hive Table over HBase by specifying the numerical columns as binary.
2. Ingest data into the table through any means.
3. Query the table through hive and results are fine.
4. Impala gives an error.
1. Hive Table Created over HBase with following syntax:
hive -e "CREATE TABLE WeatherData(\
key STRING, \
wsid STRING, \
skyConditionText STRING, \
oneHourPrecip DOUBLE, \
sixHourPrecip DOUBLE) \
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\
WITH SERDEPROPERTIES (\"hbase.columns.mapping\" = \":key, weather:wsid, weather:year#b, weather:month#b, weather:day#b, weather:hour#b, weather:temperature#b, weather:dewpoint#b, weather:pressure#b, weather:windDirection#b, weather:windSpeed#b, weather:skyCondition#b, weather:skyConditionText, weather:oneHourPrecip#b, weather:sixHourPrecip#b \")\
TBLPROPERTIES (\"hbase.table.name\" = \"WeatherData\", \"hbase.mapred.output.outputtable\" = \"WeatherData\");"
2. Ingest data into the HBase table through any preferred means.
3. Results of query through Hive are as below:
hive> SELECT * FROM default.WeatherData WHERE key="002020ed-8496-4780-8ae2-fdc820d0e4e0";
Query ID = cloudera_20180512144646_baded4e1-8a5c-4fb4-bd47-8293fc910b44
Total MapReduce CPU Time Spent: 8 seconds 10 msec
002020ed-8496-4780-8ae2-fdc820d0e4e0 725030:14732 2008 11 17 12 3.9 -4.4 1016.9 270 5.7 0 0.0 0.0 NULL
Time taken: 24.62 seconds, Fetched: 1 row(s)
4. Same query using Impala gives the below issue.
[quickstart.cloudera:21000] > SELECT * FROM default.WeatherData WHERE key="002020ed-8496-4780-8ae2-fdc820d0e4e0";
Query: select * FROM default.WeatherData WHERE key="002020ed-8496-4780-8ae2-fdc820d0e4e0"
Query submitted at: 2018-05-12 14:47:47 (Coordinator: http://quickstart.cloudera:25000)
ERROR: AnalysisException: Failed to load metadata for table: 'default.WeatherData'
CAUSED BY: TableLoadingException: Failed to load metadata for HBase table: weatherdata
CAUSED BY: SerDeException: Error: A column storage specification is one of the following: '-', a prefix of 'string', or a prefix of 'binary'. b is not a valid storage option specification for sixhourprecip
Looking into the Impala code, it seems that binary data type is not supported as column storage type. See link: https://github.com/cloudera/Impala/blob/0c713cf67959b9633d2fe6f5c21af218a43e4214/fe/src/main/java/or...
How are others dealing with this requirement?
05-15-2018 09:29 AM
You were spot ON. This issue has been fixed. Excellent pair of eyes. I am getting my eyes inspected again :).