04-11-2016 02:22 AM
We are trying to query from Impala a table that was created in Hive using the OpenCSVSerde, but we are hitting the error below. As far as we know, this SerDe ships by default with a CDH installation, so Impala should support it.
Any reason why we are not able to query the table?
Query: select * from master_staging.rms_dxc_data_mc_cal_reps limit 5
ERROR: AnalysisException: Failed to load metadata for table: 'master_staging.rms_dxc_data_mc_cal_reps'
CAUSED BY: TableLoadingException: Failed to load metadata for table: master_staging.rms_dxc_data_mc_cal_reps
CAUSED BY: InvalidStorageDescriptorException: Impala does not support tables of this type. REASON: SerDe library 'org.apache.hadoop.hive.serde2.OpenCSVSerde' is not supported.
08-21-2017 11:42 AM
Impala doesn't support this Hive SerDe. In general, Impala uses its own optimised parsing code instead of Hive's SerDe infrastructure. If you're ingesting data from CSV and using the SerDe to do the conversion, I'd recommend using Hive to do the ETL and convert the data to a more efficient storage format, e.g. Parquet.
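As a rough sketch of that ETL step, something along these lines could work. The target table name rms_dxc_data_mc_cal_reps_parquet is made up for illustration, and the statements assume your Hive version supports CREATE TABLE ... STORED AS PARQUET AS SELECT. Note that OpenCSVSerde exposes every column as STRING, so you may want to replace the SELECT * with explicit CASTs for your own columns.

```sql
-- Step 1: run in Hive (not Impala). Hive can still read the OpenCSVSerde table.
-- The _parquet table name is a hypothetical example; pick your own.
CREATE TABLE master_staging.rms_dxc_data_mc_cal_reps_parquet
STORED AS PARQUET
AS
SELECT *
FROM master_staging.rms_dxc_data_mc_cal_reps;

-- Step 2: run in Impala so it picks up the table that Hive just created.
INVALIDATE METADATA master_staging.rms_dxc_data_mc_cal_reps_parquet;

-- Step 3: run in Impala. This is the original query pointed at the Parquet copy.
SELECT *
FROM master_staging.rms_dxc_data_mc_cal_reps_parquet
LIMIT 5;
```

The original CSV-backed table stays in place for Hive; Impala only ever reads the Parquet copy.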
10-07-2017 11:36 AM
I tried loading a .parquet file using the Metastore Manager.
When I queried the table from the Impala editor, I was able to see the contents of the table.
So the query works in Impala once the data is loaded in Hive as a .parquet file.