Hi On CDH 5.9.1 we're having an issue with an extranel table definition in Hive to an Hbase table with a lot of columns The hbase.columns.mapping in the external table definition is longer than 4000 characters and causes an error: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Put request failed : INSERT INTO SERDE_PARAMS (PARAM_VALUE,SERDE_ID,PARAM_KEY) VALUES (?,?,?) ) Hive Metastore is using an Oracle backend to store table definitions. On Oracle 12c the table is created with datatype varchar2, which is limited by default to 4000 characters. There seem to be two solutions: 1. change max_string_size on Oracle from standard to extended 2. change datatype from varchar2 to clob (using add column, copy data, drop old column, rename new column) My preference would be to use option 2, because this only applies to the affected tables, while the max_string_size parameter affects the whole instance and can't be rolled back in case of issues. But the clob datatype only works with regular selects, it can't handle functions like to_char, substr, instr Tables affected: - COLUMNS_V2 - TABLE_PARAMS - SERDE_PARAMS - SD_PARAMS Any reason to avoid changing the datatype from varchar2 to clob for these Hive Metastore tables in the Oracle backend? thanks
It is better to extend the varchar's length from default 4000 to, say 10000, or 20000, than changing it to clob. The reason being that VARCHAR type is well tested and supported, but CLOB is not, so we won't be sure what issues you will hit if you change the data type in the underlining database.
Oracle varchar2 has a maximum size of 4000 characters
On Oracle 12c a new parameter is introduced: max_string_size
the default value is "standard", which limits varchar2 to 4000 characters
It can be changed to "extended", which would allow varchar2 to be 32k, but that affects the whole instance and can't be reverted back to "standard". in case of any issues you're stuck with this parameter
This is a new feature, there's a change for bugs or side effects using this feature
Besides that, latest Apache Hive is using CLOB instead of varchar2(4000), so that's my preferred approach