Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Difference between string and character - Impala

Solved Go to solution
Highlighted

Difference between string and character - Impala

Explorer

Hi,

Can you please explain difference between STRING and CHAR(10) datatype in impala ?

I understand there is no need to mention number of bytes in STRING datatype.

Storage,coding and performance wise which is better ?

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted

Re: Difference between string and character - Impala

Hi @prakash pal there are some differences between these data types, basically string allows a variable length of characters (max 32K chars), char is a fixed length string (max. 255 chars). Usually (I doubt that this is different with Impala) CHAR is more efficient and can speed up operations and is better reg. memory allocation. (This does not mean always use CHAR)

See this => "All data in CHAR and VARCHAR columns must be in a character encoding that is compatible with UTF-8. If you have binary data from another database system (that is, a BLOB type), use a STRING column to hold it."

There are a lot of use cases where it makes sense to only use CHAR instead of STRING, e.g. lets say you want to have a column that stores the two-letter country code (ISO_3166-1_alpha-2; e.g. US, ES, UK,...), here it makes more sense to use CHAR.

View solution in original post

2 REPLIES 2
Highlighted

Re: Difference between string and character - Impala

Highlighted

Re: Difference between string and character - Impala

Hi @prakash pal there are some differences between these data types, basically string allows a variable length of characters (max 32K chars), char is a fixed length string (max. 255 chars). Usually (I doubt that this is different with Impala) CHAR is more efficient and can speed up operations and is better reg. memory allocation. (This does not mean always use CHAR)

See this => "All data in CHAR and VARCHAR columns must be in a character encoding that is compatible with UTF-8. If you have binary data from another database system (that is, a BLOB type), use a STRING column to hold it."

There are a lot of use cases where it makes sense to only use CHAR instead of STRING, e.g. lets say you want to have a column that stores the two-letter country code (ISO_3166-1_alpha-2; e.g. US, ES, UK,...), here it makes more sense to use CHAR.

View solution in original post

Don't have an account?
Coming from Hortonworks? Activate your account here