Difference between string and character - Impala

Hi,

Can you please explain the difference between the STRING and CHAR(10) data types in Impala?

I understand there is no need to specify a length for the STRING data type.

Which is better in terms of storage, coding, and performance?

1 ACCEPTED SOLUTION

Hi @prakash pal, there are some differences between these data types. STRING holds a variable number of characters (up to 32K), while CHAR is a fixed-length string (up to 255 characters). Usually (and I doubt Impala is different) CHAR is more efficient: it can speed up operations and is better with regard to memory allocation. That does not mean you should always use CHAR.

See this => "All data in CHAR and VARCHAR columns must be in a character encoding that is compatible with UTF-8. If you have binary data from another database system (that is, a BLOB type), use a STRING column to hold it."

There are many use cases where it makes sense to use CHAR instead of STRING. For example, if a column stores two-letter country codes (ISO 3166-1 alpha-2, e.g. US, ES, GB), every value has the same known length, so CHAR is the better fit.
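
To make the fixed-length behaviour concrete, here is a small Python sketch (not Impala code) that models how a CHAR(n) column typically treats values: shorter values are padded with trailing spaces, and longer values are cut off at n. The helper name `to_char` is made up for illustration, and the sketch assumes one byte per character.

```python
def to_char(value: str, n: int) -> str:
    """Model a CHAR(n) value: truncate to n characters, then pad with spaces."""
    if n < 1:
        raise ValueError("CHAR length must be positive")
    return value[:n].ljust(n)


country = to_char("US", 2)      # fits exactly, so nothing is added
padded = to_char("US", 10)      # CHAR(10) fills the rest with trailing spaces
truncated = to_char("USA", 2)   # a value longer than n loses its tail

print(repr(country), repr(padded), repr(truncated))
```

A STRING column would keep each of these values as written. That is why CHAR suits values whose length is always the same, while a CHAR(10) holding mostly short values just carries padding.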
