I set up a 5-node cluster with HDP 2.5, following the official documentation. However, Chinese characters do not seem to be stored or displayed correctly. Whether in a UTF-8 text file in HDFS or in a Hive table, Chinese characters always show up as question marks. I also ran 'select * from xxx' in the terminal and had no luck.
Can anyone help?
Thanks in advance!
Check the locale in your terminal. On Linux, run "echo $LANG": does the value end in UTF-8? HDFS stores any data as raw bytes, so the result depends only on how those bytes are interpreted for display. Hive supports UTF-8 by default, but it can read other encodings as well.
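To illustrate why the data itself is probably fine, here is a minimal Python sketch of the difference between what is stored and what a terminal can show. The sample string and the helper name render_for_terminal are made up for this illustration, and encoding with errors="replace" is only a rough stand-in for what a non-UTF-8 terminal does; it is not how HDFS or Hive work internally.

```python
import locale

# Sample text (assumption for illustration): the two characters for "Chinese",
# written as escapes so the script itself is pure ASCII.
text = "\u4e2d\u6587"

# What ends up in a file is just bytes. UTF-8 round-trips without loss.
stored = text.encode("utf-8")
assert stored.decode("utf-8") == text


def render_for_terminal(s: str, terminal_encoding: str) -> str:
    """Hypothetical helper: approximate what a terminal with the given
    charset can display by replacing every unrepresentable character."""
    return s.encode(terminal_encoding, errors="replace").decode(terminal_encoding)


utf8_view = render_for_terminal(text, "utf-8")
ascii_view = render_for_terminal(text, "ascii")

print("bytes stored:", len(stored))
print("UTF-8 terminal shows the original text:", utf8_view == text)
print("ASCII terminal shows:", ascii_view)

# The encoding this session would use, derived from LANG / LC_* on Linux.
print("preferred encoding here:", locale.getpreferredencoding(False))
```

If the last line reports something other than UTF-8, the question marks most likely come from the terminal session rather than from the stored data.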