
Insert Urdu data in Hive using ODBC


New Contributor

Hi, I am trying to insert Urdu data into Hive using the ODBC driver, but it gets converted into junk characters. How can I store Urdu-language data in Hive on HDFS? Thanks.

3 REPLIES

Re: Insert Urdu data in Hive using ODBC

Super Collaborator

I have no experience with Urdu, but junk characters are typically the result of one of two things: either the target uses an encoding that does not support your characters, or the source is interpreted with the wrong encoding.
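To make the two failure modes concrete, here is a minimal Python sketch. It is only an illustration and does not involve Hive or the ODBC driver; the sample word and the choice of Latin-1 and ASCII as the "wrong" encodings are assumptions picked for the demo.

```python
# Sample Urdu word, written with escapes so the source file encoding does not matter.
# (Hypothetical sample text chosen for this illustration.)
text = "\u0627\u0631\u062f\u0648"

# Correct round trip: encode as UTF-8, decode as UTF-8.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Failure mode 1: the bytes are fine, but the reader assumes the wrong encoding.
# Latin-1 maps every byte to some character, so this never fails; it just yields junk.
garbled = utf8_bytes.decode("latin-1")

# Failure mode 2: the target encoding cannot represent the characters at all.
# Each unsupported character is replaced by "?" and the original text is lost.
lossy = text.encode("ascii", errors="replace").decode("ascii")

print("original  :", text)
print("round trip:", round_trip)
print("garbled   :", repr(garbled))
print("lossy     :", lossy)
```

In the first failure mode the stored bytes are still intact and can be recovered by re-reading them as UTF-8; in the second the data is gone, so it matters which of the two is happening.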

In your case, Hive should use UTF-8 by default (which supports Urdu), but it is possible that the encoding was changed when the Hive table was created. Can you also verify that the session encoding is configured correctly in the client session from which you insert data into Hive?

And how did you identify the junk characters? Did you run a query on Hive? If so, with which client tool, and how is the encoding defined there? (UTF-8 should be the correct one.)


Re: Insert Urdu data in Hive using ODBC

New Contributor

Hi Harald, thanks for your response.
Yes, the default encoding is UTF-8. I am running the insert query on the Hadoop server, in Hive (opened in a terminal) on an Ubuntu machine.


Re: Insert Urdu data in Hive using ODBC

Super Collaborator

Can you check what 'locale' outputs in your Ubuntu terminal? On my machine it looks like this:

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$
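If a Python interpreter is available on the same machine, the encodings that a program started from that terminal actually picks up can be checked with a short script like the one below. This is only a sketch: the helper names `encoding_report` and `is_utf8` are made up for this example, and the script says nothing about how the Hive CLI or the ODBC driver themselves are configured.

```python
import codecs
import locale
import sys


def encoding_report():
    """Collect the encodings Python derives from the current environment."""
    return {
        "locale_preferred": locale.getpreferredencoding(False),
        "stdout": getattr(sys.stdout, "encoding", None),
        "filesystem": sys.getfilesystemencoding(),
    }


def is_utf8(name):
    """Return True if an encoding name is an alias of UTF-8 (e.g. 'utf8', 'UTF-8')."""
    if not name:
        return False
    try:
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        # Unknown encoding name: treat it as "not UTF-8".
        return False


for key, value in encoding_report().items():
    status = "ok" if is_utf8(value) else "not UTF-8"
    print(f"{key}: {value} ({status})")
```

Any entry reported as "not UTF-8" is a candidate for where Urdu text gets mangled on the client side.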

I am not sure whether you have a source file that you want to store in Hive, but if there is one, you can check its assumed encoding with:

file -i <your file>
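
Note that `file -i` only guesses the encoding from the file contents. As a stricter complement, the sketch below checks whether every byte of a file decodes as UTF-8. The function names are hypothetical, and the path is expected as a command-line argument; a file that passes is valid UTF-8, but that alone does not prove it contains the text you expect.

```python
import sys


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def check_file(path: str) -> bool:
    """Read a file in binary mode and test whether its content is valid UTF-8."""
    with open(path, "rb") as handle:
        return looks_like_utf8(handle.read())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_utf8.py <your file>
    result = check_file(sys.argv[1])
    print("valid UTF-8" if result else "NOT valid UTF-8")
```

If the file turns out not to be UTF-8, converting it before loading it into Hive is usually simpler than changing encodings on the Hive side.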