
Characters not being handled correctly


I am facing a data mismatch. Some printable characters in the source table are converted to a question mark (?) after import into Hive tables. I have tried the following options so far:

  • 'serialization.encoding'='ISO-8859-1'
  • TBLPROPERTIES ('store.charset'='ISO-8859-1')
  • 'serialization.encoding'='UTF-8'

However, the issue isn't resolved. Please note that the file I am trying to import into my Hive table is a mainframe file that was brought to HDFS via NDM and is in binary format. The framework converts the binary to a readable format internally. Is there anything else I can try? Please suggest if anyone has come across such a scenario. Thanks in advance.
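
A minimal sketch of one way to narrow this down, under a loud assumption: mainframe files are commonly EBCDIC-encoded, so the bytes may not be ISO-8859-1 or UTF-8 at all. The sample text, the code page (`cp037`), and the helper names `decode_report` and `to_utf8` are all hypothetical and not taken from the actual file or framework. The idea is to decode a few raw bytes with each candidate charset and see which one yields the expected characters rather than replacement characters or garbage.

```python
# Hypothetical sample: the text "Café £5" as an EBCDIC (code page 037) file
# might store it. The real file's code page is an assumption to verify.
SAMPLE_TEXT = "Café £5"
raw_bytes = SAMPLE_TEXT.encode("cp037")


def decode_report(data: bytes, encoding: str) -> str:
    """Decode bytes with the given charset; undecodable bytes become U+FFFD."""
    return data.decode(encoding, errors="replace")


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Transcode bytes from the source charset to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Compare how the same bytes look under each candidate charset.
    for candidate in ("utf-8", "iso-8859-1", "cp037"):
        print(candidate, "->", repr(decode_report(raw_bytes, candidate)))
    # Once the right source charset is known, convert to UTF-8 before loading.
    print(to_utf8(raw_bytes, "cp037"))
```

If the characters only come out right under an EBCDIC code page, the conversion probably has to happen in the binary-to-text step of the framework, since the Hive table encoding properties only describe how already-converted text is read.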