
Sqoop import of Latin characters from Teradata with the same character set

I'm trying to Sqoop import a column containing Latin characters from Teradata.

The requirement is that the characters in Hadoop match the source, i.e. Teradata (same character set).


Please find the Sqoop command below:

sqoop import --connect jdbc:teradata://xxx/DATABASE=xxx --username xxx -P --query "SELECT TIN_ID, TIN_NBR, TIN_NM, TYPE_CD, TYPE_DESC, MDM_EU_ID, STATUS_TXT, SYS_OF_REC_ID, ETL_LOAD_ID, LOAD_RUN_TS, CAST(BEGIN(LOAD_EFFECT_PRD) AS TIMESTAMP(6)) AS BEGIN_LOAD_EFFECT_PRD, CAST(END(LOAD_EFFECT_PRD) AS TIMESTAMP(6)) AS END_LOAD_EFFECT_PRD, LOAD_CUR_REC_IND, LOAD_ACTN_CD, LOAD_HASH_KEY_VAL, LOAD_HASH_DATA_VAL FROM db.table WHERE 1=1 AND \$CONDITIONS" --fields-terminated-by '\001' --target-dir /folder -m 1

I tried specifying UTF-8 and ISO-8859-1 in the Sqoop connection string, but the data is still populated as shown below.

Source data ʸí¼ÿ'XôöàƯ´¢+ò¸¿— ʸí¼ÿ'XôöàƯ´¢+ò¸¿—
HDFS data 'X��+� qt�;k2�V�"�R~
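
For reference, here is a minimal sketch of how a character set can be passed in the connection string. It assumes the Teradata JDBC driver's CHARSET URL parameter, appended after DATABASE with a comma; the variable names are only illustrative, and the host and database remain placeholders as in the command above. The exact form of my own attempt may have differed.

```shell
# Placeholders: host and database are elided as xxx, as in the original command.
TD_HOST="xxx"
TD_DB="xxx"

# Assumption: the session character set is requested through the Teradata JDBC
# URL parameter CHARSET, comma-separated from the DATABASE parameter.
CONNECT_URL="jdbc:teradata://${TD_HOST}/DATABASE=${TD_DB},CHARSET=UTF8"

# Print the URL so it can be checked before it is handed to Sqoop.
echo "$CONNECT_URL"

# The import would then use this URL with otherwise unchanged options, e.g.:
#   sqoop import --connect "$CONNECT_URL" --username xxx -P ...
```

Quoting the URL when passing it to --connect keeps the shell from splitting it or interpreting any special characters.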



My question: is there any way to preserve the Teradata character set when the data is loaded into Hadoop?