09-27-2017 09:10 AM
I'm trying to use Sqoop to import a column containing Latin characters from Teradata.
The requirement is that the characters in HDFS match the source exactly, i.e. the same character set as in Teradata.
Please find the Sqoop command below:
sqoop import --connect jdbc:teradata://xxx/DATABASE=xxx --username xxx -P --query "SELECT TIN_ID,TIN_NBR,TIN_NM,TYPE_CD,TYPE_DESC,MDM_EU_ID,STATUS_TXT,SYS_OF_REC_ID,ETL_LOAD_ID,LOAD_RUN_TS,cast(begin(LOAD_EFFECT_PRD) as timestamp(6)) as BEGIN_LOAD_EFFECT_PRD,cast(end(LOAD_EFFECT_PRD) as timestamp(6)) as END_LOAD_EFFECT_PRD,LOAD_CUR_REC_IND,LOAD_ACTN_CD,LOAD_HASH_KEY_VAL,LOAD_HASH_DATA_VAL FROM db.table where 1=1 AND \$CONDITIONS" --fields-terminated-by '\001' --target-dir /folder -m 1
I tried specifying UTF-8 and ISO-8859-1 in the Sqoop connection string, but the data still lands in HDFS garbled, as shown below.
Source data Ê¸í¼ÿ'XôöàÆ¯´¢+ò¸¿— Ê¸í¼ÿ'XôöàÆ¯´¢+ò¸¿—
HDFS data 'Xï¿½ï¿½+ï¿½ qtï¿½;k2ï¿½Vï¿½"ï¿½R~
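For what it's worth, the repeated `ï¿½` runs in the HDFS output look like the Unicode replacement character (U+FFFD) written out as UTF-8 and then displayed as ISO-8859-1. That pattern can appear when single-byte Latin data is decoded as UTF-8 somewhere along the way. The sketch below only reproduces that pattern in isolation; the helper name `mojibake_from_latin1` is made up for illustration, and whether this is what actually happens in the Sqoop/Teradata path is an assumption, not a confirmed diagnosis.

```python
def mojibake_from_latin1(text: str) -> str:
    """Simulate one possible corruption path (an assumption, not a confirmed cause)."""
    # 1. The source holds single-byte Latin-1 data.
    raw = text.encode("latin-1")
    # 2. Some layer wrongly decodes those bytes as UTF-8; invalid bytes
    #    are replaced with U+FFFD, so the original character is lost.
    decoded = raw.decode("utf-8", errors="replace")
    # 3. The result is written as UTF-8 and later viewed as Latin-1.
    return decoded.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # ASCII survives; the non-ASCII character becomes the three-character run.
    print(mojibake_from_latin1("Xô"))  # prints: Xï¿½
```

If this is the cause, the original bytes are already gone by step 2, so the fix would have to happen at import time rather than by re-decoding the HDFS file.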
My question: is there any way to preserve the Teradata character set when the data is written to Hadoop?