
CSV file exported from a PySpark dataframe and downloaded from the Hue UI shows garbled Japanese text


I found garbled Japanese characters in a CSV file downloaded from Hue. The file was encoded and exported from PySpark. The garbling does not appear, though, when I open the file in Notepad on Windows.

The code for exporting the CSV file is below (it runs without errors):

######## save as csv from Pyspark dataframe directly 
encd = 'cp932' 


I also tried the .toPandas() method and found no garbling in the CSV exported from the resulting pandas dataframe:

dfp = df.limit(10)
pdf = dfp.toPandas()

pdf # displays no garbling

pdf.to_csv('data.csv', index=False, encoding='cp932')
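One way to narrow down where the garbling comes in is to decode the raw bytes of the downloaded file with each candidate encoding and see which one round-trips cleanly. This is only a minimal sketch using the standard library; the helper name `decode_report` and the sample text are hypothetical, and the sample stands in for the real file contents.

```python
# Hypothetical sample standing in for the contents of the downloaded CSV.
sample_text = "名前,都市\n太郎,東京\n"


def decode_report(raw, encodings=("cp932", "utf-8")):
    """Try each candidate encoding; map it to the decoded text, or None on failure."""
    report = {}
    for enc in encodings:
        try:
            report[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding.
            report[enc] = None
    return report


# Simulate a file that was written as cp932.
cp932_bytes = sample_text.encode("cp932")
result = decode_report(cp932_bytes)

for enc, text in result.items():
    print(enc, "->", "decodes cleanly" if text is not None else "cannot decode")
```

For a real file, the bytes could be read with `open('data.csv', 'rb').read()` and passed to the same helper. If only cp932 decodes cleanly, the file itself is probably fine and the garbling comes from whatever tool opens it with a different encoding.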


How can I avoid the garbling when I export a CSV file directly from a PySpark dataframe?