Japanese characters are garbled in a CSV file that I downloaded from Hue. The file was exported from PySpark with the write.save method using an explicit encoding. Oddly, the same file shows no anomalies when I open it in Windows Notepad.
The code I use to export the CSV file is below (it runs without errors):
# Save as CSV directly from the PySpark DataFrame
encd = 'cp932'  # target encoding (Microsoft's Shift_JIS variant)
I also tried the .toPandas() method, and the CSV exported from the pandas DataFrame shows no such garbling:
dfp = df.limit(10)
pdf = dfp.toPandas()
pdf # displays no garbling
pdf.to_csv('data.csv', index=False, encoding='cp932')
How can I avoid the garbling when I export a CSV file directly from a PySpark DataFrame?
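In case it helps with diagnosis, here is a minimal sketch of how one could check which encoding the downloaded bytes are actually in. The helper name `detect_encoding` and the sample string are hypothetical and not part of my job; the idea is only to try UTF-8 first and cp932 second, because Notepad may be auto-detecting an encoding that other viewers do not.

```python
def detect_encoding(raw: bytes, candidates=("utf-8", "cp932")):
    """Return the first candidate encoding that decodes `raw` cleanly, else None.

    UTF-8 is tried first on purpose: UTF-8 bytes of Japanese text can
    sometimes decode as cp932 without an error (producing mojibake), whereas
    cp932 bytes of Japanese text are normally rejected by the UTF-8 decoder.
    """
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue
        return enc
    return None


# Hypothetical sample row, encoded both ways for comparison.
sample = "日本語,テスト"
as_cp932 = sample.encode("cp932")
as_utf8 = sample.encode("utf-8")

print(detect_encoding(as_cp932))  # cp932
print(detect_encoding(as_utf8))   # utf-8
```

To apply this to the real file, one would read it in binary mode (for example `open(path, "rb").read()`) and pass the bytes to the helper. If the result is utf-8 rather than cp932, that would suggest the encoding setting was not applied on the Spark side.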