Member since
09-03-2018
1
Post
1
Kudos Received
0
Solutions
09-03-2018
12:20 AM
1 Kudo
Hi, I also faced similar issues while applying regex_replace() to only strings columns of a dataframe. The trick is to make regEx pattern (in my case "pattern") that resolves inside the double quotes and also apply escape characters. val pattern="\"\\\\[\\\\x{FFFD}\\\\x{0}\\\\x{1F}\\\\x{81}\\\\x{8D}\\\\x{8F}\\\\x{90}\\\\x{9D}\\\\x{A0}\\\\x{0380}\\\\]\"" val regExpr = parq_out_df_temp.schema.fields.map(x => if(x.dataType == StringType){"regexp_replace(%s, $pattern,'') as %s".format(x.name,x.name)} else x.name) val parq_out=parq_out_df_temp.selectExpr(regExpr:_*) This worked fine for me !!
... View more