Support Questions

Find answers, ask questions, and share your expertise

Remove Leading zeros from column in Dataframe Join Expression in Spark-Scala

avatar
Rising Star

Need to remove leading zeros in a join expression

DF1(TradeID) has values like "0000012345"

and DF2(TradeRefNo) has no leading zeros i.e. "12345"

 

val resultDf = Df1.join(Df2, Df1("TradeID") === Df2(TradeRefNo"))

What's the best way to remove the leading from the first dataframe's DF1("TradeID") column values so the compare works correctly

2 REPLIES 2

avatar
Super Guru
@ChineduLB

Have you tried to create another DF and cast the values to integer first before the JOIN?

Cheers
Eric

avatar
Rising Star

I ended up creating a new column in new data frame via withColumn and used regex to populate the new column with the trimmed vals

 

thanks