Member since 06-09-2022
4 Posts
0 Kudos Received
0 Solutions
06-30-2022
05:05 AM
"so plainfield" and "s plainfield" are the same place.
06-30-2022
05:05 AM
Thanks, jagadeesan, but you are still getting duplicate values.
06-09-2022
06:54 PM
I have a PySpark DataFrame with names like:

N. Plainfield
North Plainfield
West Home Land
NEWYORK
newyork
So. Plainfield
S. Plainfield

Some of them contain dots and spaces between initials and some do not. How can they be converted to:

n plainfield
north plainfield
west homeland
newyork
newyork
so plainfield
s plainfield

(with no dots and no spaces between initials, and one space between the initials and the name)?

I tried the following, but it only removes the dots and doesn't remove the spaces between initials:

names_modified = names.withColumn("name_clean", regexp_replace("name", r"\.",""))

After removing the whitespace and dots, is there any way to get the distinct values, like this?

north plainfield
west homeland
newyork
so plainfield
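One possible approach, sketched below, is to split the job in two: first normalize the text (lowercase, drop dots, collapse whitespace), then map abbreviations onto one canonical spelling so that the distinct step no longer sees "n plainfield" and "north plainfield" as different values. The lookup tables `PREFIX_ALIASES` and `PHRASE_ALIASES`, the helper names, and the `name_clean` output column are assumptions made for illustration; a regex alone cannot know that "s" means "so" or that "home land" should be joined, so those rules have to be supplied by hand. The PySpark wrapper uses a plain Python UDF, which is simple but not the fastest option on large data.

```python
import re

# Hypothetical lookup tables (assumptions, not part of the original post).
# Extend them to match the abbreviations that occur in the real data.
PREFIX_ALIASES = {"n": "north", "s": "so"}
PHRASE_ALIASES = {"home land": "homeland"}


def normalize_name(raw):
    """Lowercase, turn dots into spaces, and collapse runs of whitespace."""
    if raw is None:
        return None
    text = raw.lower().replace(".", " ")
    return re.sub(r"\s+", " ", text).strip()


def canonical_name(raw):
    """Normalize, then apply the alias tables so variants compare equal."""
    text = normalize_name(raw)
    if text is None:
        return None
    # Join known multi-word phrases, matching whole words only.
    for phrase, replacement in PHRASE_ALIASES.items():
        text = re.sub(rf"\b{re.escape(phrase)}\b", replacement, text)
    # Expand or unify an abbreviated first token.
    first, _, rest = text.partition(" ")
    first = PREFIX_ALIASES.get(first, first)
    return " ".join(part for part in (first, rest) if part)


def distinct_canonical_names(df, src="name", dst="name_clean"):
    """Add the canonical column to a PySpark DataFrame and return its distinct values."""
    # Imported here so the pure-Python helpers above work without PySpark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    to_canonical = F.udf(canonical_name, StringType())
    return df.withColumn(dst, to_canonical(F.col(src))).select(dst).distinct()
```

With the DataFrame from the question, `distinct_canonical_names(names)` would return one row per canonical name. Because "s" is mapped to "so" rather than to "south", the result keeps the "so plainfield" spelling that the question asks for.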
Labels: Apache Spark