Member since: 08-08-2016
Posts: 17
Kudos Received: 1
Solutions: 0
05-21-2021 12:36 AM
Hi @Reza77, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
02-21-2018 10:11 PM
I found the correct answer: build the DataFrame in PySpark, left join it with itself, filter the nulls and not-nulls separately, and put everything in a loop until df.take(1) == [].
---------------------------------------------------------------------------
from pyspark.sql.functions import col

# Distinct (old_id, new_id) pairs still being resolved
df = df_tr.select(col("old_id"), col("new_id")).distinct()
df2 = df

# Accumulator for fully resolved mappings, seeded with a placeholder row
# (the strings 'null', not SQL NULLs)
df_tr = spark.createDataFrame([('null', 'null')], ['old_id', 'new_id'])

while df.take(1) != []:
    # Join each mapping to the mapping that continues its chain
    df = df.alias("df1").join(df2.alias("df2"),
                              col('df1.new_id') == col('df2.old_id'),
                              'left_outer')
    # No continuation found: the chain ends here, so the mapping is final
    df_null = df.filter(col('df2.new_id').isNull()) \
                .select(col('df1.old_id'), col('df1.new_id'))
    # Continuation found: follow the chain one more step
    df = df.filter(col('df2.new_id').isNotNull()) \
           .select(col('df1.old_id'), col('df2.new_id'))
    df_tr = df_null.union(df_tr)
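As a quick check (a minimal sketch with a local SparkSession and made-up IDs, not from the original post), a chain of mappings such as 1 -> 2 -> 3 collapses so that every old_id ends up pointing at its final new_id:
---------------------------------------------------------------------------
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()

# Hypothetical chained mappings: 1 -> 2 -> 3, plus an already-final 5 -> 6
df_tr = spark.createDataFrame(
    [('1', '2'), ('2', '3'), ('5', '6')],
    ['old_id', 'new_id'])

# ...run the loop above...
# df_tr then holds ('1', '3'), ('2', '3'), ('5', '6'),
# plus the ('null', 'null') seed row, which can be filtered out afterwards.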
10-20-2016 02:41 PM
Thank you, Emily, for your reply.
08-09-2016 09:38 PM
You're more than welcome, @Ashish Vishnoi. If it was helpful, and if it is appropriate, I'd sure appreciate you marking my response as "Best Answer" to help me build up my points. 😉