I am writing PySpark code where I create multiple DataFrames and use them to build subsequent DataFrames. I am filtering with a condition like:
to_date(transaction_date) BETWEEN greatest(to_date(loco_contract_start_date), to_date(status_asg_start_date)) \
AND least(to_date(loco_contract_end_date), to_date(status_asg_end_date))
and this is where I think the problem is: every time I run the job, the DataFrame gives me a different count.
Any idea what might be causing this, or how to fix it?