I have a DataFrame with the following schema:
scala> final_df.printSchema
root
 |-- mstr_prov_id: string (nullable = true)
 |-- prov_ctgry_cd: string (nullable = true)
 |-- prov_orgnl_efctv_dt: timestamp (nullable = true)
 |-- prov_trmntn_dt: timestamp (nullable = true)
 |-- prov_trmntn_rsn_cd: string (nullable = true)
 |-- npi_rqrd_ind: string (nullable = true)
 |-- prov_stts_aray_txt: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- PROV_STTS_KEY: string (nullable = true)
 |    |    |-- PROV_STTS_EFCTV_DT: timestamp (nullable = true)
 |    |    |-- PROV_STTS_CD: string (nullable = true)
 |    |    |-- PROV_STTS_TRMNTN_DT: timestamp (nullable = true)
 |    |    |-- PROV_STTS_TRMNTN_RSN_CD: string (nullable = true)
I am running the following code to do basic cleansing, but it does not work for "prov_stts_aray_txt": it does not descend into the array type and apply the desired transformation to the fields inside it. I want to iterate over all fields in the DataFrame, both flat and nested, and apply the same basic transformation to each.
for (dt <- final_df.dtypes) {
  final_df = final_df.withColumn(dt._1, when(upper(trim(col(dt._1))) === "NULL", lit(" ")).otherwise(col(dt._1)))
}
Please help. Note that this is just a sample DataFrame; the actual one holds multiple array-of-struct columns, each with a different number of fields, so the solution needs to work dynamically from the schema.
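To make the goal concrete, here is a rough, untested sketch of the kind of schema-driven recursion I have in mind. It assumes Spark 3.0 or later (for the Scala `transform` function on arrays) and top-level column names without dots. The helper names `cleanse` and `cleanseAll` are made up for illustration, and map-typed columns are simply passed through unchanged.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

// Hypothetical helper: builds a cleansing expression for one column
// by walking its data type recursively.
def cleanse(c: Column, dt: DataType): Column = dt match {
  case StringType =>
    // Same rule as the flat loop: replace the literal string "NULL" with " ".
    when(upper(trim(c)) === "NULL", lit(" ")).otherwise(c)
  case st: StructType =>
    // Rebuild the struct field by field; a null struct stays null
    // because `when` without `otherwise` yields null.
    when(c.isNotNull,
      struct(st.fields.map(f => cleanse(c.getField(f.name), f.dataType).as(f.name)): _*))
  case ArrayType(elementType, _) =>
    // Apply the same rule to every element of the array.
    transform(c, x => cleanse(x, elementType))
  case _ =>
    // Timestamps, numbers, maps, etc. are left untouched.
    c
}

// Hypothetical helper: applies `cleanse` to every top-level column.
def cleanseAll(df: DataFrame): DataFrame =
  df.select(df.schema.fields.map(f => cleanse(col(f.name), f.dataType).as(f.name)): _*)
```

With something like this, `val cleaned = cleanseAll(final_df)` would replace the loop above, and `final_df` would no longer have to be reassigned column by column.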
Thanks