Applying a function to all columns in Spark

I have written the code below and have two questions. First, for the cast: how can I cast the data type of every column in the dataset at once, except the timestamp column? Second, how can I apply the avg function to every column, again except the timestamp column? Thanks a lot.

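import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

val df = spark.read.option("header", true).option("inferSchema", "true").csv("C:/Users/mhattabi/Desktop/dataTest.csv")
// Round each timestamp down to the start of its 5-minute (300 s) bucket
val result = df.withColumn("new_time", ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp"))
// First question: this casts one column; how to do it for every column except the timestamp?
val casted = result.withColumn("value", col("value").cast("float"))
// Second question: this averages one column; how to do it for every column except the timestamp?
val finalresult = casted.groupBy("new_time").agg(avg("value")).sort("new_time")
finalresult.coalesce(1).write.format("com.databricks.spark.csv").option("header", "true").save("C:/mydata.csv")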
1 REPLY

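For the first question only: you can cast all the columns in a single select using Spark SQL. In PySpark, perhaps something like the line below. Note that it casts every column, the timestamp included, so leave that column out of the cast if it has to keep its type.

from pyspark.sql.functions import col
DFtable.select([col(i).cast("long") for i in DFtable.columns])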
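
Since the question is written in Scala, here is a rough Scala sketch of the same idea that also covers the second question. It has not been run against the original CSV. The helper names castAllExcept and avgAllExcept, the object name, the sample rows, and the extra column "other" are made up for illustration. It assumes every column other than the timestamp holds values that can be cast to float.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

object AllColumnsExample {
  // Hypothetical helper: cast every column to `dataType`, except the ones
  // named in `keep`, which pass through unchanged.
  def castAllExcept(df: DataFrame, keep: Set[String], dataType: String): DataFrame =
    df.select(df.columns.map { name =>
      if (keep.contains(name)) col(name) else col(name).cast(dataType).as(name)
    }: _*)

  // Hypothetical helper: group by `groupCol` and average every other column
  // that is not listed in `skip`. Assumes at least one column is left to average.
  def avgAllExcept(df: DataFrame, groupCol: String, skip: Set[String]): DataFrame = {
    val aggs = df.columns
      .filterNot(c => c == groupCol || skip.contains(c))
      .map(c => avg(col(c)).as(c))
    df.groupBy(groupCol).agg(aggs.head, aggs.tail: _*).sort(groupCol)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("all-columns").master("local[*]").getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for dataTest.csv
    val df = Seq(
      ("2017-01-01 00:01:00", "1.0", "10"),
      ("2017-01-01 00:03:00", "3.0", "20"),
      ("2017-01-01 00:06:00", "5.0", "30")
    ).toDF("time", "value", "other")

    // Same 5-minute bucketing as in the question
    val bucketed = df.withColumn("new_time",
      ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp"))

    // First question: cast everything except the two time columns
    val casted = castAllExcept(bucketed, Set("time", "new_time"), "float")

    // Second question: average everything except the raw timestamp
    val result = avgAllExcept(casted, "new_time", skip = Set("time"))
    result.show()

    spark.stop()
  }
}
```

Both helpers build a list of Column expressions from df.columns and pass it on as varargs with `: _*`, so nothing has to be listed by hand when the CSV gains or loses columns. Column names that contain dots would probably need backtick quoting inside col(...).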