applying function on all column spark

I have written the code below and have two questions. First, how can I cast the data type of all columns in the dataset at once, except the timestamp column? Second, how can I apply the avg function to all columns, again except the timestamp column? Thanks a lot.

import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

val df = spark.read.option("header", "true").option("inferSchema", "true").csv("C:/Users/mhattabi/Desktop/dataTest.csv")
// Round each timestamp down to the start of its 5-minute (300 s) bucket
val result = df.withColumn("new_time", ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp"))
result("value").cast("float") // first question: this covers only one column, and the resulting Column is not used
val finalresult = result.groupBy("new_time").agg(avg("value")).sort("new_time") // second question: avg on one column only
finalresult.coalesce(1).write.format("com.databricks.spark.csv").option("header", "true").save("C:/mydata.csv")

Re: applying function on all column spark

For the first question only: in PySpark you can cast all the columns in a single select, perhaps something like:

from pyspark.sql.functions import col
DFtable.select([col(i).cast("long") for i in DFtable.columns])
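
Since the question uses Scala, here is a rough Scala sketch of the same idea that also skips the timestamp column and covers the second question (avg on every other column). It is only an illustration under some assumptions: the small in-memory DataFrame and its "other" column are made up to stand in for dataTest.csv, the names valueCols, castExprs and avgExprs are mine, and "float" is simply the target type taken from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col, unix_timestamp}

// Reuses the existing session in spark-shell, otherwise starts a local one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("cast-and-avg-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for dataTest.csv: a "time" column plus two value columns.
val df = Seq(
  ("2017-01-01 00:01:00", "1.0", "10"),
  ("2017-01-01 00:03:00", "3.0", "20"),
  ("2017-01-01 00:07:00", "5.0", "30")
).toDF("time", "value", "other")

// Every column name except the timestamp column.
val valueCols = df.columns.filter(_ != "time").toSeq

// Question 1: keep "time" as it is and cast all the other columns in one select.
val castExprs = col("time") +: valueCols.map(c => col(c).cast("float").as(c))
val casted = df.select(castExprs: _*)

// Same 5-minute bucketing as in the question.
val bucketed = casted.withColumn(
  "new_time",
  ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp")
)

// Question 2: build one avg expression per value column and pass them all to agg.
// agg takes a first Column plus varargs, hence head/tail
// (this assumes there is at least one value column).
val avgExprs = valueCols.map(c => avg(col(c)).as(s"avg_$c"))
val finalResult = bucketed
  .groupBy("new_time")
  .agg(avgExprs.head, avgExprs.tail: _*)
  .sort("new_time")

finalResult.show(false)
```

With a real file you would replace the Seq(...).toDF(...) part with the spark.read call from the question and keep the rest; the filter on the column names is what leaves the timestamp column out of both the cast and the averages.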