Community Articles

vshukla · ‎05-21-2016

The key thing to remember is that in Spark RDD/DF are immutable. So once created you can not change them.

However there are many situation where you want the column type to be different.

E.g By default Spark comes with cars.csv where year column is a String. If you want to use a datetime function you need the column as a Datetime. You can change the column type from string to date in a new dataframe.

Here is an example to change the column type.

val df2 = sqlContext.load("com.databricks.spark.csv", Map("path" -> "file:///Users/vshukla/projects/spark/sql/core/src/test/resources/cars.csv", "header" -> "true"))
df2.printSchema()
val df4 = df2.withColumn("year2", 'year.cast("Int")).select('year2 as 'year, 'make, 'model, 'comment)
df4.printSchema

senthilkumarP · ‎11-27-2016

Hi,

Its not working for me ..Unable to cast the column data type.Here is my code.I am getting only null values.Thanks!

val df4 = businessDF.withColumn("Estimated Gross Loss1", $"Estimated Gross Loss".cast("Int")).select($"Estimated Gross Loss1" as "Estimated Gross Loss")

srinibandaru_bi · ‎05-15-2017

Nice article. I am looking for a solution, Instead of specifying a each column separately. is there any way we can dynamically handle the datatype. Lets say in my dataframe i have 50 columns out of 8 are decimals and need to convert all decimal datatype to double. Without specify a column name can we do that directly?

HiveLoadSpark · ‎09-30-2019

I am also looking for something like this. I need to convert all my date datatypes to varchar in a dataframe having more than 300 columns.

Any suggestions??

Cloudera Community

Community Articles

How to change column Type in SparkSQL?

Apache Spark

Apache Zeppelin

Re: How to change column Type in SparkSQL?

Re: How to change column Type in SparkSQL?

Re: How to change column Type in SparkSQL?