rename columns of the dataframe

Hi, I have a dataframe (loaded from a CSV file) where the inferred schema took the column names from the file. I am trying to remove the white spaces from the column names, because otherwise the DF cannot be saved as a parquet file, and I did not find any useful method for renaming.

 

The method withColumnRenamed("Company ID","Company_ID") works, but I need to repeat it for every column in the dataframe. I tried to use the toDF method, such as:

 

val dfnew = df.toDF( df.columns.map( a => a.replace(" ","_") ) );

but it failed.

 

Any ideas?
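
A note on the toDF attempt above: a likely reason it fails is that toDF takes its column names as varargs (String*), while df.columns.map(...) yields an Array[String], so the array has to be expanded with `: _*`. A minimal sketch, assuming df is the DataFrame loaded from the CSV; the object and helper names (ColumnNames, withCleanColumnNames) are made up for illustration:

```scala
import org.apache.spark.sql.DataFrame

object ColumnNames {
  // Hypothetical helper: returns a copy of df in which every space
  // in a column name is replaced by an underscore.
  def withCleanColumnNames(df: DataFrame): DataFrame = {
    val cleaned: Array[String] = df.columns.map(_.replace(" ", "_"))
    // toDF(colNames: String*) expects varargs, so expand the array with `: _*`
    df.toDF(cleaned: _*)
  }
}

// Usage, assuming df is already loaded:
// val dfnew = ColumnNames.withCleanColumnNames(df)
```

This renames all columns in one pass, without a withColumnRenamed call per column.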

 

1 ACCEPTED SOLUTION

I have found a solution to this:

 

df.registerTempTable("tmp");

val newdf = sqlContext.sql(""" select  'Company ID' as Company_ID, 'Product ID' as Product_ID, .. from tmp""");

newdf.saveAsParquetFile(...);

 

T.
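
With many columns, the select list does not have to be typed by hand; it can be generated from the column names. A minimal sketch, assuming names containing whitespace are wrapped in backticks in the query and that no column name itself contains a backtick; the object and function names (RenameSql, renameSelect) are made up for illustration:

```scala
object RenameSql {
  // Builds "select `Company ID` as Company_ID, ... from <table>"
  // for the given column names, replacing spaces with underscores in the aliases.
  def renameSelect(columns: Seq[String], table: String): String = {
    val selectList = columns.map { c =>
      val alias = c.replace(" ", "_")
      s"`$c` as $alias"
    }.mkString(", ")
    s"select $selectList from $table"
  }

  def main(args: Array[String]): Unit = {
    // prints: select `Company ID` as Company_ID, `Product ID` as Product_ID from tmp
    println(renameSelect(Seq("Company ID", "Product ID"), "tmp"))
  }
}
```

The resulting string could then be passed to the query call, for example sqlContext.sql(RenameSql.renameSelect(df.columns, "tmp")).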

 

2 REPLIES

Update: a column name that contains whitespace has to be enclosed in backticks (``), not single quotes, because Spark SQL reads a single-quoted name as a string literal. So the correct syntax is:
"""select `Company ID` as Company_ID, .... """