<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question rename columns of the dataframe in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36296#M15167</link>
    <description>&lt;P&gt;Hi, I have a DataFrame (loaded from CSV) where the inferred schema took the column names from the file. I am trying to remove the whitespace from the column names, because otherwise the DataFrame cannot be saved as a Parquet file, and I did not find any useful method for renaming.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The method withColumnRenamed("Company ID","Company_ID") works, but I need to repeat it for every column in the DataFrame. I tried to use the toDF method,&lt;/P&gt;
&lt;P&gt;such as:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;val dfnew = df.toDF( df.columns.map( a =&amp;gt; a.replace(" ","_") ) );&lt;/P&gt;
&lt;P&gt;but it failed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any ideas?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 19 Sep 2022 18:44:10 GMT</pubDate>
    <dc:creator>Tomas79</dc:creator>
    <dc:date>2022-09-19T18:44:10Z</dc:date>
    <item>
      <title>rename columns of the dataframe</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36296#M15167</link>
      <description>&lt;P&gt;Hi, I have a DataFrame (loaded from CSV) where the inferred schema took the column names from the file. I am trying to remove the whitespace from the column names, because otherwise the DataFrame cannot be saved as a Parquet file, and I did not find any useful method for renaming.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The method withColumnRenamed("Company ID","Company_ID") works, but I need to repeat it for every column in the DataFrame. I tried to use the toDF method,&lt;/P&gt;
&lt;P&gt;such as:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;val dfnew = df.toDF( df.columns.map( a =&amp;gt; a.replace(" ","_") ) );&lt;/P&gt;
&lt;P&gt;but it failed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any ideas?&lt;/P&gt;
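&lt;P&gt;Editorial note: a likely reason the toDF call above fails is that toDF is declared with a varargs parameter (colNames: String*), so an Array[String] cannot be passed as a single argument. The following is a minimal sketch, assuming df is an org.apache.spark.sql.DataFrame; the helper names cleanNames and withCleanNames are hypothetical and do not come from the thread.&lt;/P&gt;

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper (not from the thread): replace every space in each
// column name with an underscore.
def cleanNames(columns: Array[String]): Array[String] =
  columns.map(_.replace(" ", "_"))

// toDF expects varargs (String*), so the array of new names has to be
// expanded with `: _*` instead of being passed as one Array argument.
def withCleanNames(df: DataFrame): DataFrame =
  df.toDF(cleanNames(df.columns): _*)
```

&lt;P&gt;Under these assumptions, val dfnew = withCleanNames(df) would rename all columns in one step, without repeating withColumnRenamed for each column.&lt;/P&gt;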
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 19 Sep 2022 18:44:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36296#M15167</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2022-09-19T18:44:10Z</dc:date>
    </item>
    <item>
      <title>Re: rename columns of the dataframe</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36297#M15168</link>
      <description>&lt;P&gt;I have found a solution to this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;df.registerTempTable("tmp");&lt;/P&gt;&lt;P&gt;val newdf = sqlContext.sql(""" select 'Company ID' as Company_ID, 'Product ID' as Product_ID, .. from tmp""");&lt;/P&gt;&lt;P&gt;newdf.saveAsParquetFile(...);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;T.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jan 2016 15:00:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36297#M15168</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2016-01-15T15:00:37Z</dc:date>
    </item>
    <item>
      <title>Re: rename columns of the dataframe</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36299#M15169</link>
      <description>Update: a column name that contains whitespace has to be enclosed in backticks (``), not single quotes. So the correct syntax is:&lt;BR /&gt;"""select `Company ID` as Company_ID, .... """&lt;BR /&gt;</description>
      <pubDate>Fri, 15 Jan 2016 15:19:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/rename-columns-of-the-dataframe/m-p/36299#M15169</guid>
      <dc:creator>Tomas79</dc:creator>
      <dc:date>2016-01-15T15:19:12Z</dc:date>
    </item>
  </channel>
</rss>

