<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question How to save dataframe as text file in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144153#M35650</link>
    <description>&lt;P&gt;How can I save the data inside a DataFrame to a text file in CSV format in HDFS?&lt;/P&gt;&lt;P&gt;I tried the following, but csv doesn't seem to be a supported format:&lt;/P&gt;&lt;PRE&gt;df.write.format("csv").save("/filepath")&lt;/PRE&gt;</description>
    <pubDate>Sat, 23 Jul 2016 03:45:25 GMT</pubDate>
    <dc:creator>__anonymous__</dc:creator>
    <dc:date>2016-07-23T03:45:25Z</dc:date>
    <item>
      <title>How to save dataframe as text file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144153#M35650</link>
      <description>&lt;P&gt;How can I save the data inside a DataFrame to a text file in CSV format in HDFS?&lt;/P&gt;&lt;P&gt;I tried the following, but csv doesn't seem to be a supported format:&lt;/P&gt;&lt;PRE&gt;df.write.format("csv").save("/filepath")&lt;/PRE&gt;</description>
      <pubDate>Sat, 23 Jul 2016 03:45:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144153#M35650</guid>
      <dc:creator>__anonymous__</dc:creator>
      <dc:date>2016-07-23T03:45:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to save dataframe as text file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144154#M35651</link>
      <description>&lt;P&gt;The best way to save a DataFrame to a CSV file is to use the &lt;A href="https://github.com/databricks/spark-csv"&gt;spark-csv&lt;/A&gt; library provided by Databricks.&lt;/P&gt;&lt;P&gt;It supports almost all the features you are likely to need when working with CSV files. Start the shell with the package loaded:&lt;/P&gt;&lt;PRE&gt;spark-shell --packages com.databricks:spark-csv_2.10:1.4.0&lt;/PRE&gt;&lt;P&gt;Then use the library API to save to CSV files:&lt;/P&gt;&lt;PRE&gt;df.write.format("com.databricks.spark.csv").option("header", "true").save("file.csv")
&lt;/PRE&gt;&lt;P&gt;It also supports reading from CSV files with a similar API:&lt;/P&gt;&lt;PRE&gt;val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("file.csv")&lt;/PRE&gt;&lt;P&gt;You could also write custom code that builds each output line with mkString, but this is not safe if the data contains special characters, because it does no quoting or escaping of delimiters.&lt;/P&gt;&lt;PRE&gt;df.rdd.map(x =&amp;gt; x.mkString("|")).saveAsTextFile("file.csv")&lt;/PRE&gt;</description>
      <pubDate>Sat, 23 Jul 2016 03:54:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144154#M35651</guid>
      <dc:creator>qiwang</dc:creator>
      <dc:date>2016-07-23T03:54:41Z</dc:date>
    </item>
    <item>
      <title>Re: How to save dataframe as text file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144155#M35652</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3090/qiwang.html" nodeid="3090"&gt;@Qi Wang&lt;/A&gt; I don't think the Databricks CSV library is available in the exam.&lt;/P&gt;&lt;P&gt;Your approach with mkString() works well if no header is required in the output CSV file. Can I assume that is the case in the exam tasks?&lt;/P&gt;</description>
      <pubDate>Wed, 07 Dec 2016 02:42:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-save-dataframe-as-text-file/m-p/144155#M35652</guid>
      <dc:creator>stefan_frankenh</dc:creator>
      <dc:date>2016-12-07T02:42:06Z</dc:date>
    </item>
  </channel>
</rss>

