Support Questions

barlow · ‎08-06-2018

Hello community,

The pyspark query below prints its results to the console:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
myresults = spark.sql("""SELECT
  FirstName
 ,LastName
 ,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()

Can someone show me how to save the results to a text or CSV file (or any other file format), please?

Thanks, Carlton

falbani · ‎08-06-2018

@Carlton Patterson

You can use this to write the whole DataFrame to a single file:

myresults.coalesce(1).write.csv("/tmp/myresults.csv")
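
Note that Spark treats /tmp/myresults.csv as a directory, not a file: it will hold one part-*.csv file (because of coalesce(1)) plus marker files such as _SUCCESS. If a single plain file is wanted, something like the sketch below could move the part file out afterwards. It assumes the output landed on the local filesystem rather than HDFS; the function name and the target path are made up for illustration.

```python
import glob
import os
import shutil


def extract_single_csv(spark_output_dir, target_file):
    """Move the single part file written by Spark to target_file.

    Assumes spark_output_dir is on the local filesystem and was written
    with coalesce(1), so it holds exactly one part-*.csv file.
    """
    part_files = glob.glob(os.path.join(spark_output_dir, "part-*.csv"))
    if len(part_files) != 1:
        raise ValueError(
            "expected exactly one part file, found %d" % len(part_files)
        )
    shutil.move(part_files[0], target_file)
    return target_file


# Hypothetical usage after the write above has finished:
# extract_single_csv("/tmp/myresults.csv", "/tmp/myresults_single.csv")
```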

HTH

barlow · ‎08-06-2018

Felix, thank you so much. It worked like a dream.

barlow · ‎08-06-2018

Is there a way to get the results with the header info?

mark_hadoop · ‎10-02-2018

@Carlton Patterson

myresults.coalesce(1).write.format('csv').save("/tmp/myresults.csv", header='true')
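
The same thing can also be spelled with the writer's option() and mode() calls. The sketch below wraps it in a small function (the function name is my own). mode("overwrite") is an addition that is not in the line above; it is included because a second run otherwise fails when the output directory already exists.

```python
def write_csv_with_header(df, path):
    """Write a Spark DataFrame as one CSV part file with a header row.

    df is expected to be a pyspark.sql.DataFrame, e.g. myresults.
    """
    (
        df.coalesce(1)              # one partition -> one part file
        .write
        .mode("overwrite")          # replace the output directory if present
        .option("header", "true")   # first row holds the column names
        .csv(path)
    )


# Usage with the DataFrame from the question:
# write_csv_with_header(myresults, "/tmp/myresults.csv")
```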

shubh · ‎06-12-2020

I am getting a permission denied error: Permission denied: user=cldraproc, access=WRITE, inode="/":yarn:supergroup:drwxr-xr-x

df.coalesce(1).write.format('csv').save("/home/cldraproc/shobhit/ccorp.csv", header='true')
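
The inode="/" and yarn:supergroup in that message suggest the path is being resolved on HDFS rather than on the local disk, so Spark is likely trying to create /home under the HDFS root, which cldraproc cannot write to. One thing that may be worth trying is a directory the user owns on HDFS. The sketch below assumes the conventional /user/<name> home directory exists for cldraproc, which the post does not confirm; the helper name is made up.

```python
import posixpath


def hdfs_home_path(user, relative_path):
    """Build a path under the user's HDFS home directory.

    Assumes the conventional /user/<name> layout; check with your
    cluster admin if home directories live elsewhere.
    """
    return posixpath.join("/user", user, relative_path)


target = hdfs_home_path("cldraproc", "shobhit/ccorp.csv")
print(target)

# With the DataFrame from the post:
# df.coalesce(1).write.format('csv').save(target, header='true')
```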

VidyaSargur · ‎06-15-2020

@shubh As this is an older post that was marked solved in 2018, you would have a better chance of receiving a resolution by starting a new thread. A new thread would also give you the opportunity to provide details specific to your environment, which could help others give a more accurate answer to your question.

Regards,

Vidya Sargur,
Community Manager

How to save all the output of pyspark sql query into a text file or any file