How to save all the output of pyspark sql query into a text file or any file

Explorer

Hello community,

The PySpark query below produces the following output:

[Image: 83554-spark.png]

The pyspark query is as follows:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
myresults = spark.sql("""SELECT
  FirstName
 ,LastName
 ,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()

Can someone show me how to save the results to a text or CSV file (or any other file type), please?

Thanks,
Carlton

1 ACCEPTED SOLUTION

@Carlton Patterson

You can use this to write the whole DataFrame out as a single part file:

myresults.coalesce(1).write.csv("/tmp/myresults.csv")
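
One thing worth knowing about the line above: Spark treats the path as a directory, so /tmp/myresults.csv ends up as a folder holding one part-*.csv file plus marker files such as _SUCCESS. If you need a plain file with a name of your choosing, a small helper along these lines might do. It assumes the output landed on the local filesystem rather than HDFS, and the function name collect_single_csv and the target path are made up for illustration; treat it as a rough sketch, not part of Spark's API.

```python
import glob
import os
import shutil


def collect_single_csv(spark_output_dir, target_file):
    """Move the lone part file out of a Spark output directory.

    Assumes the directory was written with coalesce(1), so it holds
    exactly one part-*.csv file (plus markers such as _SUCCESS).
    """
    part_files = glob.glob(os.path.join(spark_output_dir, "part-*.csv"))
    if len(part_files) != 1:
        raise ValueError(
            "expected exactly one part file, found %d" % len(part_files)
        )
    # Move the part file to the requested name and hand the path back.
    shutil.move(part_files[0], target_file)
    return target_file
```

For example, after the write has finished, collect_single_csv("/tmp/myresults.csv", "/tmp/myresults_single.csv") would leave the data in /tmp/myresults_single.csv (a hypothetical target name). The helper raises an error instead of guessing if it finds zero or several part files.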

HTH

6 REPLIES

Explorer

Felix, thank you so much. It worked like a dream.

Explorer

Is there a way to get the results with the header info?

Expert Contributor

@Carlton Patterson

myresults.coalesce(1).write.format('csv').save("/tmp/myresults.csv", header='true')

New Contributor

I am getting a permission denied error:

Permission denied: user=cldraproc, access=WRITE, inode="/":yarn:supergroup:drwxr-xr-x

This is the command I ran:

df.coalesce(1).write.format('csv').save("/home/cldraproc/shobhit/ccorp.csv", header='true')

Community Manager

@shubh As this is an older post that was marked solved in 2018, you would have a better chance of getting a resolution by starting a new thread. A new thread also lets you include details specific to your environment, which would help others give a more accurate answer to your question.



Regards,

Vidya Sargur,
Community Manager

