How to save all the output of pyspark sql query into a text file or any file

Explorer

Hello community,

The PySpark query below produces the following output:

[Image: 83554-spark.png]

The pyspark query is as follows:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
myresults = spark.sql("""SELECT
  FirstName
 ,LastName
 ,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()

Can someone show me how to save the results to a text or CSV file (or any other file format), please?

Thanks, Carlton

1 ACCEPTED SOLUTION


@Carlton Patterson

You can use this to write the whole DataFrame to a single file:

myresults.coalesce(1).write.csv("/tmp/myresults.csv")
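One detail worth knowing: Spark treats the path passed to write.csv as a directory, so /tmp/myresults.csv ends up as a folder holding one part-* file plus marker files such as _SUCCESS. If a plain single file is needed, a small helper along these lines could copy that part file out. This is only a sketch: the function name promote_single_part_file and the target path are hypothetical, and it assumes the output landed on the local filesystem rather than HDFS.

```python
import glob
import os
import shutil


def promote_single_part_file(spark_output_dir, target_file):
    """Copy the lone part-* file from a Spark output directory to target_file.

    Hypothetical helper; assumes spark_output_dir is on the local filesystem.
    """
    # Data files are named part-<n>-<uuid>.csv; _SUCCESS and hidden .crc
    # files do not match this pattern and are skipped.
    part_files = sorted(glob.glob(os.path.join(spark_output_dir, "part-*")))
    if len(part_files) != 1:
        raise ValueError(
            "expected exactly one part file, found %d" % len(part_files)
        )
    shutil.copyfile(part_files[0], target_file)
    return target_file


# Usage after the write above (target path is an assumption):
# promote_single_part_file("/tmp/myresults.csv", "/tmp/myresults_single.csv")
```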

HTH


6 REPLIES


Explorer

Felix, thank you so much. It worked like a dream.

Explorer

Is there a way to get the results with the header info?

Expert Contributor

@Carlton Patterson

myresults.coalesce(1).write.format('csv').save("/tmp/myresults.csv", header='true')
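The same write can also be spelled with the option and csv methods of the DataFrame writer. The sketch below wraps that in a hypothetical helper, write_csv_with_header (the name is not from this thread). It additionally sets mode("overwrite"), which is an assumption about what is wanted here: by default Spark raises an error when the output path already exists, so re-running the write against /tmp/myresults.csv would otherwise fail.

```python
def write_csv_with_header(df, path):
    """Write df as a single CSV part file with a header row.

    Hypothetical helper; df is expected to be a pyspark.sql.DataFrame.
    """
    (
        df.coalesce(1)              # one partition, so one part file
        .write.mode("overwrite")    # assumption: replacing old output is OK
        .option("header", "true")   # emit column names as the first row
        .csv(path)                  # path becomes a directory of part files
    )


# Usage with the DataFrame from the question:
# write_csv_with_header(myresults, "/tmp/myresults.csv")
```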

New Contributor

I'm getting a permission denied error: Permission denied: user=cldraproc, access=WRITE, inode="/":yarn:supergroup:drwxr-xr-x

df.coalesce(1).write.format('csv').save("/home/cldraproc/shobhit/ccorp.csv", header='true')
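A possible reading of this error, offered as a guess rather than a diagnosis: the inode="/":yarn:supergroup part looks like an HDFS permission check, which would mean the scheme-less path was resolved against the cluster's default filesystem (HDFS) instead of the local disk. If so, one option is to write to an HDFS directory the user already owns. Another is to give an explicit file:// URI, as in the sketch below; the helper name as_local_file_uri is hypothetical. Note that on a YARN cluster a file:// write lands on whichever node runs the task, which may not be the machine the job was submitted from.

```python
def as_local_file_uri(path):
    """Prefix an absolute local path with file:// so it is not resolved on HDFS.

    Hypothetical helper for illustration only.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path, got %r" % path)
    return "file://" + path


# Usage with the write from the post above:
# df.coalesce(1).write.format('csv').save(
#     as_local_file_uri("/home/cldraproc/shobhit/ccorp.csv"), header='true')
```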

Community Manager

@shubh, as this is an older post that was marked solved in 2018, you would have a better chance of receiving a resolution by starting a new thread. A new thread would also let you include details specific to your environment, which could help others give a more accurate answer to your question.



Regards,

Vidya Sargur,
Community Manager

