How to save all the output of pyspark sql query into a text file or any file

Explorer

Hello community,

The PySpark query below produces the output shown in the attached screenshot (83554-spark.png).

The pyspark query is as follows:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
myresults = spark.sql("""SELECT
  FirstName
 ,LastName
 ,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()

Can someone show me how to save the results to a text or CSV file (or any other file, please)?

Thanks, Carlton

1 ACCEPTED SOLUTION

@Carlton Patterson

You can use this to write the whole DataFrame to a single file:

myresults.coalesce(1).write.csv("/tmp/myresults.csv")

HTH
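
A note on what the line above produces: even with coalesce(1), Spark creates a directory named /tmp/myresults.csv and puts a single part-* file inside it, next to a _SUCCESS marker. If a plain file with a chosen name is needed, something like the following minimal sketch could copy that part file out. It assumes the output directory is on the local filesystem (not HDFS); the function name collect_single_part and the target path are hypothetical, picked only for illustration.

```python
import glob
import os
import shutil


def collect_single_part(spark_output_dir, target_file):
    """Copy the lone part file from a Spark output directory to target_file."""
    # Spark names its data files part-<number>-...; after coalesce(1) there
    # should be exactly one. The _SUCCESS marker and hidden .crc files do not
    # match this pattern, so they are ignored.
    parts = glob.glob(os.path.join(spark_output_dir, "part-*"))
    if len(parts) != 1:
        raise ValueError(
            "expected exactly one part file in %r, found %d"
            % (spark_output_dir, len(parts))
        )
    shutil.copyfile(parts[0], target_file)
    return target_file
```

For example, collect_single_part("/tmp/myresults.csv", "/tmp/myresults_single.csv") would leave the Spark directory untouched and place a copy of the data in the second path.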


6 REPLIES 6


Explorer

Felix, thank you so much. It worked like a dream.

Explorer

Is there a way to get the results with the header info?

Expert Contributor

@Carlton Patterson

myresults.coalesce(1).write.format('csv').save("/tmp/myresults.csv", header='true')
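
One thing to watch when rerunning the line above: the writer's default save mode raises an error if the target path already exists. The sketch below wraps the same write in a small helper that also sets the mode to overwrite. The helper name save_csv_with_header is hypothetical, and replacing earlier output is an assumption about what is wanted here, not something the thread states.

```python
def save_csv_with_header(df, path):
    """Write df as CSV with a header row, replacing any earlier output at path."""
    (
        df.coalesce(1)              # one partition, so one part file
        .write.mode("overwrite")    # replace the output directory if it exists
        .option("header", "true")   # first line of the file holds column names
        .csv(path)
    )
```

With the DataFrame from the question, the call would be save_csv_with_header(myresults, "/tmp/myresults.csv").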

New Contributor

I am getting a permission denied error:

Permission denied: user=cldraproc, access=WRITE, inode="/":yarn:supergroup:drwxr-xr-x

df.coalesce(1).write.format('csv').save("/home/cldraproc/shobhit/ccorp.csv", header='true')

Community Manager

@shubh, as this is an older post that was marked solved in 2018, you would have a better chance of receiving a resolution by starting a new thread. A new thread would also let you provide details specific to your environment, which could help others give a more accurate answer to your question.



Regards,

Vidya Sargur,
Community Manager

