How to save all the output of pyspark sql query into a text file or any file

Explorer

Hello community,

The PySpark query below produces the following output:

[Image: 83554-spark.png]

The pyspark query is as follows:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
myresults = spark.sql("""SELECT
  FirstName
 ,LastName
 ,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()

Can someone show me how to save the results to a text or CSV file (or any other file format), please?

Thanks, Carlton

Accepted Solutions
Re: How to save all the output of pyspark sql query into a text file or any file

@Carlton Patterson

You can use this to write the whole DataFrame to a single file:

myresults.coalesce(1).write.csv("/tmp/myresults.csv")
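
One thing worth knowing about the line above: Spark treats /tmp/myresults.csv as a directory and writes a single part-*.csv file inside it, rather than creating a file with that exact name. If one plain file is needed, a minimal sketch along these lines may help. It assumes the output directory is on the local filesystem (local-mode Spark, not HDFS); write_single_csv, out_dir and final_path are hypothetical names, and the part-*.csv naming is what Spark 2.x produces for uncompressed CSV output.

```python
import glob
import os
import shutil


def write_single_csv(df, out_dir, final_path):
    """Write df as one CSV part file in out_dir, then copy it to final_path.

    Assumption: out_dir is on the local filesystem, so glob can see it.
    """
    # coalesce(1) collapses the result to one partition, so Spark
    # writes exactly one data file into the output directory.
    df.coalesce(1).write.mode("overwrite").csv(out_dir)

    # The data file is named part-<number>-<id>.csv; _SUCCESS and
    # checksum files in the same directory are ignored by this pattern.
    part_files = glob.glob(os.path.join(out_dir, "part-*.csv"))
    if len(part_files) != 1:
        raise RuntimeError("expected one part file, found %d" % len(part_files))

    shutil.copyfile(part_files[0], final_path)
    return final_path
```

With the DataFrame from the question, this could be called as write_single_csv(myresults, "/tmp/myresults_dir", "/tmp/myresults.csv"). Note that mode("overwrite") replaces out_dir if it already exists, which differs from the default behaviour of failing.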

HTH


Re: How to save all the output of pyspark sql query into a text file or any file

Explorer

Felix, thank you so much. It worked like a dream.

Re: How to save all the output of pyspark sql query into a text file or any file

Explorer

Is there a way to get the results with the header info?

Re: How to save all the output of pyspark sql query into a text file or any file

Expert Contributor

@Carlton Patterson

myresults.coalesce(1).write.format('csv').save("/tmp/myresults.csv", header='true')
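
For reference, the same header setting can also be passed through the writer's option() call. The sketch below is one way to wrap it; save_csv_with_header and out_dir are hypothetical names, and the "overwrite" mode is an assumption added so the call can be re-run without failing on an existing directory.

```python
def save_csv_with_header(df, out_dir):
    """Write df to out_dir as a single CSV part file with a header row."""
    (df.coalesce(1)                  # one partition, so one part file
       .write
       .option("header", "true")     # emit column names as the first line
       .mode("overwrite")            # replace out_dir if it already exists
       .csv(out_dir))
    return out_dir
```

With the DataFrame from the question, save_csv_with_header(myresults, "/tmp/myresults.csv") should produce a part file whose first line is the column names FirstName, LastName and JobTitle.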

Re: How to save all the output of pyspark sql query into a text file or any file

New Contributor

I am getting a permission denied error:

Permission denied: user=cldraproc, access=WRITE, inode="/":yarn:supergroup:drwxr-xr-x

df.coalesce(1).write.format('csv').save("/home/cldraproc/shobhit/ccorp.csv", header='true');
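
A possible reading of this error, offered as a guess rather than a diagnosis: the inode "/" owned by yarn:supergroup suggests the path is being resolved on HDFS rather than on the local disk, so Spark tries to create /home under the HDFS root, where cldraproc has no write access. One workaround to try is writing under the user's HDFS home directory instead. The /user/<name> layout is a common convention and an assumption here, and hdfs_home_path is a hypothetical helper.

```python
import posixpath


def hdfs_home_path(user, relative_path):
    """Build a path under the conventional HDFS home directory /user/<user>.

    Assumption: the cluster follows the usual /user/<name> layout and the
    user has write access to their own home directory.
    """
    return posixpath.join("/user", user, relative_path.lstrip("/"))


# Usage with the DataFrame from the post (df is assumed to exist):
# target = hdfs_home_path("cldraproc", "shobhit/ccorp.csv")
# df.coalesce(1).write.format("csv").save(target, header="true")
```

If the file really is meant to land on the local disk, prefixing the path with file:// is another option, though on a cluster the output would then be written on whichever node runs the task.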

Re: How to save all the output of pyspark sql query into a text file or any file

Community Manager

@shubh As this is an older post that was marked solved in 2018, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to provide details specific to your environment that could help others give a more accurate answer to your question.


Vidya Sargur, Community Manager
