Created on 08-06-2018 11:32 AM - edited 08-17-2019 09:58 PM
Hello community,
The pyspark query below runs and prints its results to the console:
#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()

df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv', inferSchema=True, header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')

myresults = spark.sql("""SELECT FirstName
    ,LastName
    ,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()
Can someone show me how to save the results to a text or CSV file (or any other file format), please?
Thanks,
Carlton
Created 08-06-2018 01:21 PM
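You can use this to write the whole DataFrame out as a single file. Note that Spark creates the given path as a directory and writes one part-*.csv file inside it: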
myresults.coalesce(1).write.csv("/tmp/myresults.csv")
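If you need one file with an exact name, a rough sketch like the one below can move the single part file out of Spark's output directory afterwards. It assumes Spark wrote to the local filesystem (not HDFS); the helper name promote_part_file and the paths in the usage comment are made up for illustration.

```python
import glob
import os
import shutil


def promote_part_file(spark_output_dir, target_file):
    """Move the single part-* file that Spark wrote into spark_output_dir to target_file.

    Hypothetical helper: assumes a local filesystem and an output written
    with coalesce(1), so exactly one part file exists.
    """
    # "part-*" does not match the hidden .crc files or the _SUCCESS marker.
    part_files = glob.glob(os.path.join(spark_output_dir, "part-*"))
    if len(part_files) != 1:
        raise ValueError("expected exactly one part file, found %d" % len(part_files))
    shutil.move(part_files[0], target_file)
    return target_file


# Possible usage after the write (illustrative paths):
# myresults.coalesce(1).write.csv("/tmp/myresults_dir")
# promote_part_file("/tmp/myresults_dir", "/tmp/myresults.csv")
```

The helper raises an error rather than guessing if it finds zero or several part files, which would be the case if the DataFrame was not coalesced to one partition.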
HTH
Created 08-06-2018 08:56 PM
Felix, thank you so much. It worked like a dream.
Created 08-06-2018 09:02 PM
Is there a way to get the results with the header info?
Created 10-02-2018 11:40 AM
myresults.coalesce(1).write.format('csv').save("/tmp/myresults.csv", header='true')
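For a small result set, a different route (not the Spark writer shown above) is to collect the rows to the driver and write them with Python's csv module, which gives a single named file with a header line. This is only a sketch: the helper name write_rows_with_header and the output path in the usage comment are made up, and it assumes the results fit comfortably in driver memory.

```python
import csv


def write_rows_with_header(columns, rows, path):
    """Write a header line followed by the data rows to one CSV file.

    Hypothetical helper: columns is a list of column names, rows is an
    iterable of tuples (pyspark Row objects are tuples, so they work too).
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)   # header line
        writer.writerows(rows)     # one line per result row


# Possible usage with the DataFrame from the question (illustrative path):
# write_rows_with_header(myresults.columns, myresults.collect(), "/tmp/myresults_local.csv")
```

For large results the Spark writer is the better choice, since collect() pulls every row onto the driver.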