Support Questions

barlow · 08-13-2018

Hello community,

I have created the following pyspark query:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
counts = spark.sql("""SELECT
FirstName
,LastName
,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
counts.coalesce(1).write.csv("/home/packt/Downloads/myresults3.csv")

I would like to add the current date and time to the name of the output file, myresults3.

I think the code would look something like the following:

counts.coalesce(1).write.csvCONCAT("/home/packt/Downloads/'myresults3'-CURRENTDATE.csv")

I'm sure I'm way off the mark with the above attempt, but I'm sure you can see what I'm trying to achieve.

Any help will be appreciated.

Cheers

Carlton

sandyy006 · 08-13-2018

@Carlton Patterson

You can use mode("append") to append the new data to the existing output, and build the path by concatenating the date string into it:

counts.coalesce(1).write.mode("append").csv("/home/packt/Downloads/myresults7-"+currentdate+".csv")
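Note that the line above uses currentdate without defining it. A minimal sketch of one way to build it, assuming Python's standard datetime module and a timestamp format chosen here for illustration (%Y-%m-%d_%H-%M-%S). The helper name dated_path is hypothetical and not part of PySpark; the Spark call is left as a comment so the snippet runs without a cluster:

```python
from datetime import datetime


def dated_path(prefix, when=None, fmt="%Y-%m-%d_%H-%M-%S"):
    """Build '<prefix>-<timestamp>.csv' (hypothetical helper, not a PySpark API).

    The default format avoids ':' because colons can cause trouble in
    Hadoop-style paths; that choice is an assumption, adjust as needed.
    """
    if when is None:
        when = datetime.now()  # current local date and time
    return prefix + "-" + when.strftime(fmt) + ".csv"


# Equivalent to the string concatenation in the answer above
currentdate = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
path = "/home/packt/Downloads/myresults7-" + currentdate + ".csv"

# With the `counts` DataFrame from the question, the write would then be:
# counts.coalesce(1).write.mode("append").csv(path)
```

Keep in mind that Spark's csv() writer creates a directory at that path containing one or more part files, so the date ends up in the directory name rather than in a single file's name.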

P.S. Please use 'reply' on this comment instead of writing a new comment, so that we can keep the conversation in order.

barlow · 08-13-2018

Hi Sandeep, thanks. It works very well. Thank you.

sandyy006 · 08-13-2018

@Carlton Patterson Glad it helped. Please click 'Accept' on my answer and mark this thread as closed.

sandyy006 · 08-13-2018

@Carlton Patterson Looks like you have accepted another comment. I've converted this reply to a comment, and it should be the correct one to accept, as it helped resolve your issue. 🙂

Cloudera Community

How to concatenate a date to a filename in pyspark