Support Questions

zoro07500 · 03-02-2017

Hi guys, I am trying to save a DataFrame that contains a timestamp column to a CSV file. The problem is that this column changes format once it is written to the CSV file. When I display it with df.show, I get the correct format.

When I check the CSV file, I get a different format.

I also tried something like this and still got the same problem:

finalresult.coalesce(1).write.option("header",true).option("inferSchema","true").option("dateFormat","yyyy-MM-dd HH:mm:ss").csv("C:/mydata.csv")

val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()
val df = spark.read.option("header", true).option("inferSchema", "true").csv("C:/Users/mhattabi/Desktop/dataTest2.csv")
//val df = spark.read.option("header",true).option("inferSchema", "true").csv("C:\dataSet.csv\datasetTest.csv")
// convert all columns to numeric values in order to apply the aggregation function
df.columns.map { c => df.withColumn(c, col(c).cast("int")) }
// add a new column including the new timestamp column
val result2 = df.withColumn("new_time", ((unix_timestamp(col("time")) / 300).cast("long") * 300).cast("timestamp")).drop("time")
val finalresult = result2.groupBy("new_time").agg(result2.drop("new_time").columns.map((_ -> "mean")).toMap).sort("new_time") // agg(avg(all columns..)
finalresult.coalesce(1).write.option("header", true).option("inferSchema", "true").csv("C:/mydata.csv")

adnanalvee · 03-02-2017

A quick hack would be to use Scala's "substring".

http://alvinalexander.com/scala/scala-string-examples-collection-cheat-sheet

So what you can do is write a UDF, run the "new_time" column through it, and keep only the part of the timestamp you want. For example, if you want just "yyyy-MM-dd HH:mm" as seen when you run "df.show", your substring code will be

new_time.substring(0,16)

which will yield "2015-12-06 12:40" (the end index is exclusive, so 16 keeps the first 16 characters).

Pseudocode:
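As a quick check of the index arithmetic, here is a minimal plain-Scala sketch with no Spark involved. The object name `TimestampTrim`, the helper `toMinutePrecision`, and the sample value are hypothetical, and it assumes the timestamp has already been rendered as a string in the "yyyy-MM-dd HH:mm:ss" shape.

```scala
object TimestampTrim {
  // substring(begin, end) excludes the end index, so an end index of 16
  // keeps the 16 characters of "yyyy-MM-dd HH:mm".
  def toMinutePrecision(ts: String): String = ts.substring(0, 16)

  def main(args: Array[String]): Unit = {
    // Hypothetical sample value, assumed to look like a timestamp cast to string.
    println(toMinutePrecision("2015-12-06 12:40:00"))
  }
}
```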

import org.apache.spark.sql.functions.udf

// Keep only the first 16 characters, i.e. "yyyy-MM-dd HH:mm"
val getDateTimeSplit = udf((new_time: String) => new_time.substring(0, 16))
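If string slicing feels too fragile, two alternatives may be worth trying. The sketch below assumes Spark 2.1 or later, where the CSV writer is documented to accept a "timestampFormat" option (the "dateFormat" option used in the question applies to date columns, not timestamps). The sample DataFrame, the column name "avg_value", the helper `withFormattedTime`, and the relative output paths are hypothetical stand-ins for "finalresult" and "C:/mydata.csv".

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, date_format}

object WriteTimestampCsv {
  // Option 1: render the timestamp as a string in the desired pattern before writing.
  def withFormattedTime(df: DataFrame): DataFrame =
    df.withColumn("new_time", date_format(col("new_time"), "yyyy-MM-dd HH:mm:ss"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local").appName("my-spark-app").getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for "finalresult" from the question.
    val finalresult = Seq(("2015-12-06 12:40:00", 1.5))
      .toDF("new_time", "avg_value")
      .withColumn("new_time", col("new_time").cast("timestamp"))

    // Option 1: the column is already a formatted string when it reaches the writer.
    withFormattedTime(finalresult)
      .coalesce(1).write.option("header", true).csv("mydata_date_format")

    // Option 2 (assumes Spark 2.1+): let the CSV writer format timestamp columns.
    finalresult
      .coalesce(1).write.option("header", true)
      .option("timestampFormat", "yyyy-MM-dd HH:mm:ss")
      .csv("mydata_timestamp_format")

    spark.stop()
  }
}
```

Option 1 changes the column type to string, which is usually fine for a file that is only being exported. Option 2 keeps the column as a timestamp in the DataFrame and only affects how it is serialized.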

Cloudera Community

timestamp column changes of format in a csv file spark