Created 07-27-2018 04:33 PM
We have created a Spark application for client reporting. The client wants the report in CSV format. We have coded it that way, and it generates the desired output in the requested format. When we look at the result data in the log, it shows the correct format and correct data (i.e. the requested date format is 2018-07-26 11:19:04.0, and that is what the log shows), but when we look at the same data in the CSV file, the format has changed: it shows dates like 6/7/2018 12:27. Why does this happen with the CSV file? The log shows the correct results, and we write the same data to the CSV file with a file write command, yet the format appears changed. How can we resolve this?
Sample Code:
val selectedData = dataFrame3.select(concat(col("ticket_number"),lit("-"),date_format(col("as_of_date"),"yyMMdd")).as("transref"),col("newmCanc").as("newmCanc"),
when(col("trade_action") === "CXL",concat(col("master_ticket_num"),lit("-"),date_format(col("as_of_date"),"yyMMdd"))).otherwise("").as("relTransref"), col("trader_name").as("portfolioIdAm"), col("portfolioIdKvg"),
col("name").as("portfolioName"), when(col("buy_sell_desc") === "Buy", "BUY").when(col("buy_sell_desc") === "Sell", "SELL").otherwise("OTHER").as("buyisell"),
col("trade_feed_trade_amount").as("quantity"),
col("secIdType"), col("id_isin").as("secId"), when(col("instrument_name").isNotNull,col("instrument_name")).otherwise(col("security_name")).as("secName"),
format_number(col("trade_price").cast("Double"),2).as("price"), col("currency").as("tradeCCY"),
format_number(col("settlement_costs_in_settlement_currency").cast("Double"),2).as("tradeComm"), format_number(col("Transaction_Cost_2_Amount").cast("Double"),2).as("fees"),
format_number(col("Transaction_Cost_3_Amount").cast("Double"),2).as("tax"),
format_number(col("Transaction_Cost_5_Amount").cast("Double"),2).as("others"), format_number(col("Accrued_Interest"),2).as("interest"),
format_number(col("settlement_total_in_settlement_currency").cast("Double"),2).as("settlAmount"),
when(col("Number_of_days_accrued_interest").isNull, "0").otherwise(col("Number_of_days_accrued_interest")).as("interestDays"),
date_format(col("as_of_date"),"yyyy-MM-dd").cast("String").as("tradeDate"),
date_format(col("receiveddate"),"yyyy-MM-dd HH:mm:ss").cast("String").as("executionTimestamp"),
date_format(col("settlement_date"), "yyyy-MM-dd").cast("String").as("settlementDate"))
Created 07-30-2018 10:28 AM
Can somebody please help? I am completely stuck.
Created 07-30-2018 01:51 PM
@HDave I see you are casting the dates to strings, so from this code alone it is hard to say why this is happening.
To help, could you post a simplified version of the code that reproduces the problem, and say which HDP version you are running? That way we will understand not only how the dataframe was populated but also how you are saving it.
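In the meantime, one way to narrow this down is to look at the raw text of the CSV file rather than opening it in a viewer. The sketch below is only an illustration in plain Scala, with no Spark involved: the object name `CsvRawCheck`, its helper names, and the temporary file are all hypothetical, and the pattern is copied from the `executionTimestamp` column in the post. It writes one formatted timestamp to a file and reads the lines back unparsed. If the raw text of the real report still shows `yyyy-MM-dd HH:mm:ss` values, the change is probably happening in the program used to open the file (spreadsheet applications commonly redisplay date-like text in a locale format) rather than in the write step.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import java.sql.Timestamp
import java.text.SimpleDateFormat
// JavaConverters works on Scala 2.11/2.12 (it is deprecated, but still present, in 2.13).
import scala.collection.JavaConverters._

// Hypothetical helper object, not part of the original application.
object CsvRawCheck {
  // Format a timestamp with the same pattern the post uses for executionTimestamp.
  def formatTs(ts: Timestamp): String =
    new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(ts)

  // Write a header line and one value line as plain UTF-8 text.
  def writeSample(path: Path, value: String): Unit =
    Files.write(path, Seq("executionTimestamp", value).asJava, StandardCharsets.UTF_8)

  // Read the file back line by line without any CSV or date parsing.
  def readRaw(path: Path): List[String] =
    Files.readAllLines(path, StandardCharsets.UTF_8).asScala.toList

  def main(args: Array[String]): Unit = {
    // Assumption: a temp file stands in for the real report path.
    val path = Files.createTempFile("report-sample", ".csv")
    writeSample(path, formatTs(Timestamp.valueOf("2018-07-26 11:19:04")))
    // Print exactly what is stored on disk.
    readRaw(path).foreach(println)
  }
}
```

To check the real report, point `readRaw` at the generated CSV file (or open it in a plain text editor) and compare the date values with what the log shows.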