Member since: 07-18-2017
Posts: 15
Kudos Received: 0
Solutions: 1
My Accepted Solutions

Title | Views | Posted
---|---|---
 | 2060 | 08-05-2018 06:56 AM
08-05-2018 06:56 AM
Solved it. I noticed that the numbers written to PostgreSQL were correct only when I read the Parquet data with the second option below:

parquet("/user-data/xyz/input/TABLE/*")             // WRONG numbers in PostgreSQL
parquet("/user-data/xyz/input/TABLE/evnt_month=*")  // Correct numbers in PostgreSQL

If someone is aware of this problem, please comment.
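A minimal sketch of one plausible cause of the difference (plain Python for illustration, not the original Spark job; the directory entries below are hypothetical): a partitioned table directory often contains extra entries besides the `evnt_month=...` partition directories, which a bare `*` glob matches but an `evnt_month=*` glob does not.

```python
# Illustration: why the two path patterns can select different file sets.
# The directory listing here is a made-up example of what a partitioned
# Parquet table directory might contain.
import fnmatch

# Hypothetical listing of /user-data/xyz/input/TABLE/
entries = [
    "evnt_month=2018-06",   # real partition directory
    "evnt_month=2018-07",   # real partition directory
    "_SUCCESS",             # job marker file
    "_temporary",           # leftover from an aborted write
]

# Pattern 1: "*" matches every entry, including non-partition residue.
broad = [e for e in entries if fnmatch.fnmatch(e, "*")]

# Pattern 2: "evnt_month=*" matches only the partition directories.
partitions_only = [e for e in entries if fnmatch.fnmatch(e, "evnt_month=*")]

print(broad)
print(partitions_only)
```

Whether this is the actual root cause depends on the table layout; Spark normally skips files starting with `_`, but stale or duplicated data under the table root would be picked up by the broad pattern just the same.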
11-19-2017 07:31 PM
It's probably a Spark configuration issue. Can you share the detailed log? The information you've shared isn't enough to identify the root cause.
11-15-2017 05:31 AM
@Matt Burgess, this issue was resolved by downloading the HDF version of NiFi 1.2.0. Thanks.
11-06-2017 01:35 PM
@Team Spark Your TEMP_TAB table has 3 columns, but your insert query produces 4 (the `*` expands to the 3 columns from TEMP_TAB, and substr(mytime,0,10) adds one more). The query below will work for your case:

FROM TEMP_TAB
INSERT OVERWRITE TABLE main_TAB
PARTITION (mytime)
SELECT id, age, substr(mytime,0,10) AS mytime;

Note that with this insert you lose part of the mytime value, because the substring keeps only the date. For example, TEMP_TAB holds 2017-10-12 12:20:23, but main_TAB will only have 2017-10-12; the 12:20:23 time portion from TEMP_TAB is lost. If you don't want to lose that data, create main_TAB with 4 columns, using dt as the partition column:

CREATE TABLE IF NOT EXISTS main_TAB (id int, mytime STRING, age int)
PARTITIONED BY (dt string)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB");

Then do the insert as below:

FROM TEMP_TAB
INSERT OVERWRITE TABLE main_TAB
PARTITION (dt)
SELECT *, substr(mytime,0,10) AS dt;

In this case the partition column is dt and you don't lose any TEMP_TAB data at all.
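The truncation described above can be sketched in a few lines (Python used for illustration; Hive's substr(mytime,0,10) keeps the first 10 characters of the string, which corresponds to Python's mytime[:10]):

```python
# Illustration of the data loss in the first insert: keeping only the first
# 10 characters of the timestamp string preserves the date and drops the time.
mytime = "2017-10-12 12:20:23"   # value as stored in TEMP_TAB

dt = mytime[:10]                 # Hive: substr(mytime, 0, 10)

print(dt)                        # the "12:20:23" portion is gone
```

This is why the second approach keeps mytime as a regular column and writes the truncated value into a separate dt partition column: the full timestamp survives in the data while the date still drives partitioning.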