10-01-2018 03:26 AM - last edited on 10-05-2018 08:50 AM by cjervis
I am trying to import data into HDFS using a free-form query. My data cannot be split on any of the available columns because the values are redundant, so I have used ROW_NUMBER() to give each record a unique value.
But when I use it in the query, I get an error in some situations while others work. I assume some tweaking is needed, and I would appreciate any help. A simplified version of the scenario is below.
Working situation:
"select * from (select row_number() over (order by column1) as rn, column1, column2 from table1) base" --split-by rn
Failing situation (I don't want "rn" to be populated):
"select column1, column2 from (select row_number() over (order by column1) as rn, column1, column2 from table1) base" --split-by rn
P.S.: I don't want the "rn" column to be populated in the HDFS file, because a downstream consumption process would throw an error on it. Any help would be appreciated.
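For reference, one possible workaround for the failing case is sketched below. The error message is not shown above, so this rests on an assumption: by default Sqoop computes the split range by selecting MIN and MAX of the --split-by column from the whole free-form query, and that fails when rn is not in the outer select list. Supplying the range explicitly with --boundary-query avoids that step, while the per-split conditions substituted for $CONDITIONS can still reference rn from the subquery. The JDBC URL, user name, target directory, and mapper count are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch only: connection details and target directory are hypothetical placeholders.
JDBC_URL='jdbc:mysql://dbhost/mydb'
DB_USER='myuser'
TARGET_DIR='/user/myuser/table1_import'

# rn stays inside the subquery: the split conditions that Sqoop substitutes
# for $CONDITIONS can filter on it, but it is not part of the output columns.
# Single quotes keep the shell from expanding $CONDITIONS.
QUERY='SELECT column1, column2 FROM (SELECT ROW_NUMBER() OVER (ORDER BY column1) AS rn, column1, column2 FROM table1) base WHERE $CONDITIONS'

# ROW_NUMBER() runs from 1 to the row count, so the lower and upper split
# bounds can be returned without selecting rn from the outer query.
BOUNDARY_QUERY='SELECT 1, COUNT(*) FROM table1'

run_import() {
  sqoop import \
    --connect "$JDBC_URL" \
    --username "$DB_USER" -P \
    --query "$QUERY" \
    --boundary-query "$BOUNDARY_QUERY" \
    --split-by rn \
    --target-dir "$TARGET_DIR" \
    -m 4
}

# Call run_import once the placeholders above have been replaced.
```

Each mapper runs the query separately, so rows that tie on the ORDER BY column may be numbered differently from one mapper to the next. Ordering by every selected column should make such ties harmless, since the tied rows are then identical.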
10-01-2018 08:06 AM