New Contributor
Posts: 1
Registered: ‎10-01-2018

Sqoop Throwing Invalid ColumnName for the Derived Column when used in split-by

Hello All,

I am trying to import data into HDFS with Sqoop using a free-form query. My data cannot be split on any of the available columns because their values are redundant, so I have used ROW_NUMBER() to give each record a unique value.

But when I use it in the query, I get an error in one situation while the other works. I know some sort of tweaking must be needed, and I request anyone's help with this. A mock-up of the scenario is given below.

Scenario:

Working Situation:

"select * from (select row_number() over (order by column1) as rn, column1, column2 from table1) base" --split-by rn

Failing Situation: (I don't want "rn" to be populated)

"select column1, column2 from (select row_number() over (order by column1) as rn, column1, column2 from table1) base" --split-by rn

P.S.: I don't want the "rn" column to be written to the HDFS file, because I have a downstream consumption process that would throw an error on it. Any help would be appreciated.

Master
Posts: 305
Registered: ‎07-01-2015

Re: Sqoop Throwing Invalid ColumnName for the Derived Column when used in split-by

Invalid column name is a syntax error, probably raised by your DB engine. You have to be specific and paste the query and logs; otherwise it's very hard to help.
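
Without the logs this is only a guess, but one plausible cause can be reproduced in isolation. The sketch below assumes that, when no boundary query is supplied, the split boundaries are computed by wrapping the free-form query as SELECT MIN(rn), MAX(rn) FROM (<query>) AS t1. If the outer select list drops "rn", that wrapper can no longer see the column. Everything here is a stand-in: an in-memory SQLite database replaces the real source DB, the three sample rows are made up, and boundary_query is a hypothetical helper, not a Sqoop API.

```python
import sqlite3

# Hypothetical stand-in for table1; SQLite replaces the real source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "z")],
)

INNER = ("select row_number() over (order by column1) as rn, "
         "column1, column2 from table1")
WORKING = f"select * from ({INNER}) base"
FAILING = f"select column1, column2 from ({INNER}) base"


def boundary_query(query, split_col="rn"):
    # Assumed shape of the min/max query derived from the split column.
    return f"SELECT MIN({split_col}), MAX({split_col}) FROM ({query}) AS t1"


def try_bounds(query):
    # Run the wrapped query and report either the bounds or the DB error.
    try:
        return conn.execute(boundary_query(query)).fetchone()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


working_result = try_bounds(WORKING)   # rn is projected, so the wrapper sees it
failing_result = try_bounds(FAILING)   # rn is not projected by the outer select

# Bounds computed directly against the inner query, bypassing the wrapper.
explicit_bounds = conn.execute(
    f"select min(rn), max(rn) from ({INNER}) base"
).fetchone()

# A split range applied in the outer WHERE still works, because base.rn
# is visible there even though it is not in the select list.
split_rows = conn.execute(f"{FAILING} where rn >= 1 and rn < 3").fetchall()

print(working_result)
print(failing_result)
print(explicit_bounds)
print(split_rows)
```

If this is what is happening in your job, Sqoop's --boundary-query option, which lets you supply the min/max query yourself, may be worth trying together with --split-by rn, keeping $CONDITIONS in the outer WHERE clause. I have not tested that against your database. Also note that row_number() ordered by a non-unique column may number rows differently on each execution, so the splits could overlap or miss rows.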