Hi,
I have a table with a lot of data.
I want to create a new table from it, filtered on some column values.
Which method is more efficient and friendlier to cluster resources?
Pseudo-code:
1. Single job:
insert into myNewTable
select * from myOldTable
where a=xxx etc.
2. Two jobs:
job 1: create a dataframe from the select statement
dataframe = select * from myOldTable
where a=xxx etc.
job 2: write the dataframe as the new table
insert into myNewTable select * from dataframe
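
To make the pseudo-code concrete, here is a rough PySpark sketch of both options. It assumes the engine is Spark (the post does not actually say so), that `spark` is an existing SparkSession, and that myNewTable already exists with the same column order as myOldTable. The function names `single_job` and `two_steps` are made up for illustration, and `a = 'xxx'` is the placeholder filter from above.

```python
# Placeholder filter from the pseudo-code; replace with the real condition.
FILTER = "a = 'xxx'"


def single_job(spark):
    """Option 1: one SQL statement that filters and inserts."""
    # spark.sql runs INSERT statements eagerly, so this line does all the work.
    spark.sql(f"INSERT INTO myNewTable SELECT * FROM myOldTable WHERE {FILTER}")


def two_steps(spark):
    """Option 2: build a DataFrame first, then write it."""
    # DataFrames are lazy: this only builds a query plan, no data is read yet.
    df = spark.table("myOldTable").where(FILTER)
    # The write is the action that triggers execution of the plan.
    # insertInto matches columns by position, not by name.
    df.write.insertInto("myNewTable")


# Usage (needs a real Spark installation, so it is left commented out):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.enableHiveSupport().getOrCreate()
# single_job(spark)
```

If the Spark assumption holds, the DataFrame in option 2 is never materialized as a separate job, because nothing runs until the write. Both options should therefore end up doing roughly the same work, unless the DataFrame is cached or collected in between.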