Created 07-31-2018 08:14 PM
Hello
I have a Hive external table, rawdata, and need to populate the swap data from select * from rawdata.
The process is currently executed daily by crontab, and I want to migrate it to NiFi.
What is the best practice for this?
Thanks.
Created 07-31-2018 11:41 PM
If you want to move data between two Hive tables, you don't need the SelectHiveQL processor at all.
Create a Hive statement like the one below:
insert into <db_name>.<final_table> select * from <db_name>.<rawdata>
Then execute that statement using the PutHiveQL processor.
To run this process incrementally, you need to store the state, i.e. the time up to which you have already processed data from the rawdata table. Each run then selects only the data newer than the stored state value.
Please refer to this and this link for more details how to incrementally copy data in hive.
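As a rough illustration of the incremental approach, the sketch below builds the statement text that PutHiveQL would execute, filtered by a stored state value. It assumes the raw table has a timestamp column (called load_ts here, a hypothetical name) and that the last processed time is kept somewhere outside the statement; neither detail comes from the original posts.

```python
from datetime import datetime


def build_incremental_insert(db, source, target, ts_column, last_processed):
    """Return an INSERT ... SELECT that copies only rows newer than the stored state."""
    # Format the stored state as a Hive-style timestamp literal.
    watermark = last_processed.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"insert into {db}.{target} "
        f"select * from {db}.{source} "
        f"where {ts_column} > '{watermark}'"
    )


# Hypothetical names: 'load_ts' is an assumed timestamp column on the raw table,
# and the datetime stands in for whatever state value was saved by the last run.
statement = build_incremental_insert(
    "db_name", "rawdata", "final_table", "load_ts",
    datetime(2018, 7, 31, 0, 0, 0),
)
print(statement)
```

After a successful run, the stored state would need to be advanced to the newest timestamp that was copied, so the next run starts from there.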
Created 08-01-2018 02:24 PM
Created 08-02-2018 05:49 PM
Created 08-02-2018 10:14 PM
If you want to execute the statements sequentially, you can use the success relationship of the PutHiveQL processor to trigger the next job (i.e. start table B).
Flow:
1. GenerateFlowFile //start with tableA
2. PutHiveQL
3. ReplaceText //to prepare tableB statement
4. PutHiveQL
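The sequencing idea in this flow can be sketched outside NiFi as follows. This is only a model of the behaviour, not NiFi code: the execute callable is a hypothetical stand-in for PutHiveQL, and the table names are made up for illustration. The second statement runs only if the first one succeeds, which mirrors routing on success.

```python
from typing import Callable, List


def run_sequentially(statements: List[str], execute: Callable[[str], bool]) -> List[str]:
    """Run statements in order and stop at the first failure.

    Returns the statements that completed successfully.
    """
    completed = []
    for statement in statements:
        if not execute(statement):
            # A failure means the next statement is never triggered.
            break
        completed.append(statement)
    return completed


# Hypothetical statements for table A and table B.
statements = [
    "insert into db_name.final_a select * from db_name.raw_a",
    "insert into db_name.final_b select * from db_name.raw_b",
]


def fake_execute(statement: str) -> bool:
    """Stand-in for the Hive call; it only logs and reports success."""
    print("executing:", statement)
    return True


completed = run_sequentially(statements, fake_execute)
```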