Created 10-19-2016 10:10 AM
I am trying to move a table from Oracle to HDFS. I used the QueryDatabaseTable -> PutHDFS processors and configured them. I can see data arriving in HDFS, but the process runs continuously and the same records are added again and again. Am I doing anything wrong or missing something?
Created 10-19-2016 12:25 PM
How did you schedule QueryDatabaseTable?
If you didn't change anything on the scheduling tab of the processor, then the run schedule is 0 seconds, which means it runs as fast as possible. You most likely want to run this on some kind of timer or cron scheduling.
Created 10-19-2016 01:41 PM
I did not change anything on the scheduling tab, as I wanted it to run ASAP. The problem is that the processor runs continuously. So if I have 10 records in the actual table, I see the count on HDFS keep increasing (the same 10 records are dumped again and again). I expect it to stop after moving just these initial 10 records onto HDFS.
Created 10-19-2016 01:55 PM
If you use timer scheduling it will still execute right away, so if you set 30 seconds it will run right away then wait 30 seconds before running again.
Can you provide all of the configuration you entered for the processor?
You would need to provide the "Maximum-Value Columns" in order for it to track where it left off and pick up there on next execution.
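To illustrate the idea behind "Maximum-Value Columns", here is a minimal Python sketch. It is not NiFi's implementation, only a simulation under stated assumptions: the table `src`, its columns, the `state` dictionary, and the `fetch_new_rows` helper are all hypothetical, and an in-memory SQLite database stands in for Oracle. Each call remembers the largest `id` it has seen and asks only for rows above it on the next call.

```python
import sqlite3

def fetch_new_rows(conn, state):
    """Return rows with an id above the last one seen, then remember the new maximum."""
    last_id = state.get("id")
    if last_id is None:
        # First run: nothing is remembered yet, so fetch the whole table.
        rows = conn.execute("SELECT id, name FROM src ORDER BY id").fetchall()
    else:
        # Later runs: fetch only rows beyond the stored maximum.
        rows = conn.execute(
            "SELECT id, name FROM src WHERE id > ? ORDER BY id", (last_id,)
        ).fetchall()
    if rows:
        state["id"] = rows[-1][0]  # rows are ordered, so the last one has the largest id
    return rows

# Hypothetical source table with 10 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO src (id, name) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 11)],
)

state = {}
first = fetch_new_rows(conn, state)    # all 10 rows
second = fetch_new_rows(conn, state)   # nothing new, so no duplicates
conn.execute("INSERT INTO src (id, name) VALUES (11, 'row11')")
third = fetch_new_rows(conn, state)    # only the newly added row
print(len(first), len(second), len(third))
```

Without the stored maximum, every run would behave like the first one and return all 10 rows again, which matches the duplication described above.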
Created 10-20-2016 11:13 AM
Thank you for the information. So, if I leave "Run Schedule" at the default, i.e. 0 sec, it will run tasks one after another, again and again. And if I want it to execute only once, I would set it to some huge value or use event or CRON scheduling.
Created 10-20-2016 12:52 PM
Correct. If you truly only want to run it once, then make the timer schedule larger and just manually start and stop the processor.
Created 03-04-2017 02:14 PM
Hi Bryan,
Small question regarding your earlier reply.
In my scenario, the processor should run once when I start it, and it should run again whenever there is an update. I don't know when the database will be updated, so I can't use a timer here. I have tried specifying "Maximum-value Columns", but no luck.
Can you please help me in finding the way to do this.
Thanks,
Srikanth.
Created 03-06-2017 03:42 PM
What does your table look like? Is there a column that is guaranteed to be "strictly increasing" for each added/updated row? Sometimes this is the ID column (if using an autoincrementing integer that doesn't roll over), or perhaps a timestamp column such as "Last Updated". If you have no such column, then you will want to follow Bryan's advice on scheduling and start/stop.
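As a rough illustration of why the choice of column matters, the Python sketch below compares two candidates. Everything in it is an assumption for demonstration: the table `src`, its columns, and the integer `last_updated` values are hypothetical, and SQLite stands in for the real database. An autoincrementing `id` only reveals newly added rows, while a "Last Updated" style column also reveals rows that changed after the previous fetch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src (id INTEGER PRIMARY KEY, name TEXT, last_updated INTEGER)"
)
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?)",
    [(1, "a", 100), (2, "b", 101), (3, "c", 102)],
)

# Maximums remembered after a first full fetch of the three rows.
max_id, max_updated = 3, 102

# Row 2 is modified later; the application also bumps last_updated.
conn.execute("UPDATE src SET name = 'b2', last_updated = 103 WHERE id = 2")

# Tracking by id misses the update, because no id exceeds the stored maximum.
by_id = conn.execute(
    "SELECT id, name FROM src WHERE id > ?", (max_id,)
).fetchall()

# Tracking by last_updated picks up the changed row.
by_updated = conn.execute(
    "SELECT id, name FROM src WHERE last_updated > ?", (max_updated,)
).fetchall()

print(by_id)       # []
print(by_updated)  # [(2, 'b2')]
```

This only works if whatever writes to the table reliably increases the tracked column on every insert and update.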
Created 03-05-2017 05:47 PM
Thanks. It worked after setting the schedule time.