We have had a good experience with Oryx 1 and are moving toward using Oryx 2.
One thing we noticed is a difference in the iteration convergence settings.
In Oryx 1, there is a value to control this: convergence-threshold.
However, it seems that in Oryx 2, the only way to control convergence is something like max-iteration.
My guess is that SparkML does not support convergence-threshold.
Oryx 2 uses SparkML ALS, so Oryx 2 does not support it either.
Do you plan to support convergence-threshold on the future Oryx 2 product roadmap?
Exactly, the problem is that it is not supported by MLlib. There are many advantages to using MLlib instead of writing a new implementation. However, the missing threshold is a problem, since most incremental updates should need only a few iterations.
I will look at whether it's possible to run several MLlib jobs of a few iterations each and assess convergence in Oryx. Maybe that's actually pretty easy. It will depend on whether it's easy to set the initial feature vector, and I haven't looked at that in a while.
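
To make the idea concrete, here is a minimal sketch of that outer loop, not actual Oryx or MLlib code. Everything in it is an assumption for illustration: `train_until_converged`, `run_chunk`, and `toy_chunk` are hypothetical names, the stopping rule (absolute change in a loss value between chunks) is just one possible criterion, and `toy_chunk` is a trivial stand-in for "one MLlib job of a few iterations, started from the previous state". Whether MLlib ALS can actually be started from an existing state is exactly the open question above.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def train_until_converged(
    run_chunk: Callable[[S, int], Tuple[S, float]],
    initial_state: S,
    iterations_per_chunk: int = 3,
    convergence_threshold: float = 1e-3,
    max_chunks: int = 20,
) -> Tuple[S, int]:
    """Run short training jobs until the loss stops changing much.

    run_chunk(state, n) is a hypothetical callback: it runs n iterations
    starting from `state` and returns (new_state, loss).
    Returns the final state and the number of chunks that were run.
    """
    state = initial_state
    previous_loss = None
    for chunk in range(1, max_chunks + 1):
        state, loss = run_chunk(state, iterations_per_chunk)
        # Assumed criterion: stop once the loss moved by less than the threshold.
        if previous_loss is not None and abs(previous_loss - loss) < convergence_threshold:
            return state, chunk
        previous_loss = loss
    # Hit the cap without converging; this plays the role of max-iteration.
    return state, max_chunks


def toy_chunk(x: float, iterations: int) -> Tuple[float, float]:
    """Toy stand-in for a training job: iterate x -> 0.5 * x + 1 (fixed point 2)."""
    for _ in range(iterations):
        x = 0.5 * x + 1.0
    return x, (x - 2.0) ** 2


if __name__ == "__main__":
    final_state, chunks_run = train_until_converged(toy_chunk, 0.0)
    print(final_state, chunks_run)
```

In a real version, `run_chunk` would wrap a short ALS run and the state would be the user and item feature matrices. The cap on chunks keeps the worst case bounded in the same way max-iteration does today.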
For now, yes, it means you have to build a whole new model from scratch each time and run a fair number of iterations each time.