I am new to Spark ML pipelines, so I am wondering if you could give me some pointers on the following issue.
I have a custom Estimator and Model written in Python.
pipeline = Pipeline(stages=[MyEsimator()])
pipelinemodel = pipeline.fit(train_df)
results = pipelinemodel.transform(test_df)
It works fine, but I have to train 50 models in parallel. What is the best way to run pipeline.fit() and transform()
in parallel? Is there any support for this in Spark ML?
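
To make the question concrete, here is a minimal sketch of the kind of thing I have in mind. It assumes it is acceptable to call fit() and transform() from several driver threads at once, which is part of what I am asking. The helper name fit_transform_all and the max_workers parameter are made up for illustration and are not part of Spark.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_transform_all(pipelines, train_df, test_df, max_workers=8):
    """Fit each pipeline on train_df, then transform test_df, using a thread pool.

    Hypothetical helper: it only assumes each pipeline has fit() and the
    fitted model has transform(), as in the snippet above.
    """

    def run_one(pipeline):
        # Same two calls as in the sequential version, issued from a worker thread.
        model = pipeline.fit(train_df)
        return model.transform(test_df)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input pipelines.
        return list(pool.map(run_one, pipelines))
```

I would call it with a list of 50 pipelines built the same way as above, for example results = fit_transform_all(pipelines, train_df, test_df). I do not know whether this is the recommended approach or whether it actually runs the jobs concurrently on the cluster.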