running in parallel

Hello friends,


I am new to Spark ML pipelines, so I am wondering if you could give me some pointers on the following issue.


I have custom Estimator and Model code in Python.



pipeline = Pipeline(stages=[MyEstimator()])
pipelinemodel = pipeline.fit(train_df)
results = pipelinemodel.transform(test_df)


It works fine, but I have to train 50 models in parallel. What is the best way to run fit() and transform() in parallel? Is there any support in Spark ML for this?
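
For context, one approach that might apply here is to submit each fit() from its own driver thread, since Spark's scheduler accepts jobs from multiple threads of the same application. The sketch below is only an illustration under assumptions: `fit_and_transform`, `run_in_parallel`, `max_workers` and `train_df` are hypothetical names that are not part of Spark, and it relies only on each pipeline object having fit() and each fitted model having transform(). If the custom Estimator does heavy pure-Python work on the driver, threads may not help much.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_and_transform(pipeline, train_df, test_df):
    """Fit one pipeline, then apply the fitted model to the test data."""
    model = pipeline.fit(train_df)       # for a Spark Pipeline this is a PipelineModel
    return model.transform(test_df)


def run_in_parallel(pipelines, train_df, test_df, max_workers=8):
    """Run fit()/transform() for every pipeline, each on its own driver thread.

    Hypothetical helper, not a Spark API. Results are returned in the
    same order as `pipelines`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(fit_and_transform, p, train_df, test_df)
            for p in pipelines
        ]
        # result() blocks until that task finishes and re-raises its exception, if any
        return [f.result() for f in futures]
```

With the code from the question, this could be called as `run_in_parallel([Pipeline(stages=[MyEstimator()]) for _ in range(50)], train_df, test_df)`. Separately, if the 50 models differ only in their parameters, `CrossValidator` and `TrainValidationSplit` in `pyspark.ml.tuning` have a `parallelism` parameter that may cover the same need.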