
Spark from lab to production?


New Contributor

Hi,

I'm using a Spark cluster (6 nodes, 128 GB RAM and 1 TB disk each) in standalone mode to predict customer behaviour for an organisation. Model creation takes 20 minutes per organisation and has to be repeated every 30 days. Any ideas or views on how to scale this process to 50,000 organisations? Should I keep using Spark or switch to something else, given that Spark does not let me create models for multiple organisations simultaneously?

Thanks in Advance

4 REPLIES

Re: Spark from lab to production?

Rising Star

First of all, is your Spark application actually using all the resources of the cluster? If it is not, you can scale horizontally by launching the model creation for multiple organisations at the same time. If the resources are still not enough, you can always grow the cluster.
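A minimal sketch of what "launching model creation for multiple organisations at the same time" could look like in standalone mode. By default a standalone application grabs every available core, so each submission is capped with `spark.cores.max` so several applications fit side by side. The master URL, the script name `train_model.py`, the organisation IDs and the resource numbers are all illustrative assumptions, not taken from the thread.

```shell
# Cap each application's share of the standalone cluster so several
# model-creation jobs can run concurrently.
# spark://master-host:7077, train_model.py and the org IDs are
# hypothetical placeholders.
for ORG in org-001 org-002 org-003; do
  # spark.cores.max limits the total cores this one application may take;
  # without it a standalone-mode application claims every available core.
  spark-submit \
    --master spark://master-host:7077 \
    --conf spark.cores.max=8 \
    --executor-memory 16g \
    train_model.py "$ORG" &
done
# Block until all background submissions have finished.
wait
```

With 6 nodes, the cap per application determines how many of these run truly in parallel; the rest queue until cores free up.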

Re: Spark from lab to production?

New Contributor

Model creation for an organisation runs as a single application on the Spark cluster. How can I run multiple applications simultaneously on the cluster? I have allotted all the resources to the Spark context, but since I'm running in standalone mode I have no control over how the executors are used by the application, and some executors sit idle part of the time. Can those idle executors be used, or do I have to change the configuration?

Re: Spark from lab to production?

Rising Star

You can run multiple Spark applications simultaneously if your cluster has enough resources. If a single application is not actually using everything you have allocated to it, just reduce its allocation; the freed resources then let other applications run alongside it.
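One way to "reduce the allocation" once rather than on every submission is a default per-application cap in `spark-defaults.conf`; every application submitted from that machine is then limited unless it explicitly overrides these values. A hedged sketch, assuming `$SPARK_HOME` is set and the numbers suit the 6-node cluster described above:

```shell
# Set a default per-application resource cap. spark.cores.max and
# spark.executor.memory are standard Spark properties; the chosen
# values (8 cores, 16g) are illustrative assumptions.
cat >> "$SPARK_HOME/conf/spark-defaults.conf" <<'EOF'
spark.cores.max        8
spark.executor.memory  16g
EOF
```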

Re: Spark from lab to production?

New Contributor

So currently only one application runs at a time on the cluster, and the next application starts only after the previous one completes. How can I run multiple Spark applications against the one standalone cluster at the same time? If I reduce the resources allocated to the Spark context, can I create a new Spark context on the master through the same driver? If so, how, and what configuration changes do I have to make to the Spark context?
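To separate the two things being conflated here: a driver normally has a single active SparkContext, and one context is one application, so "multiple applications in one context" is not how Spark models it. What Spark does support *within* one application is running concurrent jobs from separate driver threads, optionally with the FAIR scheduler so they share executors instead of queueing FIFO. Enabling it is a one-line config; `app.py` below is a hypothetical driver script whose code would have to submit its actions from separate threads for this to matter:

```shell
# spark.scheduler.mode=FAIR affects scheduling of jobs *within* a single
# application: actions submitted from separate driver threads share the
# application's executors rather than running strictly first-in-first-out.
# app.py is a placeholder for your own multi-threaded driver.
spark-submit \
  --conf spark.scheduler.mode=FAIR \
  app.py
```

Running separate organisations as separate capped applications (as in the earlier reply) and fair-scheduling jobs inside one application are complementary, not exclusive.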