Running Spark application and Python application on CDH 6

Hi,

I have two scenarios and would like advice on whether they are feasible.

 

Running Spark applications (PySpark) on CDH6:

Our current CDH6 installation is an enterprise cluster that runs a combination of batch jobs followed by analytic queries.

According to the documentation, Cloudera CDH6 supports running Spark applications on a YARN cluster manager. 

Is it a sustainable approach to run Spark applications (distributed workloads) on CDH6 without any impact on the rest of the cluster? Will the workload manager differentiate between the batch loads, the user queries and the Spark applications?
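
To make the question concrete, here is a minimal sketch of how we might submit such a job with capped resources. It only builds the `spark-submit` command line. The queue name `spark_apps`, the script name `etl_job.py` and the executor sizes are assumptions for illustration, not values from our cluster.

```python
import shlex


def build_spark_submit(app_path, queue="spark_apps", num_executors=4,
                       executor_memory="4g", executor_cores=2):
    """Build a spark-submit argument list for a PySpark job on YARN.

    The queue name and resource sizes are hypothetical defaults.
    """
    return [
        "spark-submit",
        "--master", "yarn",             # use YARN as the cluster manager
        "--deploy-mode", "cluster",     # driver runs inside the cluster
        "--queue", queue,               # dedicated YARN queue (assumed to exist)
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        "--executor-cores", str(executor_cores),
        # Keep the executor count fixed so the job cannot grow past the cap.
        "--conf", "spark.dynamicAllocation.enabled=false",
        app_path,
    ]


cmd = build_spark_submit("etl_job.py")
print(shlex.join(cmd))
```

The idea would be to keep the Spark applications in their own YARN queue so that they cannot take resources away from the batch jobs and the analytic queries.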

 

Running Python applications on CDH6:

Python applications can also be deployed on CDH6; however, they would run single-threaded. I am sure this is not a good approach in the long run, but I would like to check whether there would be any impact on the cluster. We would limit the YARN resources available to these applications.

 

I would appreciate your suggestions.