Having Spark 1.6.0 and 2.1 in the same CDH

Super Collaborator

Hi guys,

I'm planning to upgrade my CDH version to 5.10.2, and some of our developers need Spark 2.1 for Spark Streaming.

I'm planning to manage the two versions using Cloudera Manager: 1.6 will be the integrated one, and Spark 2.1 will be installed from parcels.

My questions:

1- Should I use Spark2 as a service? Will CM let me have two Spark services, the regular one and the Spark 2.1 one?

2- Is it preferable to install the Spark2 roles and gateways on the same servers as the regular ones? I assume the History Server and the Spark server can be on different servers, with a different port for the History Server. What will it look like when I add two gateways on the same DN?

3- Is it complicated to manage?

4- Is there any way the two versions could conflict and affect the current Spark jobs?

1 ACCEPTED SOLUTION

Champion
Did you restart CM and CMS?

If not, CM will not pick up the CSD file and Spark2 will not be available as a service to install.

If you have, for the cluster with the parcels distributed and activated, choose 'Add a Service' from the cluster action menu. Is it available in that list of services?


29 REPLIES

Expert Contributor

+1 @mbigelow 

This (not seeing Spark2 in the 'Add a Service' wizard) is generally a result of the Cloudera Management Service not being restarted (or CM not recognizing the CSD).

How do I connect to the Hive metastore from Spark2?

Thx

Renjith

Champion
In the Spark2 configs, ensure that the Hive service is enabled. This will include the Hive client configs for the Spark2 service. This will allow the SparkSession created by spark2-shell to have Hive support for the HMS on the cluster.

I haven't tested actual Spark2 applications, but with the above setup it should be as simple as calling .enableHiveSupport() on the SparkSession builder.

Outside of that, you would probably need to include the hive-site.xml or the Hive HMS settings in the Spark configuration object and then use .enableHiveSupport().
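To make the two options above concrete, here is a minimal PySpark sketch (the builder methods have the same names in Scala). It is untested against a CDH cluster and rests on assumptions: the helper names `hive_session_conf` and `build_hive_session` are made up for illustration, the metastore host in the comment is a placeholder, and 9083 is only the usual default HMS Thrift port. When the Hive service is enabled in the Spark2 configs, no extra setting should be needed and `metastore_uri` can stay `None`.

```python
def hive_session_conf(metastore_uri=None):
    """Extra Spark conf needed to reach the Hive metastore.

    Returns an empty dict when hive-site.xml is already supplied by the
    Spark2 client configs (Hive service enabled for Spark2 in CM).
    """
    conf = {}
    if metastore_uri:
        # Only needed when hive-site.xml is NOT on the classpath.
        conf["hive.metastore.uris"] = metastore_uri
    return conf


def build_hive_session(app_name, metastore_uri=None):
    """Create a SparkSession with Hive support enabled."""
    # Imported lazily so hive_session_conf() works on hosts without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in hive_session_conf(metastore_uri).items():
        builder = builder.config(key, value)
    return builder.enableHiveSupport().getOrCreate()


# Hypothetical usage (placeholder host name, assumed default port):
#   spark = build_hive_session("hms-check", "thrift://hms-host.example.com:9083")
#   spark.sql("SHOW DATABASES").show()
```

With the Spark2 parcel the script would be launched through `spark2-submit` (or tried interactively in `pyspark2`) rather than the Spark 1.6 commands.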

Explorer

Did this work for you? I'm facing the same issue: after successfully distributing and activating the parcel, I'm not able to see the Spark2 service in CM.

Super Collaborator

@Hitesh88 Did you restart CM and CMS?

Explorer

Yes, I restarted both the CM server and the Cloudera Management Service, and Spark2 still does not show up as a service in the cluster.

I see JDK 1.7 is used with CDH 5.9 on the cluster I'm on. I read somewhere that Spark2 requires JDK 1.8. Could that be preventing the Spark2 service from appearing?

Regards,

Hitesh

Super Collaborator
Yes, Spark2 needs JDK 8.
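A quick way to see which Java a host reports is to parse the `java -version` banner. The sketch below is only an illustration: the function names are made up, it assumes Python 3 on the host, and it inspects the `java` found on the PATH, which may not be the JDK that Cloudera Manager and the cluster services are configured to use.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    first, second = int(match.group(1)), match.group(2)
    # Legacy scheme: "1.8.0_131" means Java 8; newer scheme: "11.0.2" means Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major(java_bin="java"):
    """Run `java -version` and return the major version it reports."""
    result = subprocess.run(
        [java_bin, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # the version banner is written to stderr
        universal_newlines=True,
    )
    return java_major_version(result.stderr)


# Hypothetical usage on a cluster host:
#   if installed_java_major() < 8:
#       print("This host reports a JDK older than 8")
```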

Explorer

Okay, thanks for the update. Do you think anything else could be causing this issue and keeping Spark2 from being displayed?

Regards,

Hitesh

Explorer

Also, is there a way to confirm that the CSD file is properly deployed? I also don't see Scala 2.11 libraries under /opt/cloudera/parcels/CDH/jars, only Scala 2.10 libraries.

I heard that both Scala 2.10 and 2.11 are installed with CDH 5.7 and later. Shouldn't Scala 2.11 be available? Could this also be a cause of the Spark2 service not appearing?

I followed all the steps as documented and they all completed successfully; the Spark2 parcel is now activated.

Regards,

Hitesh
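On the question of confirming the CSD deployment, one rough check is to look for the Spark2 CSD jar on the Cloudera Manager Server host and inspect its permissions. This sketch makes several assumptions that should be verified against your own install: that CM uses the default CSD directory /opt/cloudera/csd, that the jar name starts with SPARK2_ON_YARN, and that the expected mode is 644 with the file owned by cloudera-scm. The function name is made up for illustration, and finding the file does not prove that CM has loaded it.

```python
import glob
import os
import stat


def find_spark2_csds(csd_dir="/opt/cloudera/csd"):
    """List Spark2 CSD jars in csd_dir as (file name, permission bits) pairs."""
    found = []
    pattern = os.path.join(csd_dir, "SPARK2_ON_YARN*.jar")  # assumed naming
    for path in sorted(glob.glob(pattern)):
        mode = stat.S_IMODE(os.stat(path).st_mode)
        found.append((os.path.basename(path), mode))
    return found


# Hypothetical usage on the CM Server host:
#   for name, mode in find_spark2_csds():
#       print(name, oct(mode))   # compare against the expected 0o644
```

An empty result would suggest the jar is missing or in a different directory, in which case restarting CM would not make the service appear.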
