Having Spark 1.6.0 and 2.1 in the same CDH

Master Collaborator

Hi guys,

I'm planning to upgrade my CDH version to 5.10.2, and some of our developers need Spark 2.1 for Spark Streaming.

I'm planning to manage the two versions using Cloudera Manager: 1.6 will be the integrated one, and Spark 2.1 will be installed from parcels.

My questions:

1- Should I use Spark2 as a service? Will Cloudera Manager let me have two Spark services, the regular one and the Spark 2.1 one?

2- Is it preferable to install the Spark2 roles and gateways on the same servers as the regular ones? I assume the History Server and Spark server can be on different servers, with a different port for the History Server. What will it look like when I add two gateways on the same DN?

3- Is it complicated to manage?

4- Is there any way the two versions could conflict and affect the current Spark jobs?

1 ACCEPTED SOLUTION

Champion
Did you restart CM and CMS?

If not, CM will not pick up the CSD file, and Spark2 will not be available as a service to install.

If you have, then on the cluster where the parcels are distributed and activated, choose 'Add a Service' from the cluster action menu. Is Spark2 available in that list of services?
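
To check that the CSD is where CM expects it before restarting, a minimal sketch along these lines may help. It assumes the default CSD directory /opt/cloudera/csd and the 644 file mode that the Spark 2 installation instructions call for; `check_csd_jars` is a hypothetical helper name, not part of any Cloudera tooling. If your CM uses a custom descriptor path, pass that instead.

```python
import stat
from pathlib import Path

# Assumption: default CSD directory used by Cloudera Manager.
DEFAULT_CSD_DIR = "/opt/cloudera/csd"


def check_csd_jars(csd_dir=DEFAULT_CSD_DIR, name_hint="SPARK2"):
    """Return (file name, octal mode) for each matching CSD jar in csd_dir."""
    directory = Path(csd_dir)
    if not directory.is_dir():
        return []
    found = []
    for jar in sorted(directory.glob("*.jar")):
        if name_hint.lower() in jar.name.lower():
            mode = stat.S_IMODE(jar.stat().st_mode)
            found.append((jar.name, oct(mode)))
    return found


if __name__ == "__main__":
    jars = check_csd_jars()
    if not jars:
        print("No Spark2 CSD jar found; CM cannot offer Spark2 in 'Add a Service'.")
    for name, mode in jars:
        # Assumption: the install instructions ask for mode 644.
        note = "ok" if mode == "0o644" else "expected 0o644"
        print(f"{name}: mode {mode} ({note})")
```

An empty result suggests the jar was never copied to the CM server host, in which case restarting CM and CMS will not make the service appear.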


29 REPLIES

Master Collaborator
I already restarted them as part of the procedure for adding Spark2.

I will give it another restart and check.

Master Collaborator

+1 @mbigelow 

This (not seeing Spark2 in the 'Add a Service' wizard) is generally a result of the Cloudera Management Service not being restarted (or CM not recognizing the CSD).


How do I connect to the Hive metastore from Spark2?

 

Thx

Renjith

Champion
In the Spark2 configs, ensure that the Hive service is enabled. This includes the Hive client configs in the Spark2 service, which allows the SparkSession created by spark2-shell to have Hive support for the HMS on the cluster.

I haven't tested actual Spark2 applications, but with the above setup it should be as simple as calling .enableHiveSupport() on the SparkSession builder.

Outside of that, you would probably need to include the hive-site.xml or the Hive HMS settings in the Spark configuration object and then use .enableHiveSupport().
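
As a rough illustration of the two pieces above, the sketch below reads `hive.metastore.uris` from a hive-site.xml and builds a Hive-enabled session in PySpark. The file path is whatever your gateway deploys (it is not stated in this thread), and `metastore_uris` and `hive_session` are hypothetical helper names. The PySpark import is deferred so that the XML check runs even where Spark 2 is not installed. Like the reply above, this is untested against a real cluster.

```python
import xml.etree.ElementTree as ET


def metastore_uris(hive_site_path):
    """Return the hive.metastore.uris value from a hive-site.xml, or None."""
    root = ET.parse(hive_site_path).getroot()
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "hive.metastore.uris":
            value = prop.findtext("value", "").strip()
            return value or None
    return None


def hive_session(app_name="hive-check", uris=None):
    """Build a SparkSession with Hive support (needs Spark 2.x on the path)."""
    from pyspark.sql import SparkSession  # deferred: only needed here

    builder = SparkSession.builder.appName(app_name)
    if uris:
        # Assumption: only needed when hive-site.xml is not already
        # visible to Spark through the Spark2 client configs.
        builder = builder.config("hive.metastore.uris", uris)
    return builder.enableHiveSupport().getOrCreate()
```

If the metastore is reachable, something like `hive_session().sql("SHOW DATABASES").show()` should list the cluster's Hive databases rather than only `default`.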

Explorer

Did this work for you? I'm facing the same issue: after successfully distributing and activating the parcel, I'm not able to see the Spark 2 service in CM.

Master Collaborator

@Hitesh88 Did you restart CM and CMS?

Explorer

Yes, I restarted both the CM server and the Cloudera Management Service, and it is still not showing Spark2 as a service in the cluster.

I see JDK 1.7 is used with CDH 5.9 on the cluster I'm on. I read somewhere that Spark 2 requires JDK 1.8. Could that be preventing the Spark2 service from appearing?

 

Regards,

Hitesh

Master Collaborator
Yes, Spark2 needs JDK 8.
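
To see which Java a host reports, a small sketch like the following can parse the version string printed by `java -version`. `java_major` and `installed_java_major` are hypothetical helper names, and the check only reflects the `java` found on the PATH, which may differ from the JDK that CM and the cluster services are configured to use.

```python
import re
import subprocess


def java_major(version_string):
    """Map a Java version string to its major version, e.g. '1.7.0_67' to 7."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string)
    if not match:
        raise ValueError(f"unrecognized Java version: {version_string!r}")
    first = int(match.group(1))
    # Java 8 and earlier use the legacy '1.x' scheme.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major(java_cmd="java"):
    """Run 'java -version' (which writes to stderr) and return the major version."""
    result = subprocess.run([java_cmd, "-version"], capture_output=True, text=True)
    found = re.search(r'version "([^"]+)"', result.stderr)
    if not found:
        raise RuntimeError("could not find a version string in java -version output")
    return java_major(found.group(1))
```

A version string such as `1.7.0_67` maps to 7, which is below the 8 that Spark2 needs.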

Explorer

Okay, thanks for the update. Do you think there is anything else that could be causing this issue and preventing Spark2 from being displayed?

 

Regards,

Hitesh

Explorer

Also, is there a way to confirm that the CSD file is properly deployed? I also don't see Scala 2.11 libraries under /opt/cloudera/parcels/CDH/jars, only Scala 2.10 libraries.

I heard that Scala 2.10 and 2.11 are both installed with CDH 5.7 and later. Shouldn't Scala 2.11 be available? Could this also be a cause of the Spark2 service not appearing?

I followed all the steps as described and they all completed successfully; the Spark2 parcel is now activated.

 

Regards,

Hitesh