Help required to install Spark on CDH5

Hi,

I have installed CDH5 on Ubuntu 14.04 successfully, but for Spark only the History Server is running. The Master and Worker roles are not running.

Do these services need to be started manually?

Please help.

Regards

Prateek

1 ACCEPTED SOLUTION

When using Spark on YARN, there's no need for the Master or Worker roles.
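
As a rough illustration of what this means in practice: with YARN as the cluster manager, spark-submit hands the job to YARN directly. The sketch below assumes spark-submit is on the PATH and that the YARN client configuration is deployed on the host. The jar path is a placeholder rather than a real location, submit_cmd is a hypothetical helper, and the exact --master syntax can differ between Spark versions.

```shell
#!/bin/sh
# Placeholder path (an assumption): point this at the Spark examples jar of your installation.
EXAMPLES_JAR="${EXAMPLES_JAR:-/path/to/spark-examples.jar}"

# Print the submit command so it can be reviewed before it is run.
submit_cmd() {
    echo "spark-submit --master yarn --deploy-mode client" \
         "--class org.apache.spark.examples.SparkPi $EXAMPLES_JAR 10"
}

if command -v spark-submit >/dev/null 2>&1 && [ -f "$EXAMPLES_JAR" ]; then
    # YARN allocates the executors here; no Spark Master or Worker role is involved.
    spark-submit --master yarn --deploy-mode client \
        --class org.apache.spark.examples.SparkPi "$EXAMPLES_JAR" 10
else
    # Without Spark (or the jar) on this host, only show what would be run.
    echo "dry run: $(submit_cmd)"
fi
```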

2 REPLIES

Hi,

I'm happy to set up a separate post, but was hoping you could pick this up here. I'm having a similar problem with Spark and Hue on CDH.

I have a running CDH 5.7.1 cluster on Ubuntu 14.04, with all services working fine (apart from Spark and Impala).

It appears that the Spark History Servers and gateways are installed, but I can't activate Spark in Standalone mode or Spark on YARN. In the parcels section I am getting errors across a number of services:

  • Error for parcel SPARK-0.9.0-1.cdh4.6.0.p0.98-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.
  • Error for parcel SOLR-1.3.0-1.cdh4.5.0.p0.9-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.
  • Error for parcel IMPALA-2.1.0-1.impala2.0.0.p0.1995-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.
  • Error for parcel ACCUMULO-1.4.4-1.cdh4.5.0.p0.65-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.

Having checked, this appears to mean that there isn't an Ubuntu Trusty version of the above parcels.
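
The parcel names in these errors appear to follow a PRODUCT-VERSION-DISTRO pattern, so they can be split apart to see which product, version, and OS suffix each error refers to. A minimal sketch using plain shell parameter expansion (parse_parcel is a hypothetical helper, and the three-part pattern is inferred from the names above rather than taken from any specification):

```shell
#!/bin/sh
# Split a parcel name of the form PRODUCT-VERSION-DISTRO into its three parts.
parse_parcel() {
    parcel="$1"
    product="${parcel%%-*}"   # text before the first hyphen
    distro="${parcel##*-}"    # text after the last hyphen
    rest="${parcel#*-}"       # drop the leading "PRODUCT-"
    version="${rest%-*}"      # drop the trailing "-DISTRO"
    echo "$product $version $distro"
}

# Parcel names copied from the error messages above.
for p in \
    SPARK-0.9.0-1.cdh4.6.0.p0.98-trusty \
    SOLR-1.3.0-1.cdh4.5.0.p0.9-trusty \
    IMPALA-2.1.0-1.impala2.0.0.p0.1995-trusty \
    ACCUMULO-1.4.4-1.cdh4.5.0.p0.65-trusty
do
    parse_parcel "$p"
done
```

For the first name this prints `SPARK 0.9.0-1.cdh4.6.0.p0.98 trusty`. Three of the four version strings contain cdh4, which suggests the errors concern add-on parcels built against CDH4 rather than anything shipped with CDH 5.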

 

Can you confirm whether this is the case?

If so, can I install the components via apt-get:

sudo apt-get install spark-core spark-master spark-worker spark-history-server spark-python

as described in this link for CDH 5.4.x:

http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_ig_spark_install.html
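
Before running the install, it may help to check whether the configured apt repositories actually offer these packages. A small sketch, assuming a Debian/Ubuntu host with apt-cache available; the package names are the ones from the command above, and spark_packages is a hypothetical helper:

```shell
#!/bin/sh
# Package names taken from the apt-get command above.
spark_packages() {
    echo "spark-core spark-master spark-worker spark-history-server spark-python"
}

if command -v apt-cache >/dev/null 2>&1; then
    for pkg in $(spark_packages); do
        # "apt-cache policy" prints the installed and candidate versions, if any.
        echo "== $pkg"
        apt-cache policy "$pkg"
    done
else
    echo "apt-cache not found; this check only applies to Debian/Ubuntu hosts"
fi
```

If no candidate version is shown for a package, the repositories currently configured on the host do not provide it.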

 

Any guidance on this would be appreciated.

 

After installing Spark, I'd like to activate the Hue Spark Notebook, but I can see that in Hue app_blacklist is set to:

app_blacklist = ['spark', 'zookeeper', 'security']

 

I have removed spark and zookeeper from app_blacklist, leaving only 'security', restarted the Hue service, and refreshed the Hue web UI. The hue.ini dump now shows only 'security', but I still don't have a Spark notebook available. This may be because the notebook depends on the Spark parcels being installed.
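
To double-check which value is actually in the configuration file, the app_blacklist line can be pulled out directly. The sketch below is only illustrative: blacklisted_apps is a hypothetical helper, it assumes the comma-separated app_blacklist=... form used in hue.ini, and it runs against a temporary sample file because the location of the live hue.ini depends on how Hue was installed.

```shell
#!/bin/sh
# Print each app named in the app_blacklist line of a hue.ini-style file, one per line.
blacklisted_apps() {
    sed -n 's/^[[:space:]]*app_blacklist[[:space:]]*=//p' "$1" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//; /^$/d'
}

# Sample file standing in for the real hue.ini (an assumption for illustration).
sample=$(mktemp)
printf '[desktop]\napp_blacklist=spark,zookeeper,security\n' > "$sample"

blacklisted_apps "$sample"
rm -f "$sample"
```

Pointing blacklisted_apps at the real file after the change should list only security if the edit took effect.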

 

If I have to reinstall the cluster on Red Hat Linux to activate the Spark parcels, that is a possibility, but I'd prefer to get everything working on Ubuntu 14.04 first if possible.

 

Any guidance on which route to take would be appreciated.

 

Regards

 

natdacruz