Help required to install Spark on CDH5

Hi,

I have successfully installed CDH5 on Ubuntu 14.04.

However, for Spark only the History Server is running. The Master and Worker are not running.

Do these services need to be started manually?

Please help.

Regards,

Prateek

1 ACCEPTED SOLUTION

When using Spark on YARN, there's no need for the Master or Worker roles.
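
As a rough illustration of the point above, a job can be submitted straight to YARN without any standalone Master or Worker running. This is only a sketch: the `submit_spark_pi` helper and its `DRY_RUN` switch are hypothetical names introduced here, the examples jar path is left to the caller because it depends on how Spark was installed, and older Spark 1.x releases spell the master as `yarn-client` rather than `--master yarn --deploy-mode client`.

```shell
#!/bin/sh
# Sketch: run the bundled SparkPi example on YARN.
# YARN schedules the executors, so no standalone Master/Worker role is involved.
# Assumption: the caller passes the path to the Spark examples jar.
submit_spark_pi() {
  examples_jar="$1"
  # If DRY_RUN is set (hypothetical switch), print the command instead of running it.
  ${DRY_RUN:+echo} spark-submit \
    --master yarn \
    --deploy-mode client \
    --class org.apache.spark.examples.SparkPi \
    "$examples_jar" 10
}
```

Setting `DRY_RUN=1` before calling `submit_spark_pi /path/to/examples.jar` prints the command so it can be checked before anything is sent to the cluster.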

2 REPLIES

Hi,

I'm happy to set up a separate post, but was hoping you could pick this up here. I'm having a similar problem with Spark and Hue on CDH.

I have a running CDH 5.7.1 cluster on Ubuntu 14.04, with all services working fine (apart from Spark and Impala).

It appears that the Spark History Servers and Gateways are installed, but I can't activate Spark in Standalone mode or Spark on YARN. In the Parcels section I am getting errors across a number of services:

  • Error for parcel SPARK-0.9.0-1.cdh4.6.0.p0.98-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.
  • Error for parcel SOLR-1.3.0-1.cdh4.5.0.p0.9-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.
  • Error for parcel IMPALA-2.1.0-1.impala2.0.0.p0.1995-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.
  • Error for parcel ACCUMULO-1.4.4-1.cdh4.5.0.p0.65-trusty : Parcel not available for OS Distribution UBUNTU_TRUSTY.

Having checked, this appears to mean that there isn't an Ubuntu Trusty version of the above parcels.

Can you confirm whether this is the case?

If so, can I install the components via apt-get:

sudo apt-get install spark-core spark-master spark-worker spark-history-server spark-python

as described in this link for CDH 5.4.x:

http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_ig_spark_install.html

Any guidance on this would be appreciated.

After installing Spark, I'd like to activate the Hue Spark Notebook, but I can see that in Hue app_blacklist is set to:

app_blacklist

['spark', 'zookeeper', 'security']

I have removed spark and zookeeper from app_blacklist, leaving 'security', then restarted the Hue service and refreshed the Hue web UI. I can now see only 'security' in the hue.ini dump, but I still don't have a Spark notebook available. This may be due to the dependency on the Spark parcels being installed.
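
For reference, the blacklist check described above could be scripted roughly as follows. This is a minimal sketch, not an official Hue tool: `hue_app_blacklisted` is a hypothetical helper, the path to the hue.ini that Hue actually loads must be supplied by the caller, and the file is assumed to hold the setting as a comma-separated `app_blacklist=` line.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if the given app is still listed in app_blacklist.
# Assumption: $1 is the hue.ini in use, $2 is the app name (e.g. spark).
hue_app_blacklisted() {
  ini_file="$1"
  app="$2"
  # Take the last app_blacklist line, keep the value after '=', strip whitespace.
  value=$(grep -E '^[[:space:]]*app_blacklist[[:space:]]*=' "$ini_file" \
    | tail -n 1 | cut -d= -f2- | tr -d '[:space:]')
  # Wrap in commas so only whole app names match.
  case ",$value," in
    *",$app,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Calling `hue_app_blacklisted /path/to/hue.ini spark` after the restart would show whether the change reached the file Hue reads. It says nothing about whether the Spark notebook's other dependencies are in place.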

 

If I have to reinstall the cluster on Red Hat Linux to activate the Spark parcels, that could be a possibility, but I'd prefer to get everything working on Ubuntu 14.04 first if possible.

Any guidance on which route to take would be appreciated.

Regards,

natdacruz