Is it possible to install only Ambari, YARN and Spark?


New Contributor

Hi,

To avoid Spark's standalone mode and use Ambari to monitor my Spark jobs, I was wondering whether I could set up an HDP cluster with only Ambari + Spark + YARN and no other components (or as few as possible), so that I don't need too many nodes just to benefit from the Ambari/Spark integration through YARN.

Thanks,

Nicolas

Accepted Solutions

Re: Is it possible to install only ambari yarn and spark ?

Contributor

@Nicolas Steinmetz I just tested your use case in my environment. Below are the components you need to select before Ambari lets you move forward:

1. HDFS

2. YARN

3. ZooKeeper

4. MR

5. Hive

6. Pig Client - you can remove this after the installation is done

7. Slider Client - you can remove this after the installation is done

8. Tez Client

9. SmartSense and Ambari Metrics - Ambari shows a warning if you leave these out, but you can bypass it

10. Spark

Note - I tested this with HDP 2.5 and Ambari 2.4.0.1

Please find the attached screenshot for reference.

[Attachments: untitled.png, untitled-1.png, untitled-2.png, untitled-3.png, untitled-4.png]
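
For anyone who prefers to script this rather than click through the install wizard, here is a minimal sketch of how the service list above could be written down as an Ambari blueprint document. It only builds and prints the JSON; it does not talk to Ambari. The component names (NAMENODE, SPARK_CLIENT and so on), the host group name `all_in_one` and the blueprint name are assumptions for illustration and were not part of the test described above, so check them against your stack definition before using them.

```python
import json

# Services reported as required in the answer above (HDP 2.5, Ambari 2.4.0.1).
# ASSUMPTION: the component names per service are illustrative; verify them
# against the stack definition of your own Ambari/HDP version.
REQUIRED_SERVICES = {
    "HDFS": ["NAMENODE", "SECONDARY_NAMENODE", "DATANODE", "HDFS_CLIENT"],
    "YARN": ["RESOURCEMANAGER", "NODEMANAGER", "APP_TIMELINE_SERVER", "YARN_CLIENT"],
    "ZOOKEEPER": ["ZOOKEEPER_SERVER", "ZOOKEEPER_CLIENT"],
    "MAPREDUCE2": ["HISTORYSERVER", "MAPREDUCE2_CLIENT"],
    "HIVE": ["HIVE_METASTORE", "HIVE_SERVER", "HIVE_CLIENT", "MYSQL_SERVER"],
    "PIG": ["PIG"],          # client only, removable after installation
    "SLIDER": ["SLIDER"],    # client only, removable after installation
    "TEZ": ["TEZ_CLIENT"],
    "SPARK": ["SPARK_JOBHISTORYSERVER", "SPARK_CLIENT"],
}


def build_blueprint(name, services=REQUIRED_SERVICES):
    """Return a single-host-group blueprint dict containing every component."""
    components = [
        {"name": component}
        for service_components in services.values()
        for component in service_components
    ]
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": "2.5",
        },
        # Hypothetical layout: everything on one host group.
        "host_groups": [
            {"name": "all_in_one", "cardinality": "1", "components": components}
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_blueprint("minimal-spark-on-yarn"), indent=2))
```

SmartSense and Ambari Metrics are left out on purpose, matching the warning that can be bypassed in item 9.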


Re: Is it possible to install only ambari yarn and spark ?

Contributor

@Nicolas Steinmetz

I believe you would need HDFS, MR and ZooKeeper in addition to YARN and Spark. Ambari will not let you move forward without these components.

Re: Is it possible to install only ambari yarn and spark ?

New Contributor

At the bare minimum, the cluster needs the following components: HDFS (data storage), MR (processing), ZooKeeper (distributed coordination), YARN (resource management), Ambari (component deployment and monitoring), and then Spark for your processing. Ambari will not proceed with the deployment without these components.

Re: Is it possible to install only ambari yarn and spark ?

New Contributor

Hi @lraheja

Thanks for your precise answer (and thanks to everyone else too :) )

A side question: does this setup force me to have a lot of machines? I would like a minimum-sized cluster for this need.

Thanks,

Nicolas

Re: Is it possible to install only ambari yarn and spark ?

Contributor
@Nicolas Steinmetz

It depends on your needs. If dfs.replication is 3 (the default), each block is replicated to 3 DataNodes, so you need at least 3 machines, each running a DataNode, for blocks to be fully replicated. You can configure this value in HDFS, and you need at least that many machines. People usually go for a 5-node cluster: 1 master node, 3 data nodes and 1 edge node (with all the clients on it).

If your replication factor is 2, you can build a cluster with just 2 nodes.
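
The sizing rule of thumb above can be written as a small calculation. This is only a sketch of the reasoning in this reply, not an official sizing formula; the function name `min_machines` and its parameters are made up for illustration.

```python
def min_machines(replication, dedicated_master=False, dedicated_edge=False):
    """Estimate the smallest cluster size for a given dfs.replication.

    Rule of thumb from the reply above: one DataNode machine per replica.
    A dedicated master node and a dedicated edge node each add one machine;
    when they are not dedicated, those roles share the DataNode machines.
    """
    if replication < 1:
        raise ValueError("dfs.replication must be at least 1")
    return replication + int(dedicated_master) + int(dedicated_edge)


if __name__ == "__main__":
    # Default replication, all roles co-located on the DataNode machines.
    print(min_machines(3))
    # The usual layout: 1 master node + 3 data nodes + 1 edge node.
    print(min_machines(3, dedicated_master=True, dedicated_edge=True))
    # Replication factor 2, minimal cluster.
    print(min_machines(2))
```

With co-located roles the count equals the replication factor, which is why lowering dfs.replication is the main lever for shrinking the cluster.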

Re: Is it possible to install only ambari yarn and spark ?

New Contributor

Hi @lraheja

Thanks for the clarification; I'll share this with the rest of the team and we'll see whether we go with this option.

Thanks a lot

Nicolas
