
Spark History Server: How to install/configure?

Contributor

I understand Spark History Server is an independent module (not related to YARN's Job History Server).

I have deployed CDH 5.4 (via parcels), but the Spark History Server is not there.

<Q1> How do I install the Spark History Server? Via parcels or via RPMs?

<Q2> Any special configuration for deploying Spark History Server?

<Q3> What port does the Spark History Server run on?

<Q4> So far I have deployed 1 Spark Master (Master Web UI), several Spark Workers.

          What other 'services' could be deployed?

          For instance, YARN has: ResourceManager Web UI, HistoryServer Web UI, Dynamic Resource Pools.

 

 

1 ACCEPTED SOLUTION

Contributor

Basically, I have to automate these steps via a CM API Python script:

To add the History Server:
1. Go to the Spark service.
2. Click the Instances tab.
3. Click the Add Role Instances button.
4. Select a host in the column under History Server, then click OK.
5. Click Continue.
6. Check the checkbox next to the History Server role.
7. Select Actions for Selected > Start and click Start.
8. Click Close when the action completes.
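
The UI steps above could be scripted roughly as follows with the cm_api Python client. This is only a sketch under assumptions that are not confirmed in this thread: the role type string "SPARK_HISTORY_SERVER", the role name "spark0-HISTORYSERVER-1", and every connection parameter are placeholders. The script checks the role type against service.get_role_types() before creating anything, so a wrong guess fails early.

```python
# Sketch: add and start a History Server role through the CM API (cm_api client).
# ASSUMPTION: the role type string below is a guess; it is checked at run time
# against the role types the service actually offers.
HISTORY_SERVER_ROLE_TYPE = "SPARK_HISTORY_SERVER"


def add_history_server(service, host_id, role_name="spark0-HISTORYSERVER-1"):
    """Create the History Server role on host_id if absent, then start it.

    `service` is a cm_api ApiService (or any object with the same methods).
    `role_name` is a hypothetical name chosen for illustration.
    """
    if HISTORY_SERVER_ROLE_TYPE not in service.get_role_types():
        raise ValueError(
            "role type %s is not offered by this service" % HISTORY_SERVER_ROLE_TYPE
        )

    # Reuse an existing History Server role instead of creating a duplicate.
    existing = service.get_roles_by_type(HISTORY_SERVER_ROLE_TYPE)
    if existing:
        role = existing[0]
    else:
        role = service.create_role(role_name, HISTORY_SERVER_ROLE_TYPE, host_id)

    # Equivalent of "Actions for Selected > Start"; wait for each command.
    for command in service.start_roles(role.name):
        command.wait()
    return role.name


def connect_and_add(cm_host, username, password, cluster_name, service_name, hostname):
    """Look up the service and the target host in CM, then add the role."""
    from cm_api.api_client import ApiResource  # requires the cm-api package

    api = ApiResource(cm_host, username=username, password=password)
    service = api.get_cluster(cluster_name).get_service(service_name)
    host = next(h for h in api.get_all_hosts() if h.hostname == hostname)
    return add_history_server(service, host.hostId)
```

A call might look like connect_and_add("cm-host.example.com", "admin", "admin", "Cluster 1", "spark0", "node4.example.com"), where every argument except the service name spark0 is a made-up placeholder.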

 


5 REPLIES 5

Master Collaborator

The History Server is part of the "Spark" service and is one of the roles you deploy through it. You don't have to configure it specially, but you can, including what port it's on. Normally you would not run a Spark master or worker at all, but just use YARN; I'd advise that. There are no other Spark services besides these three.
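
Once the role is running, a quick way to see which port it listens on is to probe the likely candidates. The sketch below assumes two candidate ports: 18080, the upstream Spark default for spark.history.ui.port, and 18088, which Cloudera Manager deployments commonly use. The port that is actually configured is shown in the History Server role's configuration, so treat both numbers as guesses.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_open_ports(host, candidates=(18080, 18088)):
    """Return the candidate ports that accept connections on host.

    ASSUMPTION: the default candidates are common History Server ports,
    not values confirmed for any particular cluster.
    """
    return [port for port in candidates if is_port_open(host, port)]
```

For example, find_open_ports("history-host.example.com") (a placeholder hostname) returns the subset of candidate ports that answered.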

Contributor

Actually, here is what I have deployed/configured for Spark:

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

scm=> select * from services where service_id = 24;
 service_id | optimistic_lock_version |  name  | service_type | cluster_id | maintenance_count | display_name | generation
------------+-------------------------+--------+--------------+------------+-------------------+--------------+------------
         24 |                      34 | spark0 | SPARK        |         25 |                 0 | spark0       |          1
(1 row)

 

scm=> select role_type, configured_status, host_id from roles where service_id = 24;
  role_type   | configured_status | host_id
--------------+-------------------+---------
 SPARK_WORKER | RUNNING           |       1
 GATEWAY      | NA                |       4
 GATEWAY      | NA                |       5
 GATEWAY      | NA                |       6
 GATEWAY      | NA                |       3
 GATEWAY      | NA                |       1
 GATEWAY      | NA                |       2
 SPARK_WORKER | RUNNING           |       2
 SPARK_WORKER | RUNNING           |       6
 SPARK_WORKER | RUNNING           |       8
 SPARK_WORKER | RUNNING           |       5
 SPARK_WORKER | RUNNING           |       7
 SPARK_WORKER | RUNNING           |       3
 SPARK_MASTER | RUNNING           |       4
(14 rows)

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

 

That tells me that the 'Spark History Server' role is not installed.
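
The same check can be made through the CM API instead of querying the scm database directly. This is a minimal sketch, assuming a cm_api ApiService object for spark0 (its roles expose a type attribute) and assuming "SPARK_HISTORY_SERVER" is the role type name, which is not confirmed in this thread.

```python
from collections import Counter


def role_type_counts(service):
    """Count the roles of a CM service by role type."""
    return Counter(role.type for role in service.get_all_roles())


def has_role_type(service, role_type):
    """Return True if at least one role of role_type exists in the service."""
    return role_type_counts(service)[role_type] > 0
```

With the deployment listed above, role_type_counts would report 7 SPARK_WORKER, 6 GATEWAY and 1 SPARK_MASTER roles, and has_role_type(service, "SPARK_HISTORY_SERVER") would return False.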

Do I have to install it, and if so, how?

 

Thank you!

 

Master Collaborator

Yes, you add this role to a host just like any other service/role in CM. Look at the Spark service. Spark Gateway is a "role" but not a server process, FWIW. It just means spark-submit et al. can be run on that machine.

Contributor

I have been using CM API python scripts for adding Hadoop services into a CDH cluster.

I would like to add the Spark History Server role by calling a script.

Could you please provide me with some samples/links/docs to create it?

 

Thank you!

 
