Created on 09-24-2017 01:35 PM - edited 08-17-2019 11:05 AM
The Spark load testing framework is built on several distributed technologies, including Gatling, Livy, Akka, and HDP. An Akka server powered by Livy (Spark as a Service) provides the following benefits.
Livy is an open source REST interface for interacting with Apache Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in Apache Hadoop YARN.
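As a rough illustration of what "executing snippets of code" over REST looks like, the sketch below builds the two requests an interactive Livy session typically needs: `POST /sessions` to start a session and `POST /sessions/{id}/statements` to run a snippet in it. The host and port (`localhost:8998`, Livy's default port), the object name `LivyRequests`, and the helper `jsonString` are assumptions made for this example and are not part of the framework described here.

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.net.http.HttpRequest.BodyPublishers

object LivyRequests {
  // Assumption: Livy runs locally on its default port 8998.
  val livyBase = "http://localhost:8998"

  // Minimal JSON string escaping (backslashes and double quotes only);
  // newlines and control characters are not handled in this sketch.
  def jsonString(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // POST /sessions asks Livy to start an interactive Spark (Scala) session.
  def createSession: HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$livyBase/sessions"))
      .header("Content-Type", "application/json")
      .POST(BodyPublishers.ofString("""{"kind": "spark"}"""))
      .build()

  // POST /sessions/{id}/statements submits a code snippet to that session.
  def runStatement(sessionId: Int, code: String): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$livyBase/sessions/$sessionId/statements"))
      .header("Content-Type", "application/json")
      .POST(BodyPublishers.ofString(s"""{"code": ${jsonString(code)}}"""))
      .build()
}
```

On Java 11 or later, either request could be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the JSON response carries the session or statement id to poll for results.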
Livy offers three modes to run Spark jobs:
Livy provides the following features:
Livy provides the following advantages:
Gatling is a highly capable load testing tool. It is designed for ease of use, maintainability and high performance. Gatling server provides the following benefits.
Gatling’s architecture is asynchronous as long as the underlying protocol, such as HTTP, can be implemented in a non-blocking way. This architecture lets Gatling implement virtual users as messages instead of dedicated threads, which makes each user very cheap in terms of resources. As a result, running thousands of concurrent virtual users is not an issue.
val theScenarioBuilder = scenario("Interactive Spark Command Scenario Using LIVY Rest Services $sessionId").exec(
  /* myRequest1 is a name that describes the request. */
  http("Interactive Spark Command Simulation")
    .get("/insrun?sessionId=${sessionId}&statement=sparkSession.sql(%22%20select%20event.site_id%20from%20siteexposure_event%20as%20event%20where%20st_intersects(st_makeBBOX(${bbox})%2C%20geom)%20limit%205%20%22).show")
    .check()
).pause(4 second)
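The `statement` query parameter in the request above is a percent-encoded Spark command. The following is a minimal sketch of how such a path could be built for concrete values; the object name `InsrunUrl` and its helpers are hypothetical and not part of the framework. Note that it encodes more characters than the hand-written URL does (for example, parentheses become `%28` and `%29`), which decodes to the same text, and that it should not be applied to Gatling `${...}` placeholders, since those would be encoded too.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

object InsrunUrl {
  // URLEncoder follows HTML form rules and encodes a space as "+".
  // Literal "+" characters become "%2B" first, so replacing the remaining
  // "+" with "%20" is safe and matches the style used in the request above.
  def encode(s: String): String =
    URLEncoder.encode(s, StandardCharsets.UTF_8.name()).replace("+", "%20")

  // Builds the /insrun path for a concrete session id and SQL text.
  def path(sessionId: String, sql: String): String = {
    val statement = "sparkSession.sql(\"" + sql + "\").show"
    s"/insrun?sessionId=${encode(sessionId)}&statement=${encode(statement)}"
  }
}
```

For example, `InsrunUrl.path("0", "select 1")` yields `/insrun?sessionId=0&statement=sparkSession.sql%28%22select%201%22%29.show`.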
So, this is great: we can load test our Spark interactive command with one user! Let’s increase the number of users.
To increase the number of simulated users, all you have to do is to change the configuration of the simulation as follows:
setUp( theScenarioBuilder.inject(atOnceUsers(10)) ).protocols(theHttpProtocolBuilder)
If you want to simulate 3000 users, you might not want them to start at the same time. Indeed, real users are more likely to connect to your web application gradually.
Gatling provides rampUsers to implement this behavior. The ramp duration is the period over which the users are started at a linear rate. In our scenario, let’s ramp up 10 regular users over 10 seconds so that we don’t hammer the Livy server:
setUp( theScenarioBuilder.inject(rampUsers(10) over (10 seconds)) ).protocols(theHttpProtocolBuilder)
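To make "linearly started" concrete, here is a small sketch that computes evenly spaced start offsets for a ramp. It only illustrates the idea; `RampSketch` and `startOffsets` are hypothetical names, and Gatling's actual scheduling may place users slightly differently depending on the version.

```scala
import scala.concurrent.duration._

object RampSketch {
  // Start offset of each user when `users` are spread evenly over `over`.
  // Assumption: the first user starts at time zero and the rest follow
  // at a fixed interval of over / users.
  def startOffsets(users: Int, over: FiniteDuration): Seq[FiniteDuration] = {
    require(users > 0, "users must be positive")
    val interval = over / users.toLong
    (0 until users).map(i => interval * i.toLong)
  }
}
```

With 10 users over 10 seconds, this gives one new user per second, at offsets of 0 through 9 seconds, instead of 10 simultaneous requests hitting the Livy server.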