
Running Livy on HDP 2.5

[Image: 10696-livy1.png]

Ingest Metrics REST API From Livy with Apache NiFi / HDF

[Image: 10697-getlivyhttpstatus1.png]

Use GetHTTP To Ingest The Status On Your Batch From Livy

[Image: 10698-getlivybatchesstatus.png]

Running Livy

First, download Livy from GitHub. Installing it on HDP 2.5 is simple: I found a node that wasn't too busy and put the project there.

Running it is simple:

export SPARK_HOME=/usr/hdp/current/spark-client/ 
export HADOOP_CONF_DIR=/etc/hadoop/conf 
nohup ./livy-server &

That's it: you now have a basic, unprotected Livy instance running. This is important, as there is no security on it. You should either put Knox in front of it or enable Livy's own security.

I wanted to submit a Scala Spark batch job, so I wrote a quick one to have something to call.

Source Code for Example Spark 1.6.2 Batch Application:

Step 1: GetFile

Store the file /opt/demo/sparkrun.js containing the JSON that triggers the Spark job through Livy.

{"file": "/apps/Links.jar","className": "com.dataflowdeveloper.links.Links"}

Step 2: PostHTTP

Make the call to the Livy REST API to submit the Spark job.

Step 3: PutHDFS

Store the results of the call in HDFS.
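Outside of NiFi, the same two HTTP calls (PostHTTP to submit, GetHTTP to poll status) can be sketched in plain Python. This is a hedged sketch: "server" is a hypothetical hostname, and the functions only build the requests so you can inspect them; run them through `urlopen` against a live Livy server.

```python
import json
from urllib import request

# Hypothetical endpoint: replace "server" with your Livy host.
# 8998 is the port the article's Livy instance listens on.
LIVY_URL = "http://server:8998"

def submit_batch(payload):
    """Build the POST to Livy's /batches endpoint (what PostHTTP does)."""
    return request.Request(
        LIVY_URL + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def batch_status(batch_id):
    """Build the GET for a single batch's state (what GetHTTP polls)."""
    return request.Request("{}/batches/{}".format(LIVY_URL, batch_id))

# Against a live server you would run, e.g.:
#   resp = request.urlopen(submit_batch(
#       {"file": "/apps/Links.jar",
#        "className": "com.dataflowdeveloper.links.Links"}))
```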

Livy Logs

16/12/21 22:50:25 INFO LivyServer: Using spark-submit version 1.6.2
16/12/21 22:50:25 WARN RequestLogHandler: !RequestLog
16/12/21 22:50:25 INFO WebServer: Starting server on http://tspanndev11.field.hortonworks.com:8998
16/12/21 22:51:20 INFO SparkProcessBuilder: Running '/usr/hdp/current/spark-client/bin/spark-submit' '--name' 'Livy' '--class' 'com.dataflowdeveloper.links.Links' 'hdfs://hadoopserver:8020/opt/demo/links.jar' '/linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json'
16/12/21 22:51:20 INFO SessionManager: Registering new session 0

Spark Compiled JAR File Must Be Deployed to HDFS and Be Readable

hdfs dfs -put Links.jar /apps
hdfs dfs -chmod 777 /apps/Links.jar

Checking YARN for Our Application

yarn application --list

Submitting a Scala Spark Job Normal Style

/usr/hdp/current/spark-client/bin/spark-submit --class "com.dataflowdeveloper.links.Links" --master yarn --deploy-mode cluster /opt/demo/Links.jar

Deploying a Scala Spark Application Built With SBT

scp target/scala-2.10/links.jar user@server:/opt/demo

Reference:

Livy REST API

To Submit to Livy from the Command Line

curl -X POST --data '{"file": "/opt/demo/links.jar","className": "com.dataflowdeveloper.links.Links","args": ["/linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json"]}' \
-H "Content-Type: application/json" http://server:8998/batches

NiFi Template

livy.xml

Comments
Expert Contributor

Nice! BTW, HDP 2.5 has Livy built-in. It can be found under the Spark service in Ambari.

Super Guru

That Livy is only for Zeppelin; it's not safe to use it for general workloads.

In HDP 2.6, there will be a Livy available for general usage.

Rising Star

How to submit a python spark job with kerberos keytab and principal ?

Super Guru

Livy supports that; it is now a first-class citizen in HDP. I have not tried it myself, but post a question.

Super Guru

default port is 8998