Running Livy on HDP 2.5

[Image: livy1.png]

Ingest Metrics REST API From Livy with Apache NiFi / HDF

[Image: getlivyhttpstatus1.png]

Use GetHTTP To Ingest The Status On Your Batch From Livy

[Image: getlivybatchesstatus.png]

Running Livy

The first step is to download Livy from GitHub; installing it on HDP 2.5 is simple. I found a node that wasn't too busy and put the project there.

Running it is just as simple:

# Point Livy at the HDP Spark client and Hadoop configuration
export SPARK_HOME=/usr/hdp/current/spark-client/
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Start the Livy server in the background
nohup ./livy-server &
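Once the server is up, you can confirm it is listening (the default port is 8998, as the server log further down shows) with a quick request to the sessions endpoint; a fresh instance returns a JSON document with an empty session list:

curl http://localhost:8998/sessions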

That's it, you now have a basic, unprotected Livy instance running. This is important: there is no security on it. You should either put Knox in front of it or enable Livy's own security.
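As a sketch of the second option: Livy's Kerberos (SPNEGO) authentication is configured in conf/livy.conf. The key names below follow the Apache Livy documentation; the principal and keytab paths are placeholders, and older livy.io builds may use different names, so check the livy.conf.template that ships with your version.

# Enable SPNEGO/Kerberos authentication on the Livy REST endpoint
livy.server.auth.type = kerberos
livy.server.auth.kerberos.principal = HTTP/_HOST@EXAMPLE.COM
livy.server.auth.kerberos.keytab = /etc/security/keytabs/spnego.service.keytab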

I wanted to submit a Scala Spark batch job, so I wrote a quick one to have something to call.

Source Code for Example Spark 1.6.2 Batch Application:
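The full source didn't make it into this post, so the following is only a minimal sketch of what a Spark 1.6.2 batch class with this entry point might look like. The package and class name match the job submitted below; the body (reading the JSON file that Livy passes in as args(0) and counting its records) is illustrative.

package com.dataflowdeveloper.links

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object Links {
  def main(args: Array[String]): Unit = {
    // Spark 1.6-style entry points: SparkContext plus SQLContext
    val conf = new SparkConf().setAppName("Links")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Livy passes the JSON file path via the "args" field of the POST body
    val df = sqlContext.read.json(args(0))
    df.printSchema()
    println(s"Processed ${df.count()} records from ${args(0)}")

    sc.stop()
  }
}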

The NiFi flow shown above has three steps.

Step 1: GetFile

Store the file /opt/demo/sparkrun.js containing the JSON that triggers Spark through Livy:

{"file": "/apps/Links.jar","className": "com.dataflowdeveloper.links.Links"}

Step 2: PostHTTP. Call the Livy REST API (POST to /batches) to submit the Spark job.

Step 3: PutHDFS. Store the results of the call in Hadoop HDFS.
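The GetHTTP processors shown at the top poll these same endpoints for batch status; the equivalent command-line calls are below (the batch id 0 matches the session Livy registers in the log that follows):

curl http://server:8998/batches
curl http://server:8998/batches/0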

Livy Logs

16/12/21 22:50:25 INFO LivyServer: Using spark-submit version 1.6.2
16/12/21 22:50:25 WARN RequestLogHandler: !RequestLog
16/12/21 22:50:25 INFO WebServer: Starting server on http://tspanndev11.field.hortonworks.com:8998
16/12/21 22:51:20 INFO SparkProcessBuilder: Running '/usr/hdp/current/spark-client/bin/spark-submit' '--name' 'Livy' '--class' 'com.dataflowdeveloper.links.Links' 'hdfs://hadoopserver:8020/opt/demo/links.jar' '/linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json'
16/12/21 22:51:20 INFO SessionManager: Registering new session 0

Spark Compiled JAR File Must Be Deployed to HDFS and Be Readable

hdfs dfs -put Links.jar /apps
hdfs dfs -chmod 777 /apps/Links.jar

Checking YARN for Our Application

yarn application -list

Submitting a Scala Spark Job Normal Style

/usr/hdp/current/spark-client/bin/spark-submit --class "com.dataflowdeveloper.links.Links" --master yarn --deploy-mode cluster /opt/demo/Links.jar

Deploying a Scala Spark Application Built With SBT

scp target/scala-2.10/links.jar user@server:/opt/demo
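The build definition isn't shown in the post; a minimal build.sbt for a Spark 1.6.2 job on Scala 2.10 might look like the following (running sbt package then produces the jar under target/scala-2.10/, though the exact artifact name depends on your settings):

name := "links"

version := "1.0"

scalaVersion := "2.10.6"

// Spark is supplied by the cluster at runtime, so mark it "provided"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.2" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.6.2" % "provided"
)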

Reference:

Livy REST API

To Submit to Livy from the Command Line

curl -X POST --data '{"file": "/opt/demo/links.jar","className": "com.dataflowdeveloper.links.Links","args": ["/linkprocessor/379875e9-5d99-4f88-82b1-fda7cdd7bc98.json"]}' \
  -H "Content-Type: application/json" http://server:8998/batches

NiFi Template

livy.xml

Comments
Expert Contributor

Nice! BTW, HDP 2.5 has Livy built-in. It can be found under the Spark service in Ambari.

Super Guru

That Livy is only for Zeppelin; it's not safe to use it.

In HDP 2.6, there will be a Livy available for general usage.

Rising Star

How do you submit a Python Spark job with a Kerberos keytab and principal?

Super Guru

Livy supports that now that it is a full citizen in HDP. I have not tried it, but post it as a question.

Super Guru

The default port is 8998.
