05-15-2021
10:51 AM
How to create and test Livy interactive sessions
The following example shows how to create a Livy session and print the Spark version:
Create a session: curl -X POST --data '{"kind": "spark"}' -H "Content-Type: application/json" http://172.25.41.3:8998/sessions
Wait for the application to spawn, then submit a statement (replace 25 with your session ID): curl -X POST --data '{"kind": "spark","code":"sc.version"}' -H "Content-Type: application/json" http://172.25.41.3:8998/sessions/25/statements
Fetch the result (again replacing the session ID; GET requests take no request body): curl -X GET http://172.25.41.3:8998/sessions/25/statements
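The statement is executed asynchronously, so in practice you poll the statements endpoint until its state becomes "available". A minimal Python sketch of that loop follows; the `fetch` callable stands in for the actual HTTP GET (e.g. via `requests`), which is not shown here:

```python
import json
import time

# Endpoint from the curl examples above; adjust for your environment.
LIVY_URL = "http://172.25.41.3:8998"

def statement_payload(code, kind="spark"):
    """Build the JSON body for POST /sessions/{id}/statements."""
    return json.dumps({"kind": kind, "code": code})

def poll_statement(fetch, interval=1.0, max_tries=30):
    """Poll until Livy reports the statement 'available', then return its output.

    `fetch` is any zero-argument callable returning the decoded statement
    JSON, e.g. `lambda: requests.get(url).json()`.
    """
    for _ in range(max_tries):
        stmt = fetch()
        if stmt.get("state") == "available":
            return stmt["output"]
        time.sleep(interval)
    raise TimeoutError("statement did not reach 'available' state")
```

The separation of payload-building from polling makes the loop easy to test without a live Livy server.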
For the full set of interactive session properties, see the Livy REST API documentation linked in the references below.
How to test batch applications using the Livy API
The following is a SparkPi test job submitted through the Livy API:
Before submitting the SparkPi job through Livy, upload the required jar files to HDFS. This is the main difference from spark-submit: Livy cannot pick up jars from the client's local filesystem.
curl -H "Content-Type: application/json" http://172.25.xx.xx:8998/batches -X POST --data ' { "className": "org.apache.spark.examples.SparkPi", "conf": {"spark.executor.memory": "1g"}, "args": [10], "file": "/user/hdfs/spark-examples_2.11-2.4.0.7.1.4.0-203.jar"}'
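The batch request body can also be assembled programmatically before POSTing it. The sketch below is a small helper for building that JSON, assuming the jar has already been uploaded to HDFS (e.g. with `hdfs dfs -put`); the function name and signature are illustrative, not part of any Livy client library:

```python
import json

def batch_payload(file, class_name, args=None, conf=None, **extra):
    """Build the JSON body for POST /batches.

    `file` must be an HDFS path reachable by Livy; upload the jar first,
    e.g. `hdfs dfs -put spark-examples.jar /user/hdfs/`. Extra keyword
    arguments (jars, files, proxyUser, queue, name, ...) pass through as-is.
    """
    body = {"file": file, "className": class_name}
    if args is not None:
        body["args"] = args
    if conf is not None:
        body["conf"] = conf
    body.update(extra)
    return json.dumps(body)
```

For example, the SparkPi request above corresponds to `batch_payload("/user/hdfs/spark-examples_2.11-2.4.0.7.1.4.0-203.jar", "org.apache.spark.examples.SparkPi", args=[10], conf={"spark.executor.memory": "1g"})`, which you would then POST to the /batches endpoint.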
Batch session APIs operate on batch objects; the Batch Request reference below lists their fields.
How to pass job-specific options in POST /batches, like those we pass to Spark jobs through the Spark REPL
curl --negotiate -u:$USER ${LIVY_SERVER}:${LIVY_PORT}/batches -X POST -H 'Content-Type: application/json' -d '{
"file": "hdfs:///user/livy/depend-jars/example.jar",
"proxyUser": "sandip",
"className": "SparkWordCount",
"queue": "default",
"name": "SparkWordCount",
"jars":["hdfs:///user/livy/depend-jars/hbase-client.jar","hdfs:///user/livy/depend-jars/hbase-common.jar"],
"files":["hdfs:///user/livy/depend-files/hbase-site.xml","hdfs:///user/livy/depend-files/hive-site.xml"],
"conf": {
"spark.driver.memory": "1g",
"spark.yarn.driver.memoryOverhead": "256",
"spark.executor.instances": "2",
"spark.executor.memory": "1g",
"spark.yarn.executor.memoryOverhead": "256",
"spark.executor.cores": "1",
"spark.memory.fraction": "0.2"
},
"args":["10"]
}'
The following source files document the configuration fields you can pass:
Batch Request:
https://github.com/cloudera/livy/blob/master/server/src/main/scala/com/cloudera/livy/server/batch/CreateBatchRequest.scala
Interactive Request:
https://github.com/cloudera/livy/blob/master/server/src/main/scala/com/cloudera/livy/server/interactive/CreateInteractiveRequest.scala
References:
Running an interactive session with the Livy API
Submitting batch applications using the Livy API
https://livy.apache.org/
05-15-2021
07:59 AM
Please verify that the krb5.conf file on the CDSW master and worker nodes matches the one on the other HDP nodes. Verify that the CDSW master host can communicate with the KDC server, and check the nslookup result (forward and reverse) for the KDC server from the master host. Verify that the REALM (domain) settings in krb5.conf are the same, and that the CDSW nodes are covered by the domain_realm mappings. As a test, try replacing the KDC server's hostname with its IP address in krb5.conf.
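As a concrete illustration, a consistent krb5.conf maps the hosts' DNS domain to the same realm on every node. All names below (EXAMPLE.COM, kdc.example.com) are placeholders for your environment:

```
[libdefaults]
  default_realm = EXAMPLE.COM

[realms]
  EXAMPLE.COM = {
    # hostname of the KDC; as a test, this can be replaced with its IP address
    kdc = kdc.example.com
    admin_server = kdc.example.com
  }

[domain_realm]
  # these mappings must cover the domain of the CDSW hosts
  .example.com = EXAMPLE.COM
  example.com = EXAMPLE.COM
```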