Member since 03-06-2018 · 225 Posts · 2 Kudos Received · 0 Solutions
10-22-2025
05:34 AM
1 Kudo
Hello David,

I believe HWC is still supported on CDP 7.3.1. See the HWC documentation for that release:
https://docs.cloudera.com/cdp-private-cloud-base/7.3.1/integrating-hive-and-bi/topics/hive_hivewarehouseconnector_for_handling_apache_spark_data.html

The HWC libraries are also available in the CDH parcels:

# ls -ltr /opt/cloudera/parcels/CDH/lib/hwc_for_spark3/
total 8
drwxr-xr-x 6 root root   87 Dec  4  2024 SparklyrHWC
drwxr-xr-x 2 root root   31 Dec  4  2024 conf
lrwxrwxrwx 1 root root   40 Dec  4  2024 pyspark_hwc-spark3.zip -> pyspark_hwc-spark3-1.0.0.7.3.1.0-197.zip
-rw-r--r-- 1 root root 6116 Dec  4  2024 pyspark_hwc-spark3-1.0.0.7.3.1.0-197.zip
lrwxrwxrwx 1 root root   62 Dec  4  2024 hive-warehouse-connector-spark3-assembly.jar -> hive-warehouse-connector-spark3-assembly-1.0.0.7.3.1.0-197.jar
lrwxrwxrwx 1 root root   73 Dec  4  2024 hive-warehouse-connector-spark3-assembly-1.0.0.7.3.1.0-197.jar -> ../../jars/hive-warehouse-connector-spark3-assembly-1.0.0.7.3.1.0-197.jar

Please follow the below documentation to configure HWC with CDP:
https://docs.cloudera.com/cdp-private-cloud-base/7.3.1/integrating-hive-and-bi/topics/hive-hwc-reader-mode.html
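The parcel listing above can be turned into spark3-submit options. A minimal sketch, assuming the default parcel location shown above (the helper name hwc_submit_opts is my own, not part of any Cloudera tooling):

```python
import os

# Parcel location from the listing above
HWC_DIR = "/opt/cloudera/parcels/CDH/lib/hwc_for_spark3"

def hwc_submit_opts(hwc_dir=HWC_DIR):
    # Point spark3-submit at the HWC assembly jar and the pyspark zip;
    # both names are unversioned symlinks to the versioned parcel files.
    jar = os.path.join(hwc_dir, "hive-warehouse-connector-spark3-assembly.jar")
    pyfiles = os.path.join(hwc_dir, "pyspark_hwc-spark3.zip")
    return ["--jars", jar, "--py-files", pyfiles]

print(" ".join(hwc_submit_opts()))
```

The remaining HWC properties (reader mode, HiveServer2 URL, and so on) come from the configuration documentation linked above.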
09-21-2022
11:38 PM
Hello @Boron,

I believe you are using HDP 3.x. Note that there is no Spark 1.x available in HDP 3; we need to use Spark 2.x. Set SPARK_HOME to the Spark 2 client:

export SPARK_HOME=/usr/hdp/current/spark2-client
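If a Python script (rather than the shell) launches spark-submit, the same setting can be applied programmatically. A small sketch mirroring the export above; the path is the standard HDP 3 Spark 2 client location:

```python
import os

# Mirror of the shell export above, for scripts that spawn spark-submit;
# HDP 3 ships the Spark 2 client at this path.
os.environ["SPARK_HOME"] = "/usr/hdp/current/spark2-client"
print(os.environ["SPARK_HOME"])
```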
08-19-2022
03:37 AM
Hello @yagoaparecidoti,

Can you please share the output of the below commands?

From the Python shell:
r0.headers["www-authenticate"]

From bash:
# kinit
# klist
# date
# curl -v -u: --negotiate -X GET http://<LIVY_NODE>:<PORT>/batches/
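What to look for in that header: on a Kerberized Livy endpoint, the 401 challenge should advertise the Negotiate (SPNEGO) scheme. A small hypothetical helper (auth_scheme is my own name, not part of Livy or requests) that extracts the scheme from a WWW-Authenticate value:

```python
def auth_scheme(www_authenticate):
    # The first token of a WWW-Authenticate value is the scheme,
    # e.g. "Negotiate" or 'Basic realm="livy"'.
    if not www_authenticate:
        return None
    return www_authenticate.split(None, 1)[0]

# A Kerberized endpoint should advertise SPNEGO:
print(auth_scheme("Negotiate"))           # Negotiate
print(auth_scheme('Basic realm="livy"'))  # Basic
```

If the scheme is anything other than Negotiate, requests_kerberos cannot complete the handshake.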
08-17-2022
03:57 AM
We can also try running the below Python code:

1. Run the kinit command.
2. Run the code in the Python shell:

import json, pprint, requests, textwrap
from requests_kerberos import HTTPKerberosAuth
host='http://localhost:8998'
headers = {'Requested-By': 'livy','Content-Type': 'application/json','X-Requested-By': 'livy'}
auth=HTTPKerberosAuth()
data={'className': 'org.apache.spark.examples.SparkPi','jars': ["/tmp/spark-examples_2.11-2.4.7.7.1.7.1000-141.jar"],'name': 'livy-test1', 'file': 'hdfs:///tmp/spark-examples_2.11-2.4.7.7.1.7.1000-141.jar','args': ["10"]}
r0 = requests.post(host + '/batches', data=json.dumps(data), headers=headers, auth=auth)
r0.json()
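The request body above can be factored into a small helper so the same shape is reused for other jars and classes. A sketch (livy_batch_body is a hypothetical name of my own; the fields mirror the data dict above):

```python
import json

def livy_batch_body(file_path, class_name, jars, args, name):
    # Same fields as the 'data' dict above; 'file' is the HDFS path to the
    # application jar, and args are passed as strings, matching that dict.
    return json.dumps({
        "className": class_name,
        "jars": jars,
        "name": name,
        "file": file_path,
        "args": [str(a) for a in args],
    })

body = livy_batch_body(
    "hdfs:///tmp/spark-examples_2.11-2.4.7.7.1.7.1000-141.jar",
    "org.apache.spark.examples.SparkPi",
    ["/tmp/spark-examples_2.11-2.4.7.7.1.7.1000-141.jar"],
    [10],
    "livy-test1",
)
print(json.loads(body)["className"])  # org.apache.spark.examples.SparkPi
```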
08-17-2022
02:51 AM
Hello @yagoaparecidoti:

- Did we get a valid Kerberos ticket before running the code?
- Does the klist command show a valid expiry date?
- Check the output of the below code after running your code:

r0.headers["www-authenticate"]

Are we able to run the sample Livy job using the curl command? Steps to run the sample job:

1. Copy the JAR to HDFS:
# hdfs dfs -put /opt/cloudera/parcels/CDH/jars/spark-examples<VERSION>.jar /tmp

2. Make sure the JAR is present:
# hdfs dfs -ls /tmp/

3. Run the Spark job through the Livy API with curl:
# curl -v -u: --negotiate -X POST --data '{"className": "org.apache.spark.examples.SparkPi", "jars": ["/tmp/spark-examples<VERSION>.jar"], "name": "livy-test", "file": "hdfs:///tmp/spark-examples<VERSION>.jar", "args": [10]}' -H "Content-Type: application/json" -H "X-Requested-By: User" http://<LIVY_NODE>:<PORT>/batches

4. Check for running and completed Livy sessions:
# curl http://<LIVY_NODE>:<PORT>/batches/ | python -m json.tool

NOTE:
* Change the JAR version (<VERSION>) according to your CDP version.
* Replace LIVY_NODE and PORT with the actual values.
* If you are running the cluster in secure mode, make sure you have a valid Kerberos ticket and use Kerberos authentication in the curl command.
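The GET in step 4 returns JSON of the form {"from": ..., "total": ..., "sessions": [...]}. A hypothetical helper (batch_states is my own name) to summarise each batch's state from that payload:

```python
import json

def batch_states(batches_json):
    # Map batch id -> state from a GET /batches/ response (step 4 above).
    return {s["id"]: s["state"] for s in batches_json.get("sessions", [])}

# Sample payload in the shape Livy returns for GET /batches/
sample = json.loads('{"from": 0, "total": 1, "sessions": [{"id": 3, "state": "success"}]}')
print(batch_states(sample))  # {3: 'success'}
```

A batch normally moves through states such as starting and running before ending in success or dead.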
02-02-2022
08:26 AM
Hello @loridigia,

I tried running the sample job as below and see only one executor container and one driver container:

# cd /usr/hdp/current/spark2-client
# su spark
$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 2 examples/jars/spark-examples*.jar 10000

Even spark-shell limits the containers to one executor when we pass --num-executors 1:

$ spark-shell --num-executors 1

- What is the spark-submit command you are trying to run?
- Are you seeing the same issue with the above sample job?
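As a sanity check on the YARN container count: with dynamic allocation disabled, a batch application holds one container per executor plus one for the ApplicationMaster, which hosts the driver in cluster deploy mode. A trivial illustration (the helper name is my own):

```python
def expected_containers(num_executors):
    # Executor containers plus one ApplicationMaster/driver container
    # (cluster deploy mode, dynamic allocation disabled).
    return num_executors + 1

print(expected_containers(1))  # 2 -> matches the one-executor, one-driver run above
```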
01-06-2022
04:48 AM
- Have we enabled SSL for the Spark UI?
- If not, are we able to access the URL over plain HTTP? Also try a different browser and Private Browsing mode.
- Have we tried accessing the URL with the curl command?
- Do we have SPNEGO enabled for the cluster?