PySpark job that reads from Phoenix, submitted through Oozie from Hue, fails with ClassNotFoundException: org.apache.spark.sql.DataFrame. Please help!

Environment:

OS: CentOS 7.2 64 bit

Ambari: 2.6.2.x

HDP: 2.6.5.x

HUE: 4.1.0 (manually installed)

Ambari and Oozie Configs:

Please see picture 1.

Note:

1. The Spark sharelib is already set to spark2 with this config:

oozie.action.sharelib.for.spark=spark2

2. The Oozie sharelib works fine, as the output of the oozie command shows:

[root@master1 python]# sudo -u oozie oozie admin --shareliblist
[Available ShareLib]
hive
spark2
distcp
backup
mapreduce-streaming
spark
oozie
hcatalog
hive2
sqoop
pig
spark_orig


[root@master1 python]# sudo -u oozie oozie admin --shareliblist spark2
[Available ShareLib]
spark2
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/JavaEWAH-0.3.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/RoaringBitmap-0.5.11.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/ST4-4.0.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/activation-1.1.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/aircompressor-0.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/antlr-2.7.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/antlr-runtime-3.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/antlr4-runtime-4.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/aopalliance-1.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/aopalliance-repackaged-2.4.0-b34.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/apache-log4j-extras-1.2.17.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/apacheds-i18n-2.0.0-M15.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/apacheds-kerberos-codec-2.0.0-M15.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/api-asn1-api-1.0.0-M20.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/api-util-1.0.0-M20.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/arpack_combined_all-0.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/arrow-format-0.8.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/arrow-memory-0.8.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/arrow-vector-0.8.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/avro-1.7.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/avro-ipc-1.7.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/avro-mapred-1.7.7-hadoop2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/aws-java-sdk-core-1.10.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/aws-java-sdk-kms-1.10.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/aws-java-sdk-s3-1.10.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/azure-data-lake-store-sdk-2.1.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/azure-keyvault-core-0.8.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/azure-storage-5.4.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/base64-2.3.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/bcprov-jdk15on-1.58.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/bonecp-0.8.0.RELEASE.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/breeze-macros_2.11-0.13.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/breeze_2.11-0.13.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/calcite-avatica-1.2.0-incubating.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/calcite-core-1.2.0-incubating.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/calcite-linq4j-1.2.0-incubating.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/chill-java-0.8.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/chill_2.11-0.8.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-beanutils-1.7.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-beanutils-core-1.8.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-cli-1.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-codec-1.10.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-collections-3.2.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-compiler-3.0.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-compress-1.4.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-configuration-1.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-crypto-1.0.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-dbcp-1.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-digester-1.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-httpclient-3.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-io-2.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-lang-2.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-lang3-3.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-logging-1.1.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-math3-3.4.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-net-2.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/commons-pool-1.5.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/compress-lzf-1.0.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/core-1.1.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/curator-client-2.7.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/curator-framework-2.7.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/curator-recipes-2.7.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/datanucleus-api-jdo-3.2.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/datanucleus-core-3.2.10.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/datanucleus-rdbms-3.2.9.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/derby-10.12.1.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/eigenbase-properties-1.1.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/flatbuffers-1.2.0-3f79e055.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/gcs-connector-1.8.1.2.6.5.0-292-shaded.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/gson-2.2.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/guava-14.0.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/guice-3.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/guice-servlet-3.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-annotations-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-auth-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-aws-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-azure-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-azure-datalake-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-client-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-common-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-hdfs-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-mapreduce-client-app-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-mapreduce-client-common-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-mapreduce-client-core-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-mapreduce-client-jobclient-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-mapreduce-client-shuffle-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-openstack-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-yarn-api-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-yarn-client-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-yarn-common-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-yarn-registry-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-yarn-server-common-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hadoop-yarn-server-web-proxy-2.7.3.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hive-beeline-1.21.2.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hive-cli-1.21.2.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hive-exec-1.21.2.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hive-jdbc-1.21.2.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hive-metastore-1.21.2.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hk2-api-2.4.0-b34.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hk2-locator-2.4.0-b34.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hk2-utils-2.4.0-b34.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/hppc-0.7.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/htrace-core-3.1.0-incubating.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/httpclient-4.5.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/httpcore-4.4.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/ivy-2.4.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-annotations-2.6.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-core-2.6.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-core-asl-1.9.13.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-databind-2.6.7.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-dataformat-cbor-2.6.7.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-jaxrs-1.9.13.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-mapper-asl-1.9.13.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-module-paranamer-2.7.9.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-module-scala_2.11-2.6.7.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jackson-xc-1.9.13.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/janino-3.0.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/java-xmlbuilder-1.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javassist-3.18.1-GA.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javax.annotation-api-1.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javax.inject-1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javax.inject-2.4.0-b34.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javax.servlet-api-3.1.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javax.ws.rs-api-2.0.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/javolution-5.5.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jaxb-api-2.2.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jcip-annotations-1.0-1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jcl-over-slf4j-1.7.16.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jdo-api-3.0.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-client-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-common-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-container-servlet-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-container-servlet-core-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-guava-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-media-jaxb-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jersey-server-2.22.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jets3t-0.9.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jetty-6.1.26.hwx.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jetty-sslengine-6.1.26.hwx.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jetty-util-6.1.26.hwx.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jline-2.12.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/joda-time-2.9.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jodd-core-3.5.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jpam-1.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/json-smart-1.3.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/json4s-ast_2.11-3.2.11.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/json4s-core_2.11-3.2.11.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/json4s-jackson_2.11-3.2.11.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jsp-api-2.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jsr305-1.3.9.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jta-1.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jtransforms-2.4.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/jul-to-slf4j-1.7.16.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/kryo-shaded-3.0.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/leveldbjni-all-1.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/libfb303-0.9.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/libthrift-0.9.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/log4j-1.2.17.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/lz4-java-1.4.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/machinist_2.11-0.6.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/macro-compat_2.11-1.1.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/metrics-core-3.1.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/metrics-graphite-3.1.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/metrics-json-3.1.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/metrics-jvm-3.1.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/minlog-1.3.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/netty-3.9.9.Final.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/netty-all-4.1.17.Final.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/nimbus-jose-jwt-4.41.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/objenesis-2.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/okhttp-2.7.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/okio-1.6.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/oozie-sharelib-spark-4.2.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/opencsv-2.3.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/orc-core-1.4.3.2.6.5.0-292-nohive.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/orc-mapreduce-1.4.3.2.6.5.0-292-nohive.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/oro-2.0.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/osgi-resource-locator-1.0.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/paranamer-2.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-column-1.8.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-common-1.8.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-encoding-1.8.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-format-2.3.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-hadoop-1.8.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-hadoop-bundle-1.6.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/parquet-jackson-1.8.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/protobuf-java-2.5.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/py4j-0.10.6-src.zip
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/py4j-0.10.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/pyrolite-4.13.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/pyspark.zip
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/scala-compiler-2.11.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/scala-library-2.11.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/scala-parser-combinators_2.11-1.0.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/scala-reflect-2.11.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/scala-xml_2.11-1.0.5.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/scalap-2.11.8.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/shapeless_2.11-2.3.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/slf4j-api-1.7.16.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/slf4j-log4j12-1.7.16.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/snappy-0.2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/snappy-java-1.1.2.6.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-catalyst_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-cloud_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-core_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-graphx_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-hadoop-cloud_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-hive-thriftserver_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-hive_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-kvstore_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-launcher_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-mllib-local_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-mllib_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-network-common_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-network-shuffle_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-repl_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-sketch_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-sql_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-streaming_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-tags_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-unsafe_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spark-yarn_2.11-2.3.0.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spire-macros_2.11-0.13.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/spire_2.11-0.13.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/stax-api-1.0-2.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/stax-api-1.0.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/stream-2.7.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/stringtemplate-3.2.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/super-csv-2.2.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/univocity-parsers-2.5.9.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/validation-api-1.1.0.Final.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/xbean-asm5-shaded-4.4.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/xercesImpl-2.9.1.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/xmlenc-0.52.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/xz-1.0.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/zookeeper-3.4.6.2.6.5.0-292.jar
 hdfs://hps/user/oozie/share/lib/lib_20180716114151/spark2/zstd-jni-1.3.2-2.jar


[root@master1 python]# 
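
A listing this long is easier to check mechanically. As a minimal sketch (the helper name `scala_artifacts` and the regular expression are illustrative assumptions, not part of Oozie), the following parses the sharelib listing text and reports the Scala and artifact versions of the Scala-suffixed jars, which helps confirm that the spark2 sharelib really carries Spark 2.3.0 built for Scala 2.11:

```python
import re

# Matches e.g. ".../spark2/spark-sql_2.11-2.3.0.2.6.5.0-292.jar"
# -> artifact "spark-sql", scala "2.11", version "2.3.0.2.6.5.0-292"
_JAR_RE = re.compile(
    r"/(?P<artifact>[A-Za-z][\w.-]*?)_(?P<scala>\d+\.\d+)-(?P<version>\d[\w.-]*)\.jar$"
)


def scala_artifacts(listing):
    """Return {artifact: (scala_version, artifact_version)} for Scala-suffixed jars."""
    found = {}
    for line in listing.splitlines():
        match = _JAR_RE.search(line.strip())
        if match:
            found[match.group("artifact")] = (
                match.group("scala"),
                match.group("version"),
            )
    return found
```

Feeding it the saved output of the command above and looking up `"spark-sql"` or `"spark-core"` gives the Spark build that Oozie will put on the classpath; jars without a Scala suffix (for example `guava-14.0.1.jar`) are ignored by design.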

PySpark code (modified from pi.py):

from __future__ import print_function

import sys
from random import random
from operator import add

from pyspark.sql import SparkSession


if __name__ == "__main__":
    """
        Usage: pi [partitions]
    """
    spark = SparkSession.builder.appName("PythonPi").getOrCreate()
    partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
    n = 100000 * partitions
    def f(_):
        x = random() * 2 - 1
        y = random() * 2 - 1
        return 1 if x ** 2 + y ** 2 <= 1 else 0
    count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / n))

    # test add
    countDF = spark.read.format("org.apache.phoenix.spark").option("table", "TBL_WEB_COUNT").option("zkUrl", "master1.com:2181").load()
    countDF.show()

    spark.stop()

Note: the code below "# test add" is the part that interacts with Phoenix.

HUE Job Config:

Please see pictures 2 and 3.

Workflow XML:

<workflow-app name="SparkPI-py2-Workflow" xmlns="uri:oozie:workflow:0.5">
    <start to="spark-1321"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="spark-1321">
        <spark xmlns="uri:oozie:spark-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn</master>
            <mode>cluster</mode>
            <name>SparkPI-py2-Workflow</name>
            <jar>pi-cus.py</jar>
            <spark-opts>--jars hdfs:///job/mh/phoenix-spark2.jar,hdfs:///job/mh/phoenix-client.jar,hdfs:///job/mh/postgresql-42.2.2.jar</spark-opts>
            <arg>10</arg>
            <file>/util/pi-cus.py#pi-cus.py</file>
        </spark>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

Note: the dependency jars are added with --jars.
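
Since the jars passed with `--jars` are what supply the Phoenix connector classes, it may help to check what each one actually contains. In Spark 2.x, `org.apache.spark.sql.DataFrame` is a Scala type alias for `Dataset[Row]` rather than a class, so bytecode that still refers to it as a class was presumably compiled against Spark 1.x. One possible explanation for the exception in the title, not confirmed anywhere in this post, is that more than one of these jars bundles the `org.apache.phoenix.spark` package and the Spark 1.x build gets loaded. The sketch below assumes local copies of the jars; the helper name `scan_jar` is hypothetical. It lists the Phoenix Spark classes in a jar and flags those whose bytecode mentions the `DataFrame` class name:

```python
import re
import zipfile

PHOENIX_SPARK_PREFIX = "org/apache/phoenix/spark/"
# Class files store referenced class names as slash-separated UTF-8 constants.
# The lookahead avoids false hits on DataFrameReader, DataFrameWriter, etc.
_DATAFRAME_REF = re.compile(rb"org/apache/spark/sql/DataFrame(?![A-Za-z0-9_$])")


def scan_jar(jar):
    """Return {class_entry: mentions_DataFrame} for Phoenix Spark classes.

    `jar` is a path to a local jar file or a binary file-like object.
    """
    report = {}
    with zipfile.ZipFile(jar) as archive:
        for name in archive.namelist():
            if name.startswith(PHOENIX_SPARK_PREFIX) and name.endswith(".class"):
                data = archive.read(name)
                report[name] = _DATAFRAME_REF.search(data) is not None
    return report
```

Running `scan_jar` on local copies of `phoenix-spark2.jar` and `phoenix-client.jar` and comparing the two reports would show whether both jars ship the same `org/apache/phoenix/spark/` classes, and which copy refers to the missing class. This is a byte-level heuristic, not a full class-file parser.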

Error 1:

2018-09-20 08:47:05,887 [Thread-9] INFO  org.apache.hadoop.metrics2.impl.MetricsSystemImpl  - Scheduled snapshot period at 10 second(s).
2018-09-20 08:47:05,887 [Thread-9] INFO  org.apache.hadoop.metrics2.impl.MetricsSystemImpl  - phoenix metrics system started
Traceback (most recent call last):
  File "pi-cus.py", line 48, in <module>
    countDF = spark.read.format("org.apache.phoenix.spark").option("table", "TBL_WEB_COUNT").option("zkUrl", "master1.com:2181").load()
  File "/data/data2/hadoop/yarn/local/usercache/hdfs/appcache/application_1536916788874_0832/container_e13_1536916788874_0832_01_000001/pyspark.zip/pyspark/sql/readwriter.py", line 172, in load
  File "/data/data2/hadoop/yarn/local/usercache/hdfs/appcache/application_1536916788874_0832/container_e13_1536916788874_0832_01_000001/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1160, in __call__
  File "/data/data2/hadoop/yarn/local/usercache/hdfs/appcache/application_1536916788874_0832/container_e13_1536916788874_0832_01_000001/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/data/data2/hadoop/yarn/local/usercache/hdfs/appcache/application_1536916788874_0832/container_e13_1536916788874_0832_01_000001/py4j-0.10.6-src.zip/py4j/protocol.py", line 320, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o62.load.
: java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
 at java.lang.Class.getDeclaredMethods0(Native Method)
 at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
 at java.lang.Class.getDeclaredMethod(Class.java:2128)
 at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
 at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
 at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
 at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
 at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
 at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
 at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
 at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
 at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
 at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
 at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
 at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
 at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
 at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:342)
 at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:335)
 at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
 at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
 at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:371)
 at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
 at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
 at org.apache.spark.rdd.RDD.map(RDD.scala:370)
 at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:131)
 at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60)
 at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:431)
 at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
 at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
 at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
 at py4j.Gateway.invoke(Gateway.java:282)
 at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
 at py4j.commands.CallCommand.execute(CallCommand.java:79)
 at py4j.GatewayConnection.run(GatewayConnection.java:214)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 ... 45 more

2018-09-20 08:47:07,337 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster  - User application exited with status 1
2018-09-20 08:47:07,340 [Driver] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Final app status: FAILED, exitCode: 1, (reason: User application exited with status 1)
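
To make a trace like the one above easier to read, here is a minimal sketch that pulls out the missing class and the first stack frame that does not belong to the JDK, Scala, py4j, or Spark itself. The function name `summarize_trace` and the list of "platform" package prefixes are assumptions chosen for this trace, not an established convention:

```python
import re

_MISSING_RE = re.compile(
    r"java\.lang\.(?:NoClassDefFoundError|ClassNotFoundException): (\S+)"
)
_FRAME_RE = re.compile(r"^\s*at ([\w.$<>]+)\(")
# Frames from these packages are skipped when looking for the caller.
_PLATFORM_PREFIXES = ("java.", "sun.", "scala.", "py4j.", "org.apache.spark.")


def summarize_trace(trace):
    """Return (missing_class, first_non_platform_frame) from a JVM stack trace."""
    missing = None
    caller = None
    for line in trace.splitlines():
        if missing is None:
            found = _MISSING_RE.search(line)
            if found:
                # NoClassDefFoundError reports the name with slashes.
                missing = found.group(1).replace("/", ".")
                continue
        if caller is None:
            frame = _FRAME_RE.match(line)
            if frame and not frame.group(1).startswith(_PLATFORM_PREFIXES):
                caller = frame.group(1)
    return missing, caller
```

Applied to the driver trace above, this should point at `org.apache.phoenix.spark.PhoenixRDD.toDataFrame` as the frame closest to the failure that comes from a user-supplied jar, which suggests looking at the Phoenix connector jar rather than at the Spark sharelib.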

Error 2:

2018-09-20 09:12:41,195 [dispatcher-event-loop-3] INFO  org.apache.spark.scheduler.TaskSetManager  - Starting task 0.0 in stage 1.0 (TID 10, slave06.com, executor 2, partition 0, NODE_LOCAL, 8106 bytes)
2018-09-20 09:12:42,301 [dispatcher-event-loop-12] INFO  org.apache.spark.storage.BlockManagerInfo  - Added broadcast_4_piece0 in memory on slave06.com:35682 (size: 7.0 KB, free: 366.3 MB)
2018-09-20 09:12:42,388 [task-result-getter-2] WARN  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.0 in stage 1.0 (TID 10, slave06.com, executor 2): java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
 at java.lang.Class.getDeclaredMethods0(Native Method)
 at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
 at java.lang.Class.getDeclaredMethod(Class.java:2128)
 at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
 at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
 at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
 at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
 at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
 at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:598)
 at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1829)
 at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1986)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at scala.collection.immutable.List$SerializationProxy.readObject(List.scala:479)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2122)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
 at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
 at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
 at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
 at org.apache.spark.scheduler.Task.run(Task.scala:109)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
 at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 ... 187 more

2018-09-20 09:12:42,391 [dispatcher-event-loop-2] INFO  org.apache.spark.scheduler.TaskSetManager  - Starting task 0.1 in stage 1.0 (TID 11, slave06.com, executor 2, partition 0, NODE_LOCAL, 8106 bytes)
2018-09-20 09:12:42,414 [task-result-getter-3] INFO  org.apache.spark.scheduler.TaskSetManager  - Lost task 0.1 in stage 1.0 (TID 11) on slave06.com, executor 2: java.lang.NoClassDefFoundError (org/apache/spark/sql/DataFrame) [duplicate 1]

Note:

1. The job sometimes succeeds. It seems to fail with the class-not-found exception about 50% of the time and succeed the other 50%.


2. The job works fine in both cluster mode and client mode when submitted with spark-submit (spark-submit --master yarn xxxxxx) from a cluster machine.

Please help! Thanks very much!

92492-picture1.png

92494-picture3.png

92493-picture2.png

1 REPLY

New Contributor

Thanks for your reply, @Jonathan Sneep.

I have already installed the Spark2 client on all hosts. I tried running the job with only 1 executor to figure out whether the ClassNotFoundException is related to the host running the driver or to the host running the executor. But my past test runs show that the same job running on slave03 can either succeed or fail (class not found).
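To make the per-host comparison less manual, the "Lost task" lines in the driver log can be tallied by host and executor. This is a minimal Python sketch, assuming the TaskSetManager log format shown in the log above; the function name `tally_lost_tasks` and the regular expression are illustrative, not part of Spark or Oozie.

```python
import re
from collections import Counter

# Matches TaskSetManager "Lost task" lines in the format seen in the log above.
# The pattern is an assumption based on that one sample line.
LOST_TASK = re.compile(
    r"Lost task (?P<task>\S+) in stage (?P<stage>\S+) \(TID (?P<tid>\d+)\) "
    r"on (?P<host>[^,\s]+), executor (?P<executor>[^:\s]+): (?P<error>[\w.$]+)"
)


def tally_lost_tasks(log_lines):
    """Count lost tasks per (host, executor, error class)."""
    counts = Counter()
    for line in log_lines:
        match = LOST_TASK.search(line)
        if match:
            key = (match["host"], match["executor"], match["error"])
            counts[key] += 1
    return counts
```

Feeding it the lines of a saved driver log (for example `tally_lost_tasks(open("driver.log"))`, where `driver.log` is a hypothetical file name) gives one count per host, executor and error class. If the failures spread evenly across hosts, a host-specific installation problem becomes less likely.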

Also, one interesting thing: the job often succeeds the first time it runs with the correct configuration, but fails when it is retried. So I wonder whether Oozie has some kind of cache?

With the following command everything works well (I tried more than 10 times):

spark-submit --master yarn --deploy-mode cluster --jars hdfs:/job/mh/phoenix-spark2.jar,hdfs:/job/mh/phoenix-client.jar hdfs:/util/phoenix.py

So I really suspect this is an Oozie problem.
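One thing that may be worth ruling out (an assumption, not something confirmed in this thread): Spark 2.x has no class named org.apache.spark.sql.DataFrame, because DataFrame there is a type alias for Dataset[Row]. The error therefore suggests that some code compiled against Spark 1.x is being loaded on the executor. If both jars passed above ship the same org.apache.phoenix.spark classes, the copy that comes first on the classpath wins, which could make the result depend on ordering. Below is a minimal Python sketch for checking this on local copies of the jars; the function names and the default package prefix are illustrative.

```python
import zipfile
from collections import defaultdict


def classes_in_jar(jar_path, package_prefix):
    """Return the set of .class entries in a jar under a package path."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            name
            for name in jar.namelist()
            if name.startswith(package_prefix) and name.endswith(".class")
        }


def overlapping_classes(jar_paths, package_prefix="org/apache/phoenix/spark/"):
    """Map each class entry found in more than one jar to the jars holding it."""
    owners = defaultdict(list)
    for path in jar_paths:
        for name in classes_in_jar(path, package_prefix):
            owners[name].append(path)
    return {name: jars for name, jars in owners.items() if len(jars) > 1}
```

After copying the two jars to a local directory (for example with `hdfs dfs -get`), calling `overlapping_classes(["phoenix-spark2.jar", "phoenix-client.jar"])` lists any classes that both jars provide. An empty result would point away from this explanation.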