Created 07-16-2018 08:44 AM
Hi there,
On my HDP 2.6.2 cluster I am using Spark 2.1.1 and Phoenix 4.7. When I start spark-shell as below,

spark-shell \
  --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar"
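it can successfully save data to table2 with this code:

// Imports needed for saveToPhoenix and HBaseConfiguration
import org.apache.phoenix.spark._
import org.apache.hadoop.hbase.HBaseConfiguration

// Read TABLE1 through the Phoenix Spark connector
val phoenixOptionMap = Map("table" -> "TABLE1", "zkUrl" -> "zk1:2181/hbase-secure")
val df2 = spark.sqlContext.read.format("org.apache.phoenix.spark").options(phoenixOptionMap).load()

// Write the DataFrame to table2 on the secure cluster
val configuration = HBaseConfiguration.create()
configuration.set("zookeeper.znode.parent", "/hbase-secure")
df2.saveToPhoenix("table2", configuration, Option("zk1:2181/hbase-secure"))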
Then I created a new Scala program:

package com.test

import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.phoenix.spark._
import org.apache.hadoop.hbase.HBaseConfiguration

object SmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .appName("PhoenixSmokeTest")
      .getOrCreate()

    // Read TABLE1 through the Phoenix Spark connector
    val phoenixOptionMap = Map("table" -> "TABLE1", "zkUrl" -> "zk1:2181/hbase-secure")
    val df2 = spark.sqlContext.read.format("org.apache.phoenix.spark").options(phoenixOptionMap).load()

    val configuration = HBaseConfiguration.create()
    configuration.set("zookeeper.znode.parent", "/hbase-secure")
    // Note: addResource(String) resolves the name on the classpath;
    // a filesystem path needs addResource(new org.apache.hadoop.fs.Path(...))
    configuration.addResource("/etc/hbase/conf/hbase-site.xml")
    df2.saveToPhoenix("table2", configuration, Option("zk1:2181/hbase-secure"))
  }
}
and ran it with the spark-submit script below:

spark-submit \
  --class com.test.SmokeTest \
  --master yarn \
  --deploy-mode client \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 4 \
  --num-executors 2 \
  --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --jars /usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar,/usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/hbase-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar,/usr/hdp/current/phoenix-client/lib/hbase-common.jar,/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar,/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar \
  --verbose \
  /tmp/test-1.0-SNAPSHOT.jar
It failed with the message below:

18/07/16 16:30:16 INFO ClientCnxn: Session establishment complete on server zk1/10.2.29.102:2181, sessionid = 0x364270588b5472f, negotiated timeout = 60000
18/07/16 16:30:17 INFO Metrics: Initializing metrics system: phoenix
18/07/16 16:30:17 INFO MetricsConfig: loaded properties from hadoop-metrics2.properties
18/07/16 16:30:17 INFO MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
18/07/16 16:30:17 INFO MetricsSystemImpl: phoenix metrics system started
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix$default$4()Lscala/Option;
    at com.trendyglobal.bigdata.inventory.SmokeTest$.main(SmokeTest.scala:28)
    at com.trendyglobal.bigdata.inventory.SmokeTest.main(SmokeTest.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/07/16 16:30:20 INFO SparkContext: Invoking stop() from shutdown hook
18/07/16 16:30:20 INFO ServerConnector: Stopped Spark@38f66b77{HTTP/1.1}{0.0.0.0:4040}

Would anyone have any advice?
Thanks,
Forest
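A note on reading the error above: Scala compiles each default argument into a synthetic method named `method$default$N`, so a missing `saveToPhoenix$default$4` suggests the program was compiled against a `DataFrameFunctions` whose fourth parameter has a default, while the class loaded at runtime does not have one. One way to narrow this down is to list every jar on the classpath that ships that class. The sketch below is only an illustration: the helper name `jars_with_class` is hypothetical, and it assumes `unzip` and `grep` are installed.

```shell
# jars_with_class CLASSFILE JAR...
# Prints every jar whose file listing contains CLASSFILE.
# Jars that are missing or unreadable are skipped silently.
jars_with_class() {
  class_file=$1
  shift
  for jar in "$@"; do
    if unzip -l "$jar" 2>/dev/null | grep -q -F -- "$class_file"; then
      printf '%s\n' "$jar"
    fi
  done
}

# Example call (paths taken from the commands in this thread):
# jars_with_class org/apache/phoenix/spark/DataFrameFunctions.class \
#   /usr/hdp/current/phoenix-client/*.jar /usr/hdp/current/phoenix-client/lib/*.jar
```

If more than one jar is printed, the one that comes first on the classpath is normally the one that gets loaded, so the order of the entries may matter. Running `javap -cp <jar> org.apache.phoenix.spark.DataFrameFunctions` on each hit should then show which `saveToPhoenix` signatures that jar actually provides.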
Created 07-16-2018 12:16 PM
I see your executor classpath uses full paths to the Phoenix client jars. Going from local mode to yarn-client mode, the most relevant change is that the executors run on the cluster worker nodes. Please try running your code like this:

spark-submit \
  --class com.test.SmokeTest \
  --master yarn \
  --deploy-mode client \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 4 \
  --num-executors 2 \
  --conf "spark.executor.extraClassPath=phoenix-4.7.0.2.6.2.0-205-spark2.jar:phoenix-client.jar:hbase-client.jar:phoenix-spark2-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --jars /usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar,/usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/hbase-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar,/usr/hdp/current/phoenix-client/lib/hbase-common.jar,/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar,/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar \
  --verbose \
  /tmp/test-1.0-SNAPSHOT.jar
And let me know if that works.
HTH
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
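The key change in the command above is that the executor classpath uses bare jar names instead of full paths. Files passed with --jars are copied into each YARN container's working directory, so relative names resolve there even if the worker node has no /usr/hdp/current/phoenix-client directory. As a minimal sketch (the helper name `to_basenames` is hypothetical, not part of Spark), the relative list can be derived from the full-path list rather than typed by hand:

```shell
# to_basenames PATHLIST
# Turns a colon-separated list of jar paths into a colon-separated
# list of bare file names, keeping the original order.
to_basenames() {
  result=""
  old_ifs=$IFS
  IFS=:
  for p in $1; do
    # ${p##*/} strips everything up to and including the last slash
    result="${result:+$result:}${p##*/}"
  done
  IFS=$old_ifs
  printf '%s\n' "$result"
}

# Example with two of the jars from this thread:
FULL_CP=/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar
EXECUTOR_CP=$(to_basenames "$FULL_CP")
```

The resulting value would then be passed as `--conf "spark.executor.extraClassPath=$EXECUTOR_CP"`, while --jars keeps the full paths so that spark-submit can find and upload the files.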
Created 07-16-2018 11:11 AM
It seems related to https://issues.apache.org/jira/browse/PHOENIX-3333; however, according to https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/patch_phoenix.html it is already fixed in HDP 2.6.2.
Created 07-17-2018 01:09 PM
@forest lin The above suggestion was for --deploy-mode client, and I see you used --deploy-mode cluster instead. If you want to run in cluster mode, you need to make these changes:

cp /etc/hbase/conf/hbase-site.xml /etc/spark/conf
cp /etc/hbase/conf/hbase-site.xml /etc/spark2/conf

export SPARK_CLASSPATH="/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar"

spark-submit \
  --class com.test.SmokeTest \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 1g \
  --executor-memory 1g \
  --executor-cores 4 \
  --num-executors 2 \
  --conf "spark.executor.extraClassPath=phoenix-4.7.0.2.6.2.0-205-spark2.jar:phoenix-client.jar:hbase-client.jar:phoenix-spark2-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=phoenix-4.7.0.2.6.2.0-205-spark2.jar:phoenix-client.jar:hbase-client.jar:phoenix-spark2-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --jars /usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar,/usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/hbase-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar,/usr/hdp/current/phoenix-client/lib/hbase-common.jar,/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar,/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar \
  --files /etc/hbase/conf/hbase-site.xml \
  --verbose \
  /tmp/test-1.0-SNAPSHOT.jar
HTH
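Because the same seven jars appear several times in these commands, once colon-separated and once comma-separated, a single typo in one copy is easy to miss. A minimal sketch, assuming a POSIX shell (the helper `join_by` and the variable names are made up for illustration), defines the list once and derives both formats:

```shell
# join_by SEP ITEM...
# Prints the items joined by SEP, with no leading or trailing separator.
join_by() {
  sep=$1
  shift
  out=""
  for item in "$@"; do
    out="${out:+$out$sep}$item"
  done
  printf '%s\n' "$out"
}

# Jar locations copied from the commands in this thread
PHOENIX_HOME=/usr/hdp/current/phoenix-client
set -- \
  "$PHOENIX_HOME/phoenix-4.7.0.2.6.2.0-205-spark2.jar" \
  "$PHOENIX_HOME/phoenix-client.jar" \
  "$PHOENIX_HOME/lib/hbase-client.jar" \
  "$PHOENIX_HOME/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar" \
  "$PHOENIX_HOME/lib/hbase-common.jar" \
  "$PHOENIX_HOME/lib/hbase-protocol.jar" \
  "$PHOENIX_HOME/lib/phoenix-core-4.7.0.2.6.2.0-205.jar"

CLASSPATH_LIST=$(join_by : "$@")   # colon-separated, for the classpath settings
JARS_LIST=$(join_by , "$@")        # comma-separated, for --jars
```

The two variables can then be substituted into the command, for example `--jars "$JARS_LIST"`, so every occurrence is guaranteed to name the same files in the same order.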
Created 07-17-2018 05:39 AM
Hi @Felix Albani, thanks for your advice. I changed the submit command to:

spark-submit \
  --class com.test.SmokeTest \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 1g \
  --executor-memory 2g \
  --executor-cores 4 \
  --num-executors 2 \
  --files /etc/hbase/conf/hbase-site.xml \
  --conf "spark.executor.extraClassPath=phoenix-4.7.0.2.6.2.0-205-spark2.jar:phoenix-client.jar:hbase-client.jar:phoenix-spark2-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --jars /usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.2.0-205-spark2.jar,/usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/hbase-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark2-4.7.0.2.6.2.0-205.jar,/usr/hdp/current/phoenix-client/lib/hbase-common.jar,/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar,/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar \
  --verbose \
  /tmp/test-1.0-SNAPSHOT.jar
but no luck:
18/07/17 13:16:21 INFO CodeGenerator: Code generated in 33.11763 ms
18/07/17 13:16:22 ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix$default$4()Lscala/Option;
java.lang.NoSuchMethodError: org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix$default$4()Lscala/Option;
    at com.trendyglobal.bigdata.inventory.CreateTestData$anonfun$main$1.apply$mcVI$sp(CreateTestData.scala:87)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    at com.trendyglobal.bigdata.inventory.CreateTestData$.main(CreateTestData.scala:80)
    at com.trendyglobal.bigdata.inventory.CreateTestData.main(CreateTestData.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$anon$3.run(ApplicationMaster.scala:654)
18/07/17 13:16:22 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.NoSuchMethodError: org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix$default$4()Lscala/Option;)
18/07/17 13:16:22 INFO SparkContext: Invoking stop() from shutdown hook
18/07/17 13:16:22 INFO ServerConnector: Stopped Spark@81d2265{HTTP/1.1}{0.0.0.0:0}
18/07/17 13:16:22 INFO SparkUI: Stopped Spark web UI at http://10.2.29.104:37764
18/07/17 13:16:22 INFO YarnAllocator: Driver requested a total number of 0 executor(s).
18/07/17 13:16:22 INFO YarnClusterSchedulerBackend: Shutting down all executors
18/07/17 13:16:22 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
18/07/17 13:16:22 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false)
I can see that phoenix-spark2-4.7.0.2.6.2.0-205.jar was on the classpath:
===============================================================================
YARN executor launch context:
  env:
    CLASSPATH -> phoenix-4.7.0.2.6.2.0-205-spark2.jar:phoenix-client.jar:hbase-client.jar:phoenix-spark2-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/etc/hadoop/conf<CPS>/usr/hdp/current/hadoop-client/*<CPS>/usr/hdp/current/hadoop-client/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.6.2.0-205/hadoop/lib/hadoop-lzo-0.6.0.2.6.2.0-205.jar:/etc/hadoop/conf/secure
    SPARK_YARN_STAGING_DIR -> hdfs://nn1-dev1-tbdp.trendy-global.com:8020/user/nifi/.sparkStaging/application_1529853578712_0039
    SPARK_USER -> nifi
    SPARK_YARN_MODE -> true

  command:
    LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH" \
      {{JAVA_HOME}}/bin/java \
      -server \
      -Xmx2048m \
      -Djava.io.tmpdir={{PWD}}/tmp \
      '-Dspark.history.ui.port=18081' \
      -Dspark.yarn.app.container.log.dir=<LOG_DIR> \
      -XX:OnOutOfMemoryError='kill %p' \
      org.apache.spark.executor.CoarseGrainedExecutorBackend \
      --driver-url \
      spark://CoarseGrainedScheduler@10.2.29.104:40401 \
      --executor-id \
      <executorId> \
      --hostname \
      <hostname> \
      --cores \
      4 \
      --app-id \
      application_1529853578712_0039 \
      --user-class-path \
      file:$PWD/__app__.jar \
      --user-class-path \
      file:$PWD/phoenix-4.7.0.2.6.2.0-205-spark2.jar \
      --user-class-path \
      file:$PWD/phoenix-client.jar \
      --user-class-path \
      file:$PWD/hbase-client.jar \
      --user-class-path \
      file:$PWD/phoenix-spark2-4.7.0.2.6.2.0-205.jar \
      --user-class-path \
      file:$PWD/hbase-common.jar \
      --user-class-path \
      file:$PWD/hbase-protocol.jar \
      --user-class-path \
      file:$PWD/phoenix-core-4.7.0.2.6.2.0-205.jar \
      1><LOG_DIR>/stdout \
      2><LOG_DIR>/stderr

  resources:
    hbase-common.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/hbase-common.jar" } size: 575685 timestamp: 1531804498373 type: FILE visibility: PRIVATE
    phoenix-4.7.0.2.6.2.0-205-spark2.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/phoenix-4.7.0.2.6.2.0-205-spark2.jar" } size: 87275 timestamp: 1531804497220 type: FILE visibility: PRIVATE
    __app__.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/inventory-calc-service-1.0-SNAPSHOT.jar" } size: 41478 timestamp: 1531804497134 type: FILE visibility: PRIVATE
    __spark_conf__ -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/__spark_conf__.zip" } size: 106688 timestamp: 1531804498824 type: ARCHIVE visibility: PRIVATE
    hbase-client.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/hbase-client.jar" } size: 1398707 timestamp: 1531804498300 type: FILE visibility: PRIVATE
    phoenix-spark2-4.7.0.2.6.2.0-205.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/phoenix-spark2-4.7.0.2.6.2.0-205.jar" } size: 81143 timestamp: 1531804498334 type: FILE visibility: PRIVATE
    hbase-site.xml -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/hbase-site.xml" } size: 7320 timestamp: 1531804498662 type: FILE visibility: PRIVATE
    hbase-protocol.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/hbase-protocol.jar" } size: 4941870 timestamp: 1531804498450 type: FILE visibility: PRIVATE
    __spark_libs__ -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/hdp/apps/2.6.2.0-205/spark2/spark2-hdp-yarn-archive.tar.gz" } size: 180384518 timestamp: 1507704288496 type: ARCHIVE visibility: PUBLIC
    phoenix-client.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/phoenix-client.jar" } size: 107566119 timestamp: 1531804498256 type: FILE visibility: PRIVATE
    phoenix-core-4.7.0.2.6.2.0-205.jar -> resource { scheme: "hdfs" host: "nn1-dev1-tbdp.trendy-global.com" port: 8020 file: "/user/nifi/.sparkStaging/application_1529853578712_0039/phoenix-core-4.7.0.2.6.2.0-205.jar" } size: 3834414 timestamp: 1531804498628 type: FILE visibility: PRIVATE
===============================================================================
Created 07-18-2018 12:02 PM
spark-submit \
  --class com.test.SmokeTest \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 1g \
  --executor-memory 2g \
  --executor-cores 2 \
  --num-executors 3 \
  --files /etc/hbase/conf/hbase-site.xml \
  --conf "spark.executor.extraClassPath=phoenix-client.jar:hbase-client.jar:phoenix-spark-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/lib/hbase-client.jar:/usr/hdp/current/phoenix-client/lib/phoenix-spark-4.7.0.2.6.2.0-205.jar:/usr/hdp/current/phoenix-client/lib/hbase-common.jar:/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --jars /usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/hbase-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark-4.7.0.2.6.2.0-205.jar,/usr/hdp/current/phoenix-client/lib/hbase-common.jar,/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar,/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar \
  --verbose \
  /tmp/test-1.0-SNAPSHOT.jar

Following your advice, I set the classpath and copied hbase-site.xml, but I still get an error:

18/07/18 19:47:59 INFO Client:
    client token: Token { kind: YARN_CLIENT_TOKEN, service: }
    diagnostics: User class threw exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
    ApplicationMaster host: 10.2.29.104
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1531914415906
    final status: FAILED
    tracking URL: http://en1-dev1-tbdp.trendy-global.com:8088/proxy/application_1531814517578_0019/
    user: nifi
Exception in thread "main" org.apache.spark.SparkException: Application application_1531814517578_0019 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1261)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:751)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/07/18 19:47:59 INFO ShutdownHookManager: Shutdown hook called
Created 07-18-2018 01:19 PM
@forest lin Your spark.driver.extraClassPath is not the same as the one I shared for cluster mode. Could you confirm whether the code runs in client mode, and then try the exact settings I provided for cluster mode? Please let me know how it goes!
Created 07-20-2018 04:23 AM
Hi Felix,
I followed your guidance and changed the command as follows:

spark-submit \
  --class com.test.SmokeTest \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 1g \
  --executor-memory 2g \
  --executor-cores 2 \
  --num-executors 3 \
  --files /etc/hbase/conf/hbase-site.xml \
  --conf "spark.executor.extraClassPath=phoenix-client.jar:hbase-client.jar:phoenix-spark-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --conf "spark.driver.extraClassPath=phoenix-client.jar:hbase-client.jar:phoenix-spark-4.7.0.2.6.2.0-205.jar:hbase-common.jar:hbase-protocol.jar:phoenix-core-4.7.0.2.6.2.0-205.jar" \
  --jars /usr/hdp/current/phoenix-client/phoenix-client.jar,/usr/hdp/current/phoenix-client/lib/hbase-client.jar,/usr/hdp/current/phoenix-client/lib/phoenix-spark-4.7.0.2.6.2.0-205.jar,/usr/hdp/current/phoenix-client/lib/hbase-common.jar,/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar,/usr/hdp/current/phoenix-client/lib/phoenix-core-4.7.0.2.6.2.0-205.jar \
  --verbose \
  /tmp/test-1.0-SNAPSHOT.jar

which encounters the same error:

User class threw exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame

But it runs successfully with Spark 1.6.3.
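One possible reading of this error: in Spark 2.x, `DataFrame` is only a type alias for `Dataset[Row]`, so there is no `org.apache.spark.sql.DataFrame` class for code compiled against Spark 1.x to load. The last two commands list phoenix-spark-4.7.0.2.6.2.0-205.jar rather than the phoenix-spark2 jar used earlier, which would fit both the error and the fact that the job works on Spark 1.6.3. The sketch below picks the connector jar from the Spark version; the function name is hypothetical, and the mapping is inferred only from the jar names in this thread, not from HDP documentation.

```shell
# connector_jar_for SPARK_VERSION
# Prints the Phoenix connector jar name that matches the Spark major version.
# The mapping is an assumption based on the jar names seen in this thread.
connector_jar_for() {
  case "$1" in
    1.*) printf '%s\n' "phoenix-spark-4.7.0.2.6.2.0-205.jar" ;;
    2.*) printf '%s\n' "phoenix-spark2-4.7.0.2.6.2.0-205.jar" ;;
    *)   echo "unsupported Spark version: $1" >&2
         return 1 ;;
  esac
}

# Example: choose the jar for the Spark version used in this thread
CONNECTOR_JAR=$(connector_jar_for 2.1.1)
```

If this reading is right, switching the three jar lists back to the spark2 jars would be the first thing to try before looking further.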
Created 07-20-2018 06:42 PM
@forest lin Then that is possibly a different issue. Initially you were getting
java.lang.NoSuchMethodError: org.apache.phoenix.spark.DataFrameFunctions.saveToPhoenix$default$4()Lscala/Option;
and now, only for Spark 2, you are getting
User class threw exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
Please review the following link
Also, I think it is best to take this error to a separate thread, as it is not the same as the initial problem, which was solved by adding the configuration I mentioned before.
HTH