
How to upgrade to Phoenix 4.10?


Contributor

I want to use the Phoenix Apache Spark Plugin (https://phoenix.apache.org/phoenix_spark.html) with the Hortonworks Sandbox (Version: HDP_2.6_vmware_19_04_2017_20_25_43_hdp_ambari_2_5_0_5_1), and I found a very informative link, https://community.hortonworks.com/questions/1942/spark-to-phoenix.html, to experiment with it. I have no problems with JDBC; however, "Load as DataFrame" fails with "java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame". After some Googling, I found that I should upgrade to Phoenix 4.10 to work with Spark 2.x (see http://apache-phoenix-user-list.1124778.n5.nabble.com/Phoenix-4-9-0-with-Spark-2-0-td3602.html). So my question is: how do I upgrade the sandbox to Phoenix 4.10?
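From what I can tell, the root cause is that org.apache.spark.sql.DataFrame no longer exists as a class in Spark 2.x; it is only a type alias for Dataset[Row], so a phoenix-spark jar compiled against Spark 1.x cannot load that class. A minimal check from a Spark 2 shell (my own sketch, not anything from the Phoenix docs):

import org.apache.spark.sql.{DataFrame, Dataset, Row}

// Compiles only because DataFrame is a mere type alias for Dataset[Row] in Spark 2.x;
// there is no org/apache/spark/sql/DataFrame.class on the classpath anymore,
// which is exactly the class the stack trace below says is missing.
implicitly[DataFrame =:= Dataset[Row]]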

Below are my steps:

CREATE TABLE and UPSERT INTO in Phoenix
cd /usr/hdp/current/phoenix-client/bin 
./sqlline.py sandbox.hortonworks.com 
CREATE TABLE TABLE1 (ID BIGINT NOT NULL PRIMARY KEY, COL1 VARCHAR); 
UPSERT INTO TABLE1 (ID, COL1) VALUES (1, 'test_row_1'); 
UPSERT INTO TABLE1 (ID, COL1) VALUES (2, 'test_row_2');
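As an aside, plain JDBC against the same table works for me, which is why I think the problem is specific to the Spark integration. A minimal sketch from the same shell (same ZooKeeper quorum and znode as the zkUrl used below):

import java.sql.DriverManager

// Phoenix thick-driver JDBC URL: host:port:znode, matching the zkUrl below.
val conn = DriverManager.getConnection(
  "jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure")
val rs = conn.createStatement().executeQuery("SELECT ID, COL1 FROM TABLE1")
while (rs.next()) println(s"${rs.getLong("ID")} -> ${rs.getString("COL1")}")
conn.close()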
Start spark-shell
spark-shell \
  --master yarn-client \
  --jars /usr/hdp/2.6.0.3-8/phoenix/phoenix-4.7.0.2.6.0.3-8-client.jar,/usr/hdp/2.6.0.3-8/phoenix/lib/phoenix-spark-4.7.0.2.6.0.3-8.jar \
  --conf "spark.executor.extraClassPath=/usr/hdp/2.6.0.3-8/phoenix/phoenix-4.7.0.2.6.0.3-8-client.jar:/usr/hdp/2.6.0.3-8/phoenix/phoenix-4.7.0.2.6.0.3-8-client.jar"
Inside spark-shell
import org.apache.phoenix.spark._ 

val df = spark.sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TABLE1", "zkUrl" -> "sandbox.hortonworks.com:2181:/hbase-unsecure") )
Error Message
warning: there was one deprecation warning; re-run with -deprecation for details
java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
  at java.lang.Class.getDeclaredMethods0(Native Method)
  at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
  at java.lang.Class.getDeclaredMethod(Class.java:2128)
  at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1475)
  at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:72)
  at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:498)
  at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
  at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
  at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
  at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
  at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
  at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
  at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
  at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
  at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
  at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
  at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
  at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
  at org.apache.spark.SparkContext.clean(SparkContext.scala:2101)
  at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370)
  at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:369)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
  at org.apache.spark.rdd.RDD.map(RDD.scala:369)
  at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:119)
  at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:59)
  at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:40)
  at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:389)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
  at org.apache.spark.sql.SQLContext.load(SQLContext.scala:965)
  ... 53 elided
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
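For reference, the non-deprecated Spark 2 reader API for the same load looks like this (just a sketch; I would expect the same NoClassDefFoundError, since the missing class is referenced from inside the phoenix-spark jar, not from my calling code):

// Spark 2 style reader; equivalent to the deprecated sqlContext.load above.
val df = spark.read
  .format("org.apache.phoenix.spark")
  .options(Map(
    "table" -> "TABLE1",
    "zkUrl" -> "sandbox.hortonworks.com:2181:/hbase-unsecure"))
  .load()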
10 Replies

Re: How to upgrade to Phoenix 4.10?

Contributor

Same question here: how do we access Phoenix tables from Spark 2 SQL on a full HDP 2.6 cluster?


Re: How to upgrade to Phoenix 4.10?

Explorer

It would be really helpful to be able to use Spark 2 with Phoenix. Any ETA on when HDP will upgrade Phoenix?

Re: How to upgrade to Phoenix 4.10?

Contributor

If you are just testing things out on the sandbox, this should really help: https://superuser.blog/upgrading-apache-phoenix-hdp/

We did it on prod.
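After the upgrade, a quick way to confirm which Phoenix version the client actually speaks is to ask the JDBC driver for its metadata (a sketch, reusing the sandbox URL from the question; adjust the quorum for your cluster):

import java.sql.DriverManager

// Post-upgrade sanity check: the Phoenix driver reports its own version.
val conn = DriverManager.getConnection(
  "jdbc:phoenix:sandbox.hortonworks.com:2181:/hbase-unsecure")
val md = conn.getMetaData
println(s"${md.getDatabaseProductName} ${md.getDatabaseProductVersion}, driver ${md.getDriverVersion}")
conn.close()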


Re: How to upgrade to Phoenix 4.10?

Expert Contributor

Hi, while trying to access the URL https://superuser.blog/upgrading-apache-phoenix-hdp/ it's not opening at all.


Re: How to upgrade to Phoenix 4.10?

Contributor

I just checked and it is opening; can you try again? @Krishna Srinivas


Re: How to upgrade to Phoenix 4.10?

Explorer

HDP 2.6.2 already supports the latest Phoenix functionality. It says 4.7, but it's a custom patched version. I just wasn't using the HDP Phoenix driver.

That being said, it's very confusing for HDP not to stay in sync with the naming conventions and versioning of the open-source version. I assume they forked it, but time will only make it harder to sync back up.


Re: How to upgrade to Phoenix 4.10?

Contributor

We tried this too on our HDP 2.6.3 cluster. Sure enough, we got the same issue:

/usr/hdp/current/spark2-client/bin/spark-shell \
  --master yarn-client \
  --driver-memory 3g --executor-memory 3g \
  --num-executors 2 --executor-cores 2 \
  --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/phoenix-spark2.jar:/etc/hbase/conf" \
  --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/phoenix-spark2.jar:/etc/hbase/conf"
scala> val jobsDF = spark.read.format("org.apache.phoenix.spark").options(Map(
     |       "table" -> "ns.Jobs", "zkUrl" -> zkUrl)).load
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,file:/usr/hdp/2.6.3.0-235/phoenix/phoenix-4.7.0.2.6.3.0-235-client.jar!/ivysettings.xml will be used
2018-01-30 16:24:33,254 INFO  [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x79bb14d8 connecting to ZooKeeper ensemble=zkhost1:2181,zkhost2:2181,zkhost3:2181
java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
  at java.lang.Class.getDeclaredMethods0(Native Method)
  at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
  at java.lang.Class.getDeclaredMethod(Class.java:2128)
  at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1575)
...
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
  ... 49 elided
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 83 more

Tweaking extraClassPath and --jars using phoenix-client.jar, phoenix-4.7.0.2.6.3.0-235-spark2.jar, and spark-sql_2.11-2.2.0.2.6.3.0-235.jar made no difference. I am inclined to agree with the other poster that Hortonworks' phoenix-client.jar is not actually Spark2-compatible, the release notes notwithstanding.
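One quick way to separate "wrong jar on the classpath" from "jar compiled against the wrong Spark" is to ask the JVM where the Phoenix Spark classes actually came from (a diagnostic sketch from the same spark-shell session; PhoenixRDD is the class named in the stack trace above):

// Which jar did the Phoenix Spark integration actually load from?
println(classOf[org.apache.phoenix.spark.PhoenixRDD]
  .getProtectionDomain.getCodeSource.getLocation)

If that prints the phoenix-client.jar path rather than phoenix-spark2.jar, the Spark 1.x-compiled classes bundled in the client jar are shadowing the Spark 2 ones, since phoenix-client.jar comes first in our extraClassPath.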


Re: How to upgrade to Phoenix 4.10?

Contributor

... forgot the Stack Overflow link


Re: How to upgrade to Phoenix 4.10?

New Contributor

We are seeing the same NoClassDefFoundError with Spark 2 using /usr/hdp/current/phoenix-client/phoenix-client.jar:/usr/hdp/current/phoenix-client/phoenix-spark2.jar. The release notes say the build includes the patch for PHOENIX-3333, but it doesn't look like that is the case.
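If it helps anyone verify, you can at least list what the shipped spark2 jar contains before blaming the classpath (a sketch from spark-shell; the jar path is the one from the extraClassPath above):

import java.util.jar.JarFile
import scala.collection.JavaConverters._

// Does the shipped spark2 jar actually contain the Phoenix Spark classes?
val jar = new JarFile("/usr/hdp/current/phoenix-client/phoenix-spark2.jar")
jar.entries.asScala.map(_.getName)
  .filter(n => n.startsWith("org/apache/phoenix/spark") && n.endsWith(".class"))
  .take(10).foreach(println)
jar.close()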
