How to connect Spark 2.2.0 with Phoenix 4.7 in HDP 2.6.3?

 
1 ACCEPTED SOLUTION

@Ranjan Raut

Below are the steps to connect Spark 2.2 with Phoenix in HDP 2.6.3.

1) Create a symlink to hbase-site.xml in the spark2 conf directory:

ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml

2) Launch spark-shell with the Phoenix Spark jars on the extra classpath:

spark-shell --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar" --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"

3) Create a Phoenix connection and query the tables:

scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext

scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@495e8a3

scala> val df = sqlContext.load("org.apache.phoenix.spark",Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))
df: org.apache.spark.sql.DataFrame = [ID: string, COL1: string ... 1 more field]

scala> df.show()
+-----+----------+----+
|   ID|      COL1|COL2|
+-----+----------+----+
|test1|test_row_1|  10|
|test2|test_row_2|  20|
+-----+----------+----+
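
If you prefer the Data Source API (or if sqlContext.load is flagged as deprecated in your shell), the same read can be expressed as below. This is a minimal sketch assuming the same TABLE1 and ZooKeeper URL as above:

scala> val df2 = sqlContext.read.format("org.apache.phoenix.spark").options(Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181")).load()
scala> df2.show()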


7 REPLIES


Thanks a lot @Sandeep Nemuri. It worked (y)

@Ranjan Raut Glad that it helped you. Would you mind accepting this answer so that this thread will be marked as answered?


Note that phoenix-spark2.jar MUST precede phoenix-client.jar in extraClassPath; otherwise the connection will fail with:

java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
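
This is presumably because phoenix-client.jar bundles the Spark 1.x integration, which still references the org.apache.spark.sql.DataFrame class that no longer exists as a separate class in Spark 2. As a reminder, the working ordering (the spark2 jar listed first) is the one used in the accepted answer, e.g. for the driver side (and the same for spark.executor.extraClassPath):

--conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"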

How can I save this "df" to another Phoenix table?

I found the answer myself ...

use df.saveToPhoenix(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> hbaseConnectionString))
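
For completeness, a minimal sketch of the write-back as it would look in the same shell session, assuming the phoenix-spark implicits are imported, OUTPUT_TABLE already exists in Phoenix with matching columns, and the same ZooKeeper URL as above stands in for hbaseConnectionString:

scala> import org.apache.phoenix.spark._
scala> df.saveToPhoenix(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "localhost:2181"))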
