Created 03-17-2018 05:54 AM
Below are the steps to connect Spark 2.2 with Phoenix on HDP 2.6.3.
1) Create a symlink to hbase-site.xml in the spark2 conf directory:
ln -s /etc/hbase/conf/hbase-site.xml /etc/spark2/conf/hbase-site.xml
2) Launch spark-shell with the Phoenix Spark jars on the extra classpath:
spark-shell --conf "spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar" \
            --conf "spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.7.0.2.6.3.0-235-spark2.jar:/usr/hdp/current/phoenix-client/phoenix-client.jar"
3) Create a Phoenix connection and query the tables:
scala> import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.SQLContext

scala> val sqlContext = new SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@495e8a3

scala> val df = sqlContext.load("org.apache.phoenix.spark", Map("table" -> "TABLE1", "zkUrl" -> "localhost:2181"))
df: org.apache.spark.sql.DataFrame = [ID: string, COL1: string ... 1 more field]

scala> df.show()
+-----+----------+----+
|   ID|      COL1|COL2|
+-----+----------+----+
|test1|test_row_1|  10|
|test2|test_row_2|  20|
+-----+----------+----+
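Side note: the more usual Spark 2.x style goes through the DataFrameReader API instead of sqlContext.load. A minimal equivalent sketch, using the same example table and ZooKeeper quorum as above (spark is the SparkSession that spark-shell provides in Spark 2.x):

// Equivalent read via the DataFrameReader API
val df = spark.read
  .format("org.apache.phoenix.spark")
  .option("table", "TABLE1")           // same example table as above
  .option("zkUrl", "localhost:2181")   // ZooKeeper quorum of the HBase cluster
  .load()
df.show()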
Created 03-17-2018 07:00 AM
Thanks a lot @Sandeep Nemuri. It worked (y)
Created 03-17-2018 07:29 AM
@Ranjan Raut Glad that it helped you. Would you mind accepting this answer so that this thread is marked as answered?
Created 03-20-2018 07:48 PM
Note that phoenix-spark2.jar MUST precede phoenix-client.jar in extraClassPath; otherwise the connection will fail with:
java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
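A quick way to check which jar is actually winning, as a rough sketch from the spark-shell REPL. This assumes both jars bundle classes under org.apache.phoenix.spark (which is what makes the ordering matter in the first place):

// Shows the jar that PhoenixRDD was loaded from. If it points at
// phoenix-client.jar instead of the -spark2 jar, fix the classpath ordering.
classOf[org.apache.phoenix.spark.PhoenixRDD]
  .getProtectionDomain.getCodeSource.getLocation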
Created 03-27-2018 05:48 PM
How can I save this "df" to another Phoenix table?
Created 05-31-2018 02:26 PM
I found the answer myself:
Use df.saveToPhoenix(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> hbaseConnectionString))
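For anyone finding this later, a slightly fuller sketch of the write path. OUTPUT_TABLE and the zkUrl value are placeholders, and the target table must already exist in Phoenix with matching column names:

import org.apache.phoenix.spark._   // adds saveToPhoenix to DataFrame via implicits

df.saveToPhoenix(Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "localhost:2181"))

// Alternatively, via the DataFrameWriter API; "overwrite" here performs
// Phoenix UPSERTs, it does not truncate the table first.
df.write
  .format("org.apache.phoenix.spark")
  .mode("overwrite")
  .option("table", "OUTPUT_TABLE")
  .option("zkUrl", "localhost:2181")
  .save()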
Created 07-16-2018 11:28 AM
Any ideas on this issue: https://community.hortonworks.com/questions/202521/spark-submit-nosuchmethoderror-savetophoenix.html