Unable to save Dataframe to Phoenix

Hi,

I'm trying to write from Spark to HBase using Apache Phoenix. I have the CLABS phoenix-spark libraries, and I added the following to both spark.driver.extraClassPath and spark.executor.extraClassPath:

/opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-spark-1.2.0.jar:/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/jars/phoenix-core-4.5.2-HBase-1.0.jar

The CLABS phoenix-1.2.0-client.jar causes problems with writing to Hive.

I also added the following to my spark-submit command:

--jars /opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-spark-1.2.0.jar,/opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/phoenix-1.2.0-client.jar

However, when I try to save a DataFrame to Phoenix using

df.save("org.apache.phoenix.spark", SaveMode.Overwrite,
Map("table" -> "TEST_SAVE_NEW", "zkUrl" -> "zkeeper.internal:2181"))

I get the following error:

java.lang.NoClassDefFoundError: org/apache/phoenix/mapreduce/util/PhoenixConfigurationUtil
        at org.apache.phoenix.spark.ConfigurationUtil$.getOutputConfiguration(ConfigurationUtil.scala:33)
        at org.apache.phoenix.spark.DataFrameFunctions$$anonfun$1.apply(DataFrameFunctions.scala:41)
        at org.apache.phoenix.spark.DataFrameFunctions$$anonfun$1.apply(DataFrameFunctions.scala:38)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 14 more
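
Since the trace ends in a ClassNotFoundException, one way to narrow this down is to check whether each jar listed above actually exists on disk and contains the missing class. The following is only a minimal sketch of such a check, not part of the original job: the object name FindClassInJars and the helper jarContains are hypothetical, and it uses nothing beyond the JDK's java.util.zip API. It inspects the files on the machine where it runs, so it would need to be run on an executor host to say anything about the executor class path.

```scala
import java.io.File
import java.util.zip.ZipFile
import scala.util.Try

// Hypothetical helper: reports which jars in a class path string contain
// the class named in the stack trace.
object FindClassInJars {
  // Resource path of the class that failed to load.
  val Wanted = "org/apache/phoenix/mapreduce/util/PhoenixConfigurationUtil.class"

  /** True if `path` is a readable jar/zip file that contains `entry`. */
  def jarContains(path: String, entry: String): Boolean = {
    val f = new File(path)
    if (!f.isFile) false
    else Try {
      val zip = new ZipFile(f)
      try zip.getEntry(entry) != null
      finally zip.close()
    }.getOrElse(false) // unreadable or non-zip files count as "not found"
  }

  def main(args: Array[String]): Unit = {
    // args(0): a jar list separated by ':' (extraClassPath) or ',' (--jars).
    val entries = args.headOption.getOrElse("").split("[:,]").filter(_.nonEmpty)
    entries.foreach { p =>
      val status =
        if (!new File(p).exists) "missing file"
        else if (jarContains(p, Wanted)) "contains class"
        else "class not found"
      println(s"$status: $p")
    }
  }
}
```

Passing the extraClassPath value or the --jars list as the single argument prints one status line per entry. A "missing file" or "class not found" result for every entry on spark.executor.extraClassPath would be consistent with the error above.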

Can you please help me?

Nimrod
