Unable to save Dataframe to Phoenix

Hi,

 

I'm trying to write from Spark to HBase using Apache Phoenix. I have the CLABS phoenix-spark libraries, and I added

/opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-spark-1.2.0.jar:/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/jars/phoenix-core-4.5.2-HBase-1.0.jar

to both spark.driver.extraClassPath and spark.executor.extraClassPath.

 

The CLABS phoenix-1.2.0-client.jar causes problems when writing to Hive.

 

I also added

--jars /opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/lib/phoenix-spark-1.2.0.jar,/opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix/phoenix-1.2.0-client.jar

to my spark-submit command.
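For reference, the two settings above use different separators: the extraClassPath entries are joined with ":" and the --jars list with ",". The sketch below only assembles the strings from the paths quoted in this post. Passing the classpath settings through --conf is an assumption (the post does not say where they were set), and the value names are hypothetical.

```scala
// Jar paths copied from the post; nothing here talks to a cluster.
val phoenixHome = "/opt/cloudera/parcels/CLABS_PHOENIX-4.5.2-1.clabs_phoenix1.2.0.p0.774/lib/phoenix"
val sparkJar    = s"$phoenixHome/lib/phoenix-spark-1.2.0.jar"
val coreJar     = "/opt/cloudera/parcels/CDH-5.8.2-1.cdh5.8.2.p0.3/jars/phoenix-core-4.5.2-HBase-1.0.jar"
val clientJar   = s"$phoenixHome/phoenix-1.2.0-client.jar"

// Classpath entries are colon-separated on Linux.
val extraClassPath = Seq(sparkJar, coreJar).mkString(":")

// The --jars option takes a comma-separated list.
val jarsArg = Seq(sparkJar, clientJar).mkString(",")

// Assumption: the classpath settings are passed with --conf on the command line.
val submitArgs = Seq(
  "--conf", s"spark.driver.extraClassPath=$extraClassPath",
  "--conf", s"spark.executor.extraClassPath=$extraClassPath",
  "--jars", jarsArg
)

println(submitArgs.mkString(" "))
```

Printing the arguments this way makes it easy to see which jars end up in which setting.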

 

However when I try to save a dataframe to Phoenix using

import org.apache.spark.sql.SaveMode
df.save("org.apache.phoenix.spark", SaveMode.Overwrite,
  Map("table" -> "TEST_SAVE_NEW", "zkUrl" -> "zkeeper.internal:2181"))

I get the following error:

java.lang.NoClassDefFoundError: org/apache/phoenix/mapreduce/util/PhoenixConfigurationUtil
        at org.apache.phoenix.spark.ConfigurationUtil$.getOutputConfiguration(ConfigurationUtil.scala:33)
        at org.apache.phoenix.spark.DataFrameFunctions$$anonfun$1.apply(DataFrameFunctions.scala:41)
        at org.apache.phoenix.spark.DataFrameFunctions$$anonfun$1.apply(DataFrameFunctions.scala:38)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 14 more
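The trace passes through Executor$TaskRunner.run, so the class lookup fails inside a running task, not while the job is being set up. A minimal sketch of a check that could help narrow this down is shown here; `classIsLoadable` is a hypothetical helper written for this post, not part of Spark or Phoenix.

```scala
// Hypothetical helper: reports whether a class can be found by the
// current thread's context class loader, without initializing it.
def classIsLoadable(className: String): Boolean =
  try {
    Class.forName(className, false, Thread.currentThread().getContextClassLoader)
    true
  } catch {
    // ClassNotFoundException: the class is not on the classpath.
    // NoClassDefFoundError: the class was found but one of its dependencies was not.
    case _: ClassNotFoundException | _: NoClassDefFoundError => false
  }

// Class name taken from the stack trace above.
val missing = "org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil"
println(s"$missing loadable here: ${classIsLoadable(missing)}")
```

Assuming `sc` is the active SparkContext in a spark-shell session, running the helper inside a task, for example `sc.parallelize(Seq(1)).map(_ => classIsLoadable(missing)).collect()`, would show what the executor side sees, which may differ from the driver.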

 

Can you please help me?

 

Nimrod
