Community Articles
Find and share helpful community-sourced technical articles
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.
Labels (2)
New Contributor

In some application use cases, developers want to save Spark DataFrame table directly into Phoenix instead of saving into HBase as a intermediate step. In those case, we can use Apache Phoenix-Spark plugin package. The related api is very simple:

df.save("org.apache.phoenix.spark", SaveMode.Overwrite, Map("table" -> "OUTPUT_TABLE",
  "zkUrl" -> "****:2181:/****"))

However, we need to pay attention that in Apache Phoenix, all the column names by default are considered as uppercase unless you surround it with quotation marks "". Therefore, if you have specified lowercase column name in your Phoenix Schema, you have to do some column names transformation in Spark. The example code is as follows:

val oldNames = df.columns
val newNames = oldNames.map(name => col(name).as("\"" + name + "\""))
val df2 = df.select(newNames:_*)
2,336 Views
Don't have an account?
Coming from Hortonworks? Activate your account here
Version history
Revision #:
1 of 1
Last update:
‎09-26-2016 08:19 AM
Updated by:
 
Contributors
Top Kudoed Authors