In some application use cases, developers want to save a Spark DataFrame directly into Phoenix instead of saving into HBase as an intermediate step. In those cases, we can use the Apache Phoenix-Spark plugin package. The related API is very simple:

df.save("org.apache.phoenix.spark", SaveMode.Overwrite, Map("table" -> "OUTPUT_TABLE",
  "zkUrl" -> "****:2181:/****"))

However, we need to pay attention that in Apache Phoenix, all column names are by default treated as uppercase unless you surround them with double quotation marks (""). Therefore, if you have specified lowercase column names in your Phoenix schema, you have to transform the column names in Spark. The example code is as follows:

import org.apache.spark.sql.functions.col

val oldNames = df.columns
val newNames = oldNames.map(name => col(name).as("\"" + name + "\""))
val df2 = df.select(newNames:_*)
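Putting the two steps together, a minimal sketch of a helper that quotes every column name and then writes through the Phoenix-Spark plugin might look as follows. The helper name `saveToPhoenix` and its parameters are illustrative, not part of the plugin; the `zkUrl` value is a placeholder you must replace with your own ZooKeeper quorum, and the `df.save(...)` call is the Spark 1.x DataFrame API used above.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}
import org.apache.spark.sql.functions.col

// Hypothetical helper: quote each column name so Phoenix preserves its case
// (unquoted identifiers are upper-cased), then save via the Phoenix-Spark plugin.
def saveToPhoenix(df: DataFrame, table: String, zkUrl: String): Unit = {
  val quoted = df.columns.map(name => col(name).as("\"" + name + "\""))
  df.select(quoted: _*)
    .save("org.apache.phoenix.spark", SaveMode.Overwrite,
      Map("table" -> table, "zkUrl" -> zkUrl))
}

// Example invocation (placeholder table name and ZooKeeper URL):
// saveToPhoenix(df, "OUTPUT_TABLE", "zk-host:2181")
```

This keeps the quoting logic in one place, so every write path through the helper gets case-preserving column names without repeating the transformation.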
Last update: ‎09-26-2016 08:19 AM