In some application use cases, developers want to save a Spark DataFrame directly into Phoenix instead of writing to HBase as an intermediate step. In those cases, we can use the Apache Phoenix-Spark plugin package. The relevant API is very simple:

import org.apache.spark.sql.SaveMode

// Save the DataFrame to the Phoenix table OUTPUT_TABLE via the given ZooKeeper quorum
df.save("org.apache.phoenix.spark", SaveMode.Overwrite,
  Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "****:2181:/****"))

However, we need to pay attention that in Apache Phoenix, all column names are treated as uppercase by default unless you surround them with double quotation marks (""). Therefore, if you have specified lowercase column names in your Phoenix schema, you have to transform the column names in Spark first. Example code is as follows:

import org.apache.spark.sql.functions.col

// Wrap each column name in double quotes so Phoenix preserves its case
val oldNames = df.columns
val newNames = oldNames.map(name => col(name).as("\"" + name + "\""))
val df2 = df.select(newNames: _*)
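With the quoted aliases in place, df2 can then be written with the same save call shown earlier, and Phoenix will match the lowercase column names in the target schema; a minimal sketch, reusing the masked zkUrl placeholder:

// Save the case-preserved DataFrame to Phoenix
df2.save("org.apache.phoenix.spark", SaveMode.Overwrite,
  Map("table" -> "OUTPUT_TABLE", "zkUrl" -> "****:2181:/****"))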