Created 02-17-2016 04:00 PM
Hello,
I was recently tasked with working out how to read data from HBase into a Spark DataFrame and, once the transformation / enrichment is done, write the DataFrame back into HBase.
What is the best way of doing this? I can see that Cloudera has a sparkOnHBase package (but I think they have donated the code to HBase, and the Maven modules are versioned 0.0.x-clabs-SNAPSHOT, which doesn't sound reassuring). There is also an HBase-Spark module in Apache HBase, but it does not seem to have been released yet.
Ideally it would be something similar to these:
// using spark-csv from databricks
DataFrame csvDF = sqlContext.read()
    .format("csv")
    .options(options)
    .load(hdfs.getURI("hdfs://sandbox:8020"));

// using spark-solr from lucidworks
DataFrame solrDF = sqlContext.read()
    .format("solr")
    .options(options)
    .load();
Is there something similar to these in the HBase world?
I have also seen this thread with the experimental connector but I would really prefer something more mature.
Thanks in advance!
Created 02-17-2016 09:32 PM
Hi @David Tam, for a working example that uses phoenix-spark to read and write HBase data as DataFrames, check out https://github.com/randerzander/HiveToPhoenix
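For anyone who wants to see the shape of it before opening the repo, here is a minimal sketch of a phoenix-spark read and write in the same style as the snippets in the question. It assumes Spark 1.x (the `DataFrame` / `SQLContext` API used above) with the phoenix-spark jar on the classpath. The class name, the helper `phoenixOptions`, the table names and the ZooKeeper address are made up for illustration; only the `org.apache.phoenix.spark` format name and the `table` / `zkUrl` option keys come from the phoenix-spark documentation.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SaveMode;

public class PhoenixSparkSketch {

    // Data source name registered by the phoenix-spark plugin.
    static final String PHOENIX_FORMAT = "org.apache.phoenix.spark";

    // Hypothetical helper: builds the two options the plugin expects.
    static Map<String, String> phoenixOptions(String table, String zkUrl) {
        Map<String, String> options = new HashMap<>();
        options.put("table", table);   // Phoenix table name
        options.put("zkUrl", zkUrl);   // ZooKeeper quorum, e.g. "sandbox:2181" (assumed)
        return options;
    }

    // Load a Phoenix table (backed by HBase) as a DataFrame.
    static DataFrame read(SQLContext sqlContext, String table, String zkUrl) {
        return sqlContext.read()
            .format(PHOENIX_FORMAT)
            .options(phoenixOptions(table, zkUrl))
            .load();
    }

    // Save a DataFrame into a Phoenix table.
    // The plugin expects SaveMode.Overwrite; rows are upserted.
    static void write(DataFrame df, String table, String zkUrl) {
        df.write()
            .format(PHOENIX_FORMAT)
            .mode(SaveMode.Overwrite)
            .options(phoenixOptions(table, zkUrl))
            .save();
    }
}
```

Usage would be something like `DataFrame df = PhoenixSparkSketch.read(sqlContext, "INPUT_TABLE", "sandbox:2181");` followed by your transformations and `PhoenixSparkSketch.write(df, "OUTPUT_TABLE", "sandbox:2181");`. As far as I know the target table has to exist in Phoenix already, with columns matching the DataFrame, so treat this as a starting point, not a drop-in solution.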
Created 02-17-2016 04:12 PM
Right now the only definite answer is phoenix-spark: https://phoenix.apache.org/phoenix_spark.html
HBase-Spark has not been released yet. It is expected soon, but no timeline has been announced.
Created 02-17-2016 04:25 PM
@David Tam It's amazing to see how many JIRAs there are on the same topic: https://issues.apache.org/jira/browse/HBASE-14181
Created 02-18-2016 08:28 AM
Thanks all for the input. The phoenix-spark example looks very close to what we need. I am not sure whether my team would be happy to adopt Phoenix, but I will bring it up and see. Meanwhile, I will also follow the HBase JIRA and hope the module is released soon.
Thank you!