PySpark to HBase

Hello,

I am using this code to insert data into HBase, but I get the error shown below.

Any help, please? 🙂

[Screenshot of the error: 15278-capture-du-2017-05-10-00-46-48.png]

from pyspark import SparkContext
from pyspark.sql import SQLContext
sc = SparkContext()
sqlc = SQLContext(sc)
data_source_format = "org.apache.spark.sql.execution.datasources.hbase"
df = sc.parallelize([('a', '1.0'), ('b', '2.0')]).toDF(schema=['col0', 'col1'])
# ''.join(s.split()) strips all whitespace, so the multi-line JSON string below becomes a single line.
catalog = ''.join("""{
    "table":{"namespace":"default", "name":"zaz"},
    "rowkey":"key",
    "columns":{
        "col0":{"cf":"rowkey", "col":"key", "type":"string"},
        "col1":{"cf":"cf", "col":"col1", "type":"string"}
    }
}""".split())
host = 'sandbox.hortonworks.com'
table = 'test'
conf = {"hbase.zookeeper.quorum": host,
        "hbase.mapred.outputtable": table}
# Writing
df.write.options(catalog=catalog).format(data_source_format).save()
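
For reference, the catalog string can be checked on its own, without Spark or HBase. The sketch below is not part of the original code: it builds the same catalog with json.dumps (the names catalog_dict and stripped are only illustrative) and compares it with the whitespace-stripping version. It assumes the connector only cares about the JSON content, not its layout.

```python
import json

# Same catalog as in the post, written as a Python dict.
# Assumption: the connector only needs the resulting JSON text.
catalog_dict = {
    "table": {"namespace": "default", "name": "zaz"},
    "rowkey": "key",
    "columns": {
        "col0": {"cf": "rowkey", "col": "key", "type": "string"},
        "col1": {"cf": "cf", "col": "col1", "type": "string"},
    },
}

# json.dumps returns a single-line JSON string and leaves any
# whitespace inside the values untouched.
catalog = json.dumps(catalog_dict)

# The whitespace-stripping approach from the post, for comparison.
stripped = ''.join("""{
    "table":{"namespace":"default", "name":"zaz"},
    "rowkey":"key",
    "columns":{
        "col0":{"cf":"rowkey", "col":"key", "type":"string"},
        "col1":{"cf":"cf", "col":"col1", "type":"string"}
    }
}""".split())

# Both strings should parse to the same structure.
same = json.loads(catalog) == json.loads(stripped)
print(same)
```

One difference between the two approaches: ''.join(s.split()) also removes spaces inside values, so a table or column name containing a space would be altered, while json.dumps keeps it as written.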