Loading json format data to hbase using pyspark

Hi,

I have a use case where I need to load JSON data into HBase using PySpark, with a row key and three column families. Can anyone please help me with how to do this?

Below is the JSON I want to load.

 

{
  "ticid": "1496",
  "ticlocation": "vizag",
  "custnum": "222",
  "Comments": {
    "comment": [
      { "commentno": "1", "desc": "journey", "passengerseat": { "intele": "09" }, "passengerloc": { "intele": "s15" } },
      { "commentno": "5", "desc": " food", "passengerseat": { "intele": "09" }, "passengerloc": { "intele": "s15" } },
      { "commentno": "12", "desc": " service", "passengerseat": { "intele": "09" }, "passengerloc": { "intele": "s15" } }
    ]
  },
  "Rails": {
    "Rail": [
      { "Traino": "AP1545", "startcity": "vizag", "passengerseat": "5" },
      { "Traino": "AP1555", "startcity": "HYD", "passengerseat": "15A" }
    ]
  }
}

 

ticid is the row key.

ticlocation and custnum need to be in column family 1.

Comments needs to be in column family 2.

Rails needs to be in column family 3.
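
To make the intended layout concrete, here is a minimal sketch in plain Python of how one such record could be turned into HBase cells of the form (row key, column family, qualifier, value). The column family names cf1, cf2 and cf3, the qualifier naming scheme (comment_<commentno>, rail_<Traino>) and the choice to store each nested element as a JSON string are all assumptions made for illustration; none of them come from an existing table definition. The function name to_hbase_cells is likewise hypothetical.

```python
import json

# Hypothetical column family names; replace with the real table's families.
CF_TICKET, CF_COMMENTS, CF_RAILS = "cf1", "cf2", "cf3"


def to_hbase_cells(line):
    """Turn one JSON record into (row_key, family, qualifier, value) tuples."""
    rec = json.loads(line)
    row_key = rec["ticid"]

    # Column family 1: the flat ticket attributes.
    cells = [
        (row_key, CF_TICKET, "ticlocation", rec["ticlocation"]),
        (row_key, CF_TICKET, "custnum", rec["custnum"]),
    ]

    # Column family 2: one cell per comment, keyed by commentno (assumed scheme).
    for c in rec.get("Comments", {}).get("comment", []):
        qualifier = "comment_" + c["commentno"]
        cells.append((row_key, CF_COMMENTS, qualifier, json.dumps(c, sort_keys=True)))

    # Column family 3: one cell per rail entry, keyed by Traino (assumed scheme).
    for r in rec.get("Rails", {}).get("Rail", []):
        qualifier = "rail_" + r["Traino"]
        cells.append((row_key, CF_RAILS, qualifier, json.dumps(r, sort_keys=True)))

    return cells


# The record from the question, as a single JSON string.
sample = json.dumps({
    "ticid": "1496", "ticlocation": "vizag", "custnum": "222",
    "Comments": {"comment": [
        {"commentno": "1", "desc": "journey",
         "passengerseat": {"intele": "09"}, "passengerloc": {"intele": "s15"}},
        {"commentno": "5", "desc": " food",
         "passengerseat": {"intele": "09"}, "passengerloc": {"intele": "s15"}},
        {"commentno": "12", "desc": " service",
         "passengerseat": {"intele": "09"}, "passengerloc": {"intele": "s15"}},
    ]},
    "Rails": {"Rail": [
        {"Traino": "AP1545", "startcity": "vizag", "passengerseat": "5"},
        {"Traino": "AP1555", "startcity": "HYD", "passengerseat": "15A"},
    ]},
})

cells = to_hbase_cells(sample)
for cell in cells:
    print(cell)
```

In PySpark, assuming the input file holds one JSON record per line, the same function could be applied with something like sc.textFile(path).flatMap(to_hbase_cells). Writing the resulting tuples to HBase depends on which connector is available on the cluster, so that step is not shown here.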