
Converting a Json file to CSV in Pig

I thought this question had been asked before, but I couldn't find it.

I have a JSON file, processed with Pig, that I want to import into HBase.

The file looks like this:


I want to make a CSV file with the JSON keys as the header row and the values as the second row. I then want to populate my HBase table accordingly from other files that have the same headers but different values (date, serial, KPI, etc.).

Is this a correct approach, and how can I do it?

thank you


Re: Converting a Json file to CSV in Pig

I'm a bit confused about the exact format of the CSV file you are imagining, but I wouldn't try to make Pig produce a CSV file with a header row. It can, however, do a great job of producing a data-only CSV file, especially once your alias has the schema you want.
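A data-only CSV export might look like the following sketch (the field names and paths here are assumptions for illustration, not from the thread):

```pig
-- Load the parsed data with an explicit schema (names are assumed).
records = LOAD 'kpi_data' AS (date:chararray, serial:chararray, kpi:double);

-- PigStorage(',') writes one comma-delimited line per tuple:
-- a data-only CSV with no header row.
STORE records INTO 'kpi_csv' USING PigStorage(',');
```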

That said, unless your data is gigantic and you know something like ImportTsv's bulk import is the only thing that will perform at the scale you are imagining, I think a better strategy would be to use HBaseStorage in a STORE operation to insert the data into HBase directly.
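A minimal HBaseStorage sketch, assuming a table named 'kpi_table' with a column family 'cf' and a relation whose first field serves as the HBase row key (all of these names are assumptions):

```pig
-- First field of the relation becomes the HBase row key.
records = LOAD 'kpi_data' AS (rowkey:chararray, date:chararray, kpi:double);

-- HBaseStorage maps the remaining fields, in order, to the listed
-- column-family:qualifier pairs.
STORE records INTO 'hbase://kpi_table'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:date cf:kpi');
```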

There are plenty of resources out there if you choose that approach. I found one just now that shows a simple example of it. Good luck!


Re: Converting a Json file to CSV in Pig


@Erdal Kucuk

I would:

  1. Load the JSON file using Pig's native JSON loader (JsonLoader, built in since Pig 0.10).
  2. Then load into HBase as described here: (this describes how to use Pig to convert data sets into input for HBase loading)