Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Converting Nested Json into flat file using pig script

Highlighted

Converting Nested Json into flat file using pig script

Explorer

I am facing issues in converting nested json into flat file using pig. I am trying to process using various methods but none of them are giving me the exact solution. Below is the sample dataset and different methods I am using. The below method is printing only 1st 3 columns. Can someone provide me the exact solution that prints all the data.

{"id":"abcdefgh","key":"abcdefgh","value":{"rev":"1-cdfasefqwvc"},"doc":{"_id":"abcdefgh","_rev":"1-cdfasefqwvc","geometry":{"coordinates":[0.40799993419202753,51.5822858308476],"type":"Point"},"properties":{"frTraj":{"altitude":0,"customerLinkID":"12345","distance":138.7360491082252,"drivingDirection":1,"fixedLat":71.5823084728586,"fixedLon":0.19202754079399934,"heading":0,"latitude":81.582625,"linkID":"12345","longitude":0.824043,"moID":"123457","speed":0.2129800021648407,"tripID":"abbcdd-1234-abcd-a84a-abcd","ts":12345678},"packet":{"meta":{"account":"abcd","event":"track"},"payload":{"asset":"1234567","connection_id":12433453542,"connection_id_str":"325443543432","fields":{"GPS_SPEED":{"b64_value":"wsfwef","value":0.80000212900000003}},"id":443547204223,"id_str":"7443540707204223","index":4435403,"loc":[0.44435443,51.44354625],"received_at":"2022-04-22T14:39:14Z","recorded_at":"2012-04-22T14:37:56Z","recorded_at_ms":"2000-04-22T14:37:56.000Z","streams_received_at":"2022-04-22T14:39:14Z"}},"playId":"123456"},"type":"feature"}},
{"id":"abcdefgh","key":"abcdefgh","value":{"rev":"1-cdfasefqwvc"},"doc":{"_id":"abcdefgh","_rev":"1-cdfasefqwvc","geometry":{"coordinates":[0.40799993419202753,51.5822858308476],"type":"Point"},"properties":{"frTraj":{"altitude":0,"customerLinkID":"12345","distance":138.7360491082252,"drivingDirection":1,"fixedLat":71.5823084728586,"fixedLon":0.19202754079399934,"heading":0,"latitude":81.582625,"linkID":"12345","longitude":0.824043,"moID":"123457","speed":0.2129800021648407,"tripID":"abbcdd-1234-abcd-a84a-abcd","ts":12345678},"packet":{"meta":{"account":"abcd","event":"track"},"payload":{"asset":"1234567","connection_id":12433453542,"connection_id_str":"325443543432","fields":{"GPS_SPEED":{"b64_value":"wsfwef","value":0.80000212900000003}},"id":443547204223,"id_str":"7443540707204223","index":4435403,"loc":[0.44435443,51.44354625],"received_at":"2022-04-22T14:39:14Z","recorded_at":"2012-04-22T14:37:56Z","recorded_at_ms":"2000-04-22T14:37:56.000Z","streams_received_at":"2022-04-22T14:39:14Z"}},"playId":"123456"},"type":"feature"}}
a = LOAD '/user/rahul/cloudtest.json' USING JsonLoader ('id:chararray, key:chararray, value:(rev:chararray), doc:(id:chararray, rev:chararray,geometry:(coordinates: (i: chararray, j: chararray), type:chararray),properties:(frTraj:(altitude:chararray, customerLinkID:chararray, distance:chararray, drivingDirection:chararray, fixedLat:chararray, fixedLon:chararray, heading:chararray, latitude:chararray, linkID:chararray, longitude:chararray, moID:chararray, speed:chararray, tripID:chararray, ts:chararray),packet:(meta:(account:chararray, event:chararray), payload:(asset:chararray, connectionid:chararray, connectionidstr:chararray, fields:( GPSSPEED: (b64value: chararray, value: chararray)), id:chararray, idstr:chararray, index:chararray, loc:(i: chararray, j: chararray), receivedat:chararray, recordedat:chararray, recordedatms:chararray, streamsreceivedat:chararray)),playId:chararray),type:chararray)');
dump a;

Thanks in Advance.

-Rahul

8 REPLIES 8
Highlighted

Re: Converting Nested Json into flat file using pig script

Highlighted

Re: Converting Nested Json into flat file using pig script

Explorer

Hi Neeraj,

I already checked this document and its not working when i am trying with the same syntax that it is showing.

Thanks,

Rahul

Highlighted

Re: Converting Nested Json into flat file using pig script

@Rahul Reddy Kamuru Whats the error?

Re: Converting Nested Json into flat file using pig script

Explorer

I am getting syntax error. Unexpected symbol at or near 'chararray'.

Thanks,

Rahul

Highlighted

Re: Converting Nested Json into flat file using pig script

@Rahul Reddy Kamuru
Did you get any information on line number ? 
Unexpected symbol at or near 'chararray'.

a = LOAD '/user/rahul/cloudtest.json' USING JsonLoader ('id:chararray, key:chararray, value:(rev:chararray), doc:(id:chararray, rev:chararray,geometry:(coordinates: (i: chararray, j: chararray), type:chararray),properties:(frTraj:(altitude:chararray, customerLinkID:chararray, distance:chararray, drivingDirection:chararray, fixedLat:chararray, fixedLon:chararray, heading:chararray, latitude:chararray, linkID:chararray, longitude:chararray, moID:chararray, speed:chararray, tripID:chararray, ts:chararray),packet:(meta:(account:chararray, event:chararray), payload:(asset:chararray, connectionid:chararray, connectionidstr:chararray, fields:( GPSSPEED: (b64value: chararray, value: chararray)), id:chararray, idstr:chararray, index:chararray, loc:(i: chararray, j: chararray), receivedat:chararray, recordedat:chararray, recordedatms:chararray, streamsreceivedat:chararray)),playId:chararray),type:chararray)'); 
dump a;
Highlighted

Re: Converting Nested Json into flat file using pig script

Mentor

@Rahul Reddy Kamuru

Which version of Pig do you use?

I noticed some white spaces hence the syntax error. Unexpected symbol at or near 'chararray'.

eg (i: chararray, j: chararray)

fields:( GPSSPEED: (b64value: chararray, value: chararray))

fields:( GPSSPEED: (b64value: chararray, value: chararray))

Take note depending on the pig version it will e interpreted differently! Can you use Vi to remove the white space

Here is a quick snippet

First delete the blank lines:

:g/^\s*$/d

Then use a substitution (:s///) over each line (%) to replace all (g) continuous whitespace (\s\+) with a comma (,).

:%s/\s\+/,/g

Highlighted

Re: Converting Nested Json into flat file using pig script

Mentor

@Rahul Reddy Kamuru

Geoffrey is correct, you have blanks in your schema. Pasting the findings here as answer doesn't fit in comments section. Once blanks are removed, I get the following result

output of describe a;

a: {id: chararray,key: chararray,value: (rev: chararray),doc: (id: chararray,rev: chararray,geometry: (coordinates: (i: chararray,j: chararray),type: chararray),properties: (frTraj: (altitude: chararray,customerLinkID: chararray,distance: chararray,drivingDirection: chararray,fixedLat: chararray,fixedLon: chararray,heading: chararray,latitude: chararray,linkID: chararray,longitude: chararray,moID: chararray,speed: chararray,tripID: chararray,ts: chararray),packet: (meta: (account: chararray,event: chararray),payload: (asset: chararray,connectionid: chararray,connectionidstr: chararray,fields: (GPSSPEED: (b64value: chararray,value: chararray)),id: chararray,idstr: chararray,index: chararray,loc: (i: chararray,j: chararray),receivedat: chararray,recordedat: chararray,recordedatms: chararray,streamsreceivedat: chararray)),playId: chararray),type: chararray)}

output of dump a;

(abcdefgh,abcdefgh,(1-cdfasefqwvc),)
(abcdefgh,abcdefgh,(1-cdfasefqwvc),)

script without blanks

a = LOAD 'sample.json' USING JsonLoader ('id:chararray, key:chararray, value:(rev:chararray), doc:(id:chararray, rev:chararray,geometry:(coordinates: (i:chararray, j:chararray), type:chararray),properties:(frTraj:(altitude:chararray, customerLinkID:chararray, distance:chararray, drivingDirection:chararray, fixedLat:chararray, fixedLon:chararray, heading:chararray, latitude:chararray, linkID:chararray, longitude:chararray, moID:chararray, speed:chararray, tripID:chararray, ts:chararray),packet:(meta:(account:chararray, event:chararray), payload:(asset:chararray, connectionid:chararray, connectionidstr:chararray, fields:( GPSSPEED:(b64value:chararray, value:chararray)), id:chararray, idstr:chararray, index:chararray, loc:(i:chararray, j:chararray), receivedat:chararray, recordedat:chararray, recordedatms:chararray, streamsreceivedat:chararray)),playId:chararray),type:chararray)');
describe a;
dump a;
Highlighted

Re: Converting Nested Json into flat file using pig script

New Contributor
https://www.welookups.com
  1. a = LOAD 'sample.json' USING JsonLoader('id:chararray, key:chararray, value:(rev:chararray), doc:(id:chararray, rev:chararray,geometry:(coordinates: (i:chararray, j:chararray), type:chararray),properties:(frTraj:(altitude:chararray, customerLinkID:chararray, distance:chararray, drivingDirection:chararray, fixedLat:chararray, fixedLon:chararray, heading:chararray, latitude:chararray, linkID:chararray, longitude:chararray, moID:chararray, speed:chararray, tripID:chararray, ts:chararray),packet:(meta:(account:chararray, event:chararray), payload:(asset:chararray, connectionid:chararray, connectionidstr:chararray, fields:( GPSSPEED:(b64value:chararray, value:chararray)), id:chararray, idstr:chararray, index:chararray, loc:(i:chararray, j:chararray), receivedat:chararray, recordedat:chararray, recordedatms:chararray, streamsreceivedat:chararray)),playId:chararray),type:chararray)');
  2. describe a;
  3. dump a;
,
www.welookups.com

a = LOAD 'sample.json' USING JsonLoader ('id:chararray, key:chararray, value:(rev:chararray), doc:(id:chararray, rev:chararray,geometry:(coordinates: (i:chararray, j:chararray), type:chararray),properties:(frTraj:(altitude:chararray, customerLinkID:chararray, distance:chararray, drivingDirection:chararray, fixedLat:chararray, fixedLon:chararray, heading:chararray, latitude:chararray, linkID:chararray, longitude:chararray, moID:chararray, speed:chararray, tripID:chararray, ts:chararray),packet:(meta:(account:chararray, event:chararray), payload:(asset:chararray, connectionid:chararray, connectionidstr:chararray, fields:( GPSSPEED:(b64value:chararray, value:chararray)), id:chararray, idstr:chararray, index:chararray, loc:(i:chararray, j:chararray), receivedat:chararray, recordedat:chararray, recordedatms:chararray, streamsreceivedat:chararray)),playId:chararray),type:chararray)');describe a;dump a;










web tutorial
Don't have an account?
Coming from Hortonworks? Activate your account here