Member since: 07-08-2016
Posts: 260
Kudos Received: 44
Solutions: 10
My Accepted Solutions
Views | Posted
---|---
2006 | 05-02-2018 06:03 PM
4020 | 10-18-2017 04:02 PM
1256 | 08-25-2017 08:59 PM
1760 | 07-21-2017 08:13 PM
7166 | 04-06-2017 09:54 PM
03-20-2018
03:47 PM
1 Kudo
Hi, I am reading a JSON file, splitting it into rows with the SplitJson processor, and then trying to convert the rows to Avro using ConvertRecord or ConvertJSONToAvro. I supplied a schema with all the possible columns. Records convert fine when they contain all the fields/elements, but if a record is missing an element it is routed to failure. How do you solve this? Not all incoming records will have every field in the schema.
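One common way to let records omit fields is to declare every optional field in the Avro schema as a union with "null" and give it a null default. The following is a minimal sketch of that idea, assuming the failures come from fields that the schema treats as required. The record name and the field names (`id`, `name`, `email`) are hypothetical placeholders, not taken from the post, and the helper `nullable_schema` is invented for illustration.

```python
import json


def nullable_schema(record_name, fields):
    """Build an Avro record schema in which every field may be absent.

    `fields` is a list of (field_name, avro_type) pairs.
    """
    return {
        "type": "record",
        "name": record_name,
        "fields": [
            # "null" is listed first in the union so that a null default is valid.
            {"name": field_name, "type": ["null", avro_type], "default": None}
            for field_name, avro_type in fields
        ],
    }


# Hypothetical field list; replace with the real columns.
schema = nullable_schema(
    "ExampleRecord",
    [("id", "long"), ("name", "string"), ("email", "string")],
)

# The JSON text is what would be pasted into the schema property.
schema_text = json.dumps(schema, indent=2)
print(schema_text)
```

Whether this resolves the routing to failure depends on the reader and writer configuration, so it is worth testing on a record that is known to be missing a field.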
Labels: Apache NiFi
03-13-2018
08:13 PM
Hi @Alireza Sadeghi, were you able to solve the issue? If so, how? I am running into the same issue: it works if I use the SplitJson --> ConvertJSONToAvro processors, but I hit the same issue as you when I use ConvertJSONToAvro directly with the Record Schema property set.
03-01-2018
08:56 PM
@Bryan Bende It looks like I have the option of using SegmentContent and MergeContent, applying ReplaceText to each segment in between. I tried it with a 10 MB segment size on a 120 MB file and it worked. Now I will try it on the bigger file.
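One thing to watch with fixed-size segments is that a segment boundary can fall inside a number such as 436 796, in which case a per-segment replacement may never see the digits on both sides of the gap. The sketch below shows, outside NiFi and in plain Python, one way to process a large text stream in chunks while holding back any trailing digits and whitespace until the next chunk arrives. The function name `clean_stream`, the pattern name `GAP`, and the chunk size are assumptions made for illustration.

```python
import io
import re

# Whitespace that sits directly between two digits.
GAP = re.compile(r"(?<=\d)\s+(?=\d)")


def clean_stream(src, dst, chunk_size=1024 * 1024):
    """Copy text from src to dst, removing whitespace between digits.

    Only one chunk plus a small carry-over is held in memory at a time.
    """
    carry = ""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        text = carry + chunk
        # Trailing digits and whitespace might continue in the next chunk,
        # so hold them back instead of writing them now.
        keep = len(text.rstrip("0123456789 \t\r\n"))
        carry = text[keep:]
        dst.write(GAP.sub("", text[:keep]))
    # Flush whatever was held back at the end of the input.
    dst.write(GAP.sub("", carry))


source = io.StringIO('{"Customer_Id":436 796,"Brand":[]}')
target = io.StringIO()
clean_stream(source, target, chunk_size=17)  # boundary lands inside the number
print(target.getvalue())
```

For a real file the same function can be called with two handles from `open(..., encoding="utf-8")`. The carry-over stays small as long as the input does not contain very long runs made up only of digits and whitespace.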
03-01-2018
07:56 PM
But even if it is treating the file as one line, I have 16 GB of Java heap and the file is only 7.5 GB. It seems I get that error after ReplaceText has read 2 GB (see the screenshot above).
03-01-2018
07:49 PM
@Bryan Bende I think that is what is happening there. What options do I have now?
03-01-2018
05:38 PM
@Bryan Bende It looks like ReplaceText is not streaming; I am getting out-of-memory errors. I am trying this with a 7.5 GB file.

11:33:19 CST ERROR ddf8b120-0161-1000-a2c4-c410a98382c1 ReplaceText[id=ddf8b120-0161-1000-a2c4-c410a98382c1] ReplaceText[id=ddf8b120-0161-1000-a2c4-c410a98382c1] failed to process due to java.lang.OutOfMemoryError; rolling back session: java.lang.OutOfMemoryError
11:33:19 CST ERROR ddf8b120-0161-1000-a2c4-c410a98382c1 ReplaceText[id=ddf8b120-0161-1000-a2c4-c410a98382c1] ReplaceText[id=ddf8b120-0161-1000-a2c4-c410a98382c1] failed to process session due to java.lang.OutOfMemoryError: java.lang.OutOfMemoryError
03-01-2018
05:13 PM
@Bryan Bende I will try it and see how it goes. But is there a better approach? If I use the JSON record processors, I think they will fail to read records that have problems like the row mentioned above.
03-01-2018
04:27 PM
@Bryan Bende, thanks for your inputs. This huge JSON file (22 GB) that we received needs to be cleaned up. We have notified the source about the issue and asked them to clean it up before they send it to us, but in the meantime I was trying to see whether I can use NiFi to fix it. As it is, the file does not open in any JSON viewer or reporting tool because it has whitespace between the digits of some integers. For example, look at the second Customer_Id record, where the id is 436 796:

[{"Customer_Id":236768,"Brand":[{"Brand_Nm":"abcc","Question":[{"Question_Txt":"The type of food prefers is ...","Answer":[{"Answer_Txt":"Combination of Both","Response":[{"Response_Dt":"2017-04-10T18:33:40"}]}]},{"Question_Txt":"My favorite is ...","Answer":[{"Answer_Txt":"Over 17 years old","Response":[{"Response_Dt":"2016-12-07T03:43:57"}]}]}]}]},{"Customer_Id":436 796,"Brand":[{"Brand_Nm":"edfr","Question":[{"Question_Txt":"At the end of the day I am greeted with ...","Answer":[{"Answer_Txt":"kisses","Response":[{"Response_Dt":"2017-04-12T01:54:12"}]}]}]}]}

I am trying to read this as a text file and replace those. I used a simple GetFile --> ReplaceText --> PutFile flow on a 150 MB file, which worked fine and took a couple of minutes. I am just being cautious with the big file.
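The kind of replacement described here can be sketched with a regular expression that matches whitespace only when it sits directly between two digits. The example below is an illustration in Python rather than the actual ReplaceText configuration; the pattern and the shortened sample record are assumptions built from the record shown above.

```python
import json
import re

# Whitespace with a digit immediately before and after it.
GAP = re.compile(r"(?<=\d)\s+(?=\d)")

# Shortened version of the sample: the second Customer_Id has a stray space.
broken = (
    '[{"Customer_Id":236768,"Brand":[{"Brand_Nm":"abcc"}]},'
    '{"Customer_Id":436 796,"Brand":[{"Brand_Nm":"edfr"}]}]'
)

fixed = GAP.sub("", broken)

# The cleaned text now parses as JSON.
records = json.loads(fixed)
print(records[1]["Customer_Id"])  # prints 436796
```

A caveat with this pattern is that it does not know about JSON structure, so it would also join digit groups inside string values (for example "10 20" would become "1020"). Timestamps such as 2017-04-10T18:33:40 contain no whitespace and are left unchanged.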
03-01-2018
02:28 PM
Hi, can anyone share their experiences with processing files bigger than the JVM heap?