12-30-2019
05:53 AM
Thank you for the input, this was helpful advice!
12-30-2019
05:51 AM
Thank you, this worked perfectly!
12-27-2019
05:20 PM
I am a new NiFi and MongoDB user. I'm attempting to create an ingest process that first accepts a JSON file providing information about a project: project submission id, project name, project submitter, and project attachments (an array of documents). I am able to parse the JSON until I get to the array of project attachments. The project attachments field is an array with the names of one or more project documents (ex: attachment1.docx, attachment2.ppt, etc). The end goal is to put each project submission in MongoDB as a document. The project attachments field must be an array so I can reference the attachments later when parsing the metadata from each attachment file. I want to be able to nest the metadata for each attachment under the corresponding attachment file in the array.
The problem I am running into is that the final project attachments array becomes a string instead of an array after passing through the AttributesToJSON processor. The JSON looks fine until it reaches this processor; after AttributesToJSON, there are suddenly quotes around the array, like this: "file_attachment" : "[\"attachment1.docx\",\"attachment2.docx\",\"attachment3.docx\",\"attachment4.docx\"]"
Because of this, the PutMongoRecord processor stores the field in MongoDB as a string instead of an array.
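To illustrate the symptom outside NiFi, here is a minimal Python sketch. It assumes the cause is that flowfile attributes are plain strings, so the array is carried as JSON text and gets quoted and escaped when the attributes are serialized. The `submission_id` value and the `restore_arrays` helper are hypothetical and only for illustration; this is not NiFi code or a NiFi API.

```python
import json

# Assumption: attributes are string key/value pairs, so the array is held as JSON text.
attributes = {
    "submission_id": "42",  # hypothetical value, not from the original flow
    "file_attachment": '["attachment1.docx","attachment2.docx","attachment3.docx","attachment4.docx"]',
}

# Serializing the attribute values as-is yields a quoted, escaped string for the array.
as_strings = json.dumps(attributes)


def restore_arrays(attrs, array_fields):
    """Hypothetical helper: parse the named fields back into arrays, leave the rest as strings."""
    restored = dict(attrs)
    for name in array_fields:
        restored[name] = json.loads(attrs[name])
    return restored


# After parsing, file_attachment is a real list and serializes as a JSON array.
document = restore_arrays(attributes, ["file_attachment"])
as_document = json.dumps(document)
```

Printing `as_strings` and `as_document` side by side shows the difference: the first has the escaped string form, the second has a real JSON array that a document store could keep as an array.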
I'm using the following processors:
GetFile - read in json file
SplitJson
EvaluateJsonPath - destination = flowfile-attribute, return type = json, renaming some JSON fields as they are converted to flowfile attributes (submission id = $.id, abstract = $.abstract, file_attachment = $.data.[*] <--- could this be part of the problem?)
NOTE: after passing through this processor, file_attachment shows up like this in the attributes:
["attachment1.docx","attachment2.docx","attachment3.docx","attachment4.docx"]
AttributesToJSON - specify the flowfile attributes to write to the flowfile content. When viewing the flowfile content, the fields have converted correctly EXCEPT for file_attachment:
"file_attachment" : "[\"attachment1.docx\",\"attachment2.docx\",\"attachment3.docx\",\"attachment4.docx\"]"
Any idea how to get around this issue? Thank you for any suggestions!
Labels: Apache NiFi