Member since
07-30-2019
05-31-2022 06:54 AM
Take a look at the capabilities of Apache Atlas. Depending on your specific needs, it may be a good fit: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/howto-governance.html
11-16-2019 02:19 PM
If the original input data (before your Jolt transform) looks something like this:

    {
      "headers": [
        "header1",
        "header2",
        "header3"
      ],
      "rows": [
        [
          "row1",
          "row2",
          "row3"
        ],
        [
          "row4",
          "row5",
          "row6"
        ]
      ]
    }

then it may be easier to use the SplitJson and EvaluateJsonPath processors in this scenario. The GenerateFlowFile processor has some sample data that matches the above format, using books as the example. The result will be a separate JSON file for each array within "rows". You can configure MergeContent to bundle the JSON records as needed.

[Screenshots of the processor configurations, in order: EvaluateJsonPath, SplitJson, EvaluateJsonPath, ReplaceText]
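To make the intent of that flow concrete, here is a rough Python sketch of the same idea outside NiFi. It is not the NiFi configuration itself: the function name `split_rows` is made up for illustration, and pairing each row with the header names is an assumption about the output you want. It only mirrors the steps: read "headers" once (what EvaluateJsonPath would do with a path like `$.headers`), then emit one JSON document per array in "rows" (what SplitJson would do with `$.rows`).

```python
import json


def split_rows(document: str) -> list[str]:
    """Return one JSON string per array in "rows" (hypothetical helper)."""
    data = json.loads(document)
    headers = data["headers"]  # analogous to EvaluateJsonPath on $.headers
    records = []
    for row in data["rows"]:  # analogous to SplitJson on $.rows
        # Assumption: each row value should be keyed by the header
        # at the same position.
        records.append(json.dumps(dict(zip(headers, row))))
    return records


sample = """
{
  "headers": ["header1", "header2", "header3"],
  "rows": [
    ["row1", "row2", "row3"],
    ["row4", "row5", "row6"]
  ]
}
"""

for record in split_rows(sample):
    print(record)
```

Each printed line corresponds to one of the split flow files; bundling them back together is the part MergeContent would handle in the flow.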