Member since: 11-16-2015
Posts: 905
Kudos Received: 666
Solutions: 249
04-24-2017
05:44 PM
2 Kudos
The "Out" number is the (5 minute rolling window) amount of data (count of flow files / size of flow files) that the processor has transferred (not that is queued). Check the Anatomy of a Processor section of the NiFi User's Guide, it has explanations of the statistics and other indicators on a processor.
04-24-2017
04:54 PM
1 Kudo
If the JSON content is not too large to fit in memory, you could use ExecuteScript for this: Groovy has an XmlSlurper that can parse your XML clob, and a JsonSlurper (and JsonOutput) that can read/write JSON as objects. For example, given the input:
{
  "key": "k1",
  "clob": "<root><attribute><name>attr1</name><value>Hello</value></attribute><attribute><name>attr2</name><value>World!</value></attribute></root>"
}
You could use the following Groovy script in ExecuteScript:
import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.*
import groovy.json.*
import groovy.util.*
def flowFile = session.get()
if (!flowFile) return
try {
flowFile = session.write(flowFile,
{ inputStream, outputStream ->
def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
// Parse JSON into object
def json = new JsonSlurper().parseText(text)
// Parse XML from clob field into object
def xml = new XmlSlurper().parseText(json.clob)
// Add a field to the JSON for each "attribute" tag in the XML
xml.attribute.each { a ->
json[a.name.toString()] = a.value.toString()
}
// Remove the clob field
json.remove('clob')
// Write the updated JSON object as the flow file content
outputStream.write(JsonOutput.prettyPrint(JsonOutput.toJson(json)).getBytes(StandardCharsets.UTF_8))
} as StreamCallback)
flowFile = session.putAttribute(flowFile, "filename", flowFile.getAttribute('filename').tokenize('.')[0]+'_with_clob_fields.json')
session.transfer(flowFile, REL_SUCCESS)
} catch(Exception e) {
log.error('Error extracting XML fields into JSON', e)
session.transfer(flowFile, REL_FAILURE)
}
For the given input, it generates the following output:
{
"attr1": "Hello",
"attr2": "World!",
"key": "k1"
}
The script extracts the XML text from the "clob" field, parses it into an object with XmlSlurper, finds the individual "attribute" tags within, and adds each name/value pair to the original JSON object. For cases where the clob is not too large, it might be helpful to have an "xPath()" or "xmlPath()" function in NiFi Expression Language (like the jsonPath() function added in NIFI-1660). Please feel free to file a Jira case to add this feature.
04-24-2017
04:33 PM
2 Kudos
If a single flow file contains an array and you want to manipulate values within, then @Andy LoPresto's solution is recommended. From your comment on his answer it appears you want to compute the average across multiple flow files.

From a flow perspective, how would you know when you were "done" calculating the average? Will you have a running average that is calculated from sum-so-far and count-so-far? Or do you want to take X flow files in, calculate the average, then output the X flow files (or perhaps a single one) with the average for those X flow files?

NiFi 1.2.0 (having implemented NIFI-1582) will include the ability to store and calculate state using UpdateAttribute. This can be used to maintain "sum" and "count" attributes, which at any given point would let you calculate the running average. In the meantime (or alternatively), you could use ExecuteScript or InvokeScriptedProcessor to perform this same function. It would be similar to Andy's approach, but would also store the sum-so-far and count-so-far into the processor's State Map. If you are calculating a running average and want to output each flow file as it comes in (adding a "current average" attribute, for example), you can use ExecuteScript, as in the sketch below. If you want to keep the incoming flow files until a total average can be calculated, then you'd need InvokeScriptedProcessor.
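As a rough, untested illustration of the ExecuteScript route, here is a minimal Groovy sketch. It assumes each incoming flow file's content is a single number, and that the processor's State Map is reachable from the script via context.getStateManager(); the attribute and state key names are just placeholders:
import org.apache.commons.io.IOUtils
import java.nio.charset.*
import org.apache.nifi.components.state.Scope
import org.apache.nifi.processor.io.InputStreamCallback
def flowFile = session.get()
if (!flowFile) return
try {
    // Read the numeric value from the flow file content (assumed to be a plain number)
    def value = 0d
    session.read(flowFile, { inputStream ->
        value = Double.parseDouble(IOUtils.toString(inputStream, StandardCharsets.UTF_8).trim())
    } as InputStreamCallback)
    // Fetch sum-so-far and count-so-far from the State Map, update them, and store them back
    def stateManager = context.stateManager
    def oldState = stateManager.getState(Scope.LOCAL)
    def sum = (oldState.get('sum') ?: '0') as Double
    def count = (oldState.get('count') ?: '0') as Long
    sum += value
    count++
    stateManager.setState([sum: sum.toString(), count: count.toString()], Scope.LOCAL)
    // Add the current running average as an attribute and pass the flow file through
    flowFile = session.putAttribute(flowFile, 'running.average', (sum / count).toString())
    session.transfer(flowFile, REL_SUCCESS)
} catch (Exception e) {
    log.error('Error calculating running average', e)
    session.transfer(flowFile, REL_FAILURE)
}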
04-24-2017
03:26 PM
3 Kudos
Using JoltTransformJSON, you can inject the "key" and "theme" entries from p into the array, and create a top-level array from it. Try the following Shift spec:
{
"operation": "shift",
"spec": {
"s": {
"*": {
"@(2,p)": {
"key": "[#3].key",
"theme": "[#3].theme"
},
"*": "[#2].&"
}
}
}
}
Given your input, it produces the following output:
[ {
"key" : "k1",
"theme" : "default",
"x" : 1,
"y" : "0.1"
}, {
"key" : "k1",
"theme" : "default",
"x" : 2,
"y" : "0.2"
} ]
Now you can use SplitJson (with a JsonPath expression of $) to get the individual records. If you want to keep them as JSON then you're done; if you want to convert to CSV, you'd need EvaluateJsonPath like @Timothy Spann mentioned, then ReplaceText with Expression Language to set the fields, something like "${key}, ${theme}, ${x}, ${y}" (a rough sketch of those settings is below). An alternative to Jolt, which Tim alluded to, is to use EvaluateJsonPath to get the p.* fields into attributes, then SplitJson, then EvaluateJsonPath to get the s.* fields into attributes, then ReplaceText (generating either JSON or CSV as described). In my opinion I'd use the Jolt transform and keep the content as JSON as long as possible, rather than moving it into attributes.
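For the CSV route, an untested sketch of the two processors after SplitJson might look like the following; the dynamic property names are just assumptions matching the JSON above:
EvaluateJsonPath (Destination = flowfile-attribute), with dynamic properties:
  key   = $.key
  theme = $.theme
  x     = $.x
  y     = $.y
ReplaceText:
  Replacement Strategy = Always Replace
  Replacement Value    = ${key}, ${theme}, ${x}, ${y}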
04-21-2017
04:09 PM
You can get the NiFi-only version of HDF at https://hortonworks.com/downloads/#dataflow
04-21-2017
03:58 PM
2 Kudos
Besides the issue in NIFI-2828, there is also a Hive version incompatibility between Apache NiFi and HDP 2.5. Apache NiFi is built with Apache Hive 1.2 and its Apache Hadoop dependencies. However, HDP 2.5 has a "newer" version of Hive, based on 1.2 but with many additions, including a change to the Thrift messages sent between client and server. To get around this, I recommend using a Hortonworks Data Flow (HDF) version (2.1.0.0 or above) that is based on NiFi 1.1.0; HDF is built using HDP dependencies/versions for Hive and its Hadoop dependencies. Alternatively, you can build your own NiFi from source, using the "hortonworks" profile and overriding the hive.version and hive.hadoop.version properties:
mvn clean install -Phortonworks -Dhive.version=1.2.1000.2.5.0.0-1245 -Dhive.hadoop.version=2.7.3.2.5.0.0-1245 -DskipTests
04-20-2017
01:13 PM
3 Kudos
You can use EvaluateJsonPath to extract one particular value of the input JSON into an attribute, then use ReplaceText to create a SQL statement (using NiFi Expression Language and group referencing; see the ReplaceText documentation for more details) that refers to both the extracted attribute and the incoming flow file content. Then you can send that to PutSQL to insert into the database. For example, if you have the following JSON:
{
"id": 100,
"user" : {
"name": "Joe Smith",
"email": "jsmith@mycompany.com",
"age": 42
}
}
And you have a database table called "myTable" with two columns, "id" and "json". Then you could use EvaluateJsonPath to set an attribute "json.id" with a JsonPath expression of "$.id". Then the ReplaceText could replace the whole content with something like the following:
INSERT INTO myTable VALUES (${json.id}, '$1')
I haven't tried this to see if it works as-is, but that is the basic approach. You may have to do something about escaping the JSON or quote characters.
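As an untested sketch of the ReplaceText settings this implies (the Search Value is a "match everything" regex so that $1 captures the entire incoming JSON content):
Search Value         = (?s)(^.*$)
Replacement Value    = INSERT INTO myTable VALUES (${json.id}, '$1')
Replacement Strategy = Regex Replace
Evaluation Mode      = Entire text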
04-10-2017
07:01 PM
I should've asked before: what is your use case where you'd need all the JSON fields as attributes and then convert back to JSON at the end using AttributesToJSON? If you have a JSON transformation to perform, please consider JoltTransformJSON; it is very powerful and can do the transformation(s) inline rather than moving the JSON fields to attributes and back.
04-10-2017
06:59 PM
A bit of a hack is to use ConvertJSONToSQL; this will add attributes such as sql.args.N.value (the value of your field) and sql.args.N.type (the JDBC SQL type of the value). Alternatively, in the Groovy script you can check the type of the variable v and set another attribute corresponding to the key k (such as k.type) that contains a data type identifier.
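As a minimal sketch of that second idea (plain Groovy here, outside NiFi; in the script the same check would populate a "<key>.type" attribute next to each "<key>" attribute, and the sample JSON is just an illustration):
import groovy.json.JsonSlurper

def obj = new JsonSlurper().parseText('{"id": 100, "name": "Joe", "active": true}')
obj.each { k, v ->
    // Use the Groovy/Java class name as a simple data type identifier
    def type = (v == null) ? 'null' : v.getClass().simpleName   // e.g. Integer, String, Boolean
    println "$k = $v ($type)"
}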
04-08-2017
06:43 PM
11 Kudos
I second @Wynner's comment about being cautious. If you determine that you still want all the JSON fields as attributes, you can do it all at once with ExecuteScript. Here is a Groovy script that expects a "flat" JSON file and turns all the fields into attributes:
import org.apache.commons.io.IOUtils
import org.apache.nifi.processor.io.InputStreamCallback
import java.nio.charset.*
def flowFile = session.get()
if (!flowFile) return
def slurper = new groovy.json.JsonSlurper()
def attrs = [:] as Map<String,String>
// Read the flow file content, parse it as JSON, and copy each top-level field into the attribute map
session.read(flowFile,
    { inputStream ->
        def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        def obj = slurper.parseText(text)
        obj.each { k, v ->
            attrs[k] = v.toString()
        }
    } as InputStreamCallback)
flowFile = session.putAllAttributes(flowFile, attrs)
session.transfer(flowFile, REL_SUCCESS)