Member since: 11-16-2015
Posts: 902
Kudos Received: 664
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 231 | 09-30-2025 05:23 AM |
| | 656 | 06-26-2025 01:21 PM |
| | 497 | 06-19-2025 02:48 PM |
| | 748 | 05-30-2025 01:53 PM |
| | 10964 | 02-22-2024 12:38 PM |
04-24-2017
03:26 PM
3 Kudos
Using JoltTransformJSON, you can inject the "key" and "theme" entries from p into each element of the array, and create a top-level array from it. Try the following Shift spec:

{
  "operation": "shift",
  "spec": {
    "s": {
      "*": {
        "@(2,p)": {
          "key": "[#3].key",
          "theme": "[#3].theme"
        },
        "*": "[#2].&"
      }
    }
  }
}

Given your input, it produces the following output:

[ {
  "key" : "k1",
  "theme" : "default",
  "x" : 1,
  "y" : "0.1"
}, {
  "key" : "k1",
  "theme" : "default",
  "x" : 2,
  "y" : "0.2"
} ]

Now you can use a SplitJson (with a JSON Path of $) to get the individual records. If you want to keep them as JSON, you're done; if you want to convert them to CSV, you'd need EvaluateJsonPath as @Timothy Spann mentioned, then ReplaceText with Expression Language to set the fields, something like "${key}, ${theme}, ${x}, ${y}". An alternative to Jolt, which Tim alluded to, is to use EvaluateJsonPath to get the p.* fields into attributes, then SplitJson, then EvaluateJsonPath to get the s.* fields into attributes, then ReplaceText (generating either JSON or CSV as described). In my opinion I'd use the Jolt transform and keep the content as JSON as long as possible, rather than generating attributes.
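For reference, the spec above assumes an input shaped roughly like the following (reconstructed here from the spec and the output; the original question's input may differ slightly):

{
  "p" : {
    "key" : "k1",
    "theme" : "default"
  },
  "s" : [
    { "x" : 1, "y" : "0.1" },
    { "x" : 2, "y" : "0.2" }
  ]
}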
04-21-2017
04:09 PM
You can get the NiFi-only version of HDF at https://hortonworks.com/downloads/#dataflow
04-21-2017
03:58 PM
2 Kudos
Besides the issue in NIFI-2828, there is also a Hive version incompatibility between Apache NiFi and HDP 2.5. Apache NiFi is built with Apache Hive 1.2 and its Apache Hadoop dependencies. However, HDP 2.5 has a "newer" version of Hive, based on 1.2 but with many additions, including a change to the Thrift messages sent between client and server. To fix this, I recommend using a Hortonworks Data Flow (HDF) version (2.1.0.0 or above) that is based on NiFi 1.1.0; HDF is built using HDP dependencies/versions for Hive and its Hadoop dependencies. Alternatively, you can build your own NiFi from source, using the "hortonworks" profile and overriding the hive.version and hive.hadoop.version properties:

mvn clean install -Phortonworks -Dhive.version=1.2.1000.2.5.0.0-1245 -Dhive.hadoop.version=2.7.3.2.5.0.0-1245 -DskipTests
04-20-2017
01:13 PM
3 Kudos
You can use EvaluateJsonPath to extract one particular value from the input JSON into an attribute, then use ReplaceText to create a SQL statement (using NiFi Expression Language and group referencing; see the ReplaceText documentation for more details) that refers to both the extracted attribute and the incoming flow file content. Then you can send that to PutSQL to insert into the database. For example, say you have the following JSON:

{
  "id": 100,
  "user" : {
    "name": "Joe Smith",
    "email": "jsmith@mycompany.com",
    "age": 42
  }
}

And you have a database table called "myTable" with two columns, "id" and "json". You could use EvaluateJsonPath to set an attribute "json.id" with a JSONPath expression of "$.id". Then ReplaceText could replace the whole content with something like the following:

INSERT INTO myTable VALUES (${json.id}, '$1')

I haven't tried this to see if it works as-is, but that is the basic approach. You may have to do something with escaping the JSON or quote characters.
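As a rough, untested sketch of that ReplaceText configuration (assuming the processor's default whole-content search pattern), the properties would look something like:

Replacement Strategy: Regex Replace
Evaluation Mode:      Entire text
Search Value:         (?s)(^.*$)
Replacement Value:    INSERT INTO myTable VALUES (${json.id}, '$1')

Here $1 is the back-reference to the matched flow file content (the original JSON), and ${json.id} is the attribute set by EvaluateJsonPath.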
04-10-2017
07:01 PM
I should've asked before: what is your use case where you'd need all the JSON fields as attributes and then convert back to JSON at the end using AttributesToJSON? If you have a JSON transformation to perform, please consider JoltTransformJSON; it is very powerful and can do the transformation(s) inline rather than moving the JSON fields to attributes and back.
04-10-2017
06:59 PM
A bit of a hack is to use ConvertJSONToSQL; this will add attributes such as sql.args.N.value (the value of your field) and sql.args.N.type (the JDBC SQL type of the value). Alternatively, in the Groovy script you can check the type of the variable v and set another attribute corresponding to the key k, such as k.type, that contains a data type identifier, as in the sketch below.
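A minimal, untested sketch of that check, assuming a JsonSlurper-style loop like the one in the script in the 04-08-2017 post below (the type labels here are illustrative, not a NiFi convention):

obj.each { k, v ->
    attrs[k] = v.toString()
    // record a simple type label alongside each value, e.g. "age.type" = "number"
    attrs[k + '.type'] = (v instanceof Number)  ? 'number'
                       : (v instanceof Boolean) ? 'boolean'
                       : 'string'
}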
04-08-2017
06:43 PM
11 Kudos
I second @Wynner 's comment about being cautious. If you determine that you still want all the JSON fields as attributes, you can do it all at once with ExecuteScript. Here is a Groovy script that expects a "flat" JSON file and turns all the fields into attributes:

import org.apache.commons.io.IOUtils
import java.nio.charset.*

def flowFile = session.get()
if (flowFile == null) {
    return
}

def slurper = new groovy.json.JsonSlurper()
def attrs = [:] as Map<String,String>

// Read the flow file content, parse it as JSON, and collect each top-level field
session.read(flowFile,
    { inputStream ->
        def text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        def obj = slurper.parseText(text)
        obj.each { k, v ->
            attrs[k] = v.toString()
        }
    } as InputStreamCallback)

// Copy the collected fields onto the flow file as attributes and route it to success
flowFile = session.putAllAttributes(flowFile, attrs)
session.transfer(flowFile, REL_SUCCESS)
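As a quick illustration (values made up), a flow file whose content is

{ "id": 100, "name": "Joe" }

would exit the script with the attributes id = 100 and name = Joe added to it. Note that nested JSON values are simply converted with toString(), which is why a "flat" JSON file is expected.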
04-06-2017
02:07 PM
Try the following:

- Route the "failure" relationship to some processor you don't start (like UpdateAttribute).
- When a flow file fails, right-click on the failure connection and choose List Queue.
- Click the info button (an "i" in a circle) to the left of the flow file, then choose View to see the contents.

Does that look like a valid SQL statement? Can you try running it from the command line or in a separate PutSQL instance? Also, since you are getting the error after some time, perhaps the connection is going idle and then failing. Is there any other error information around the error in the log file you mention above? Sometimes there is a "Caused by" line below it that refers to the lower-level error. What version of NiFi are you using? As of NiFi 1.1.0, you can supply a Validation Query to the DBCPConnectionPool, which ensures that the connection is valid. A Validation Query is a query that does as little work as possible while still establishing connectivity to the server; an example (for Oracle) is "select 1 from dual". If you have that property available to you, please try something like that as the Validation Query; it may alleviate this issue if it is intermittent and related to idle time.
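For databases other than Oracle, a similarly lightweight statement generally serves as the Validation Query; "SELECT 1" is the usual choice on MySQL or PostgreSQL, for example, though check your driver's documentation for the recommended form.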
03-30-2017
12:59 PM
1 Kudo
Try using the full path to powershell.exe as the command, rather than just "powershell.exe". It looks like the JVM doesn't have access to the same PATH variable that includes the location of that executable.
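For example, on a typical Windows installation that would be C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe (verify the exact location on your system).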
03-29-2017
03:26 PM
1 Kudo
An alternative to using an environment variable is to use the file-based Variable Registry. You could create a file, perhaps called var_registry.properties, with the entry:

PGPASSWORD=<my_pg_password>

And edit conf/nifi.properties to use that file for the registry:

nifi.variable.registry.properties=var_registry.properties

Then you could use PGPASSWORD in your Groovy script the same way you'd use an environment variable: ${PGPASSWORD}