Member since: 11-16-2015 · Posts: 905 · Kudos Received: 665 · Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 442 | 09-30-2025 05:23 AM |
|  | 777 | 06-26-2025 01:21 PM |
|  | 676 | 06-19-2025 02:48 PM |
|  | 863 | 05-30-2025 01:53 PM |
|  | 11441 | 02-22-2024 12:38 PM |
03-15-2017
11:38 PM
3 Kudos
Yes, it is possible with ExecuteScript if nothing else. Try the following Groovy script in your ExecuteScript processor (note the import of OutputStreamCallback, and the closing `</property>` tag needed for well-formed XML):

```groovy
import org.apache.nifi.processor.io.OutputStreamCallback

def flowFile = session.get()
if (!flowFile) return

class WriteCallback implements OutputStreamCallback {
    Map attrs
    WriteCallback(attributes) {
        attrs = attributes
    }
    void process(OutputStream outputStream) {
        outputStream.write('<root>\n'.bytes)
        attrs.each { k, v ->
            outputStream.write("<property>\n\t<name>$k</name>\n\t<value>$v</value>\n</property>\n".bytes)
        }
        outputStream.write('</root>'.bytes)
    }
}

def wb = new WriteCallback(flowFile.attributes)
flowFile = session.write(flowFile, wb)
flowFile = session.putAttribute(flowFile, org.apache.nifi.flowfile.attributes.CoreAttributes.MIME_TYPE.key(), 'application/xml')
session.transfer(flowFile, REL_SUCCESS)
```

This should pretty-print your attributes in a "properties-style" XML format. Of course, you can edit the script to give you whatever schema you like.
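For illustration, assuming each `<property>` element is closed with a matching `</property>` tag, a flow file whose attributes include `filename` and `path` (the values here are made up) would produce output along these lines:

```xml
<root>
<property>
	<name>filename</name>
	<value>test.txt</value>
</property>
<property>
	<name>path</name>
	<value>./data</value>
</property>
</root>
```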
03-15-2017
01:11 PM
1 Kudo
In addition to QueryDatabaseTable, you may be interested in the GenerateTableFetch processor. It is similar to QueryDatabaseTable except that it does not execute the SQL queries itself; instead it generates them and sends them out in flow files. This allows you to distribute the fetching in parallel across a NiFi cluster. In an upcoming release, GenerateTableFetch will accept incoming flow files, so you could enhance the workflow with the ListDatabaseTables processor, sending those tables to GenerateTableFetch and thus parallelizing the fetching of multiple pages of multiple tables.
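For a concrete sense of what GenerateTableFetch emits, the generated statements are roughly of the following form (the table name, column name, and page size here are hypothetical, and the exact paging syntax depends on the configured database type):

```sql
-- page 1: hypothetical "users" table, max-value column "id", page size 10000
SELECT * FROM users WHERE id <= 20000 ORDER BY id LIMIT 10000 OFFSET 0
-- page 2
SELECT * FROM users WHERE id <= 20000 ORDER BY id LIMIT 10000 OFFSET 10000
```

Each such query lands in its own flow file, so a downstream ExecuteSQL can run the pages concurrently across the cluster.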
03-15-2017
12:56 PM
1 Kudo
Currently NiFi does not support XLS as a format, but there has been a community contribution to add a ConvertExcelToCSV processor under NIFI-2613.
03-15-2017
12:52 PM
1 Kudo
For CSV files, if you know the number and type of the column values, you can use SplitText (to get one row per flow file) followed by ExtractText, supplying a regular expression to pull the column values out into flow file attributes. Then you can use ReplaceText to build a SQL INSERT statement manually (using NiFi Expression Language to access the attributes). For other formats like Avro, since we don't currently have a ConvertAvroToSQL processor, you'd have to convert them to another format for now. Work is underway on a generic system of type conversions, such that you could specify Avro as your input format and perhaps "SQL INSERT" as your output format, thereby effectively making the generic processor work like a ConvertAvroToSQL processor.
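As a sketch of the CSV approach for a hypothetical three-column file (the `csv` property name, the table name, and the column names are all assumptions):

```
ExtractText (dynamic property):
  csv = ^([^,]+),([^,]+),([^,]+)$      # yields attributes csv.1, csv.2, csv.3

ReplaceText:
  Replacement Strategy = Always Replace
  Replacement Value    = INSERT INTO mytable (name, age, city) VALUES ('${csv.1}', '${csv.2}', '${csv.3}')
```

Each row flow file then carries a ready-to-run INSERT statement, which you could send to a PutSQL processor.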
03-15-2017
12:46 PM
1 Kudo
Is your processor in its own NAR, or have you added it to a NiFi NAR (such as nifi-hive-bundle or nifi-hdfs-bundle)? If the former, have you added the nifi-hadoop-libraries NAR as a parent to your NAR? This will give you access to the Hadoop JARs/classes via a parent classloader. To add this NAR as a parent, add the following to the `<dependencies>` section of your custom processor's NAR module (not the processor module itself):

```xml
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-hadoop-libraries-nar</artifactId>
    <type>nar</type>
</dependency>
```

Can you describe your use case a little more? If your files are already in ORC format, you should be able to use PutHDFS to place them onto the Hadoop cluster. If they are in some other format, you might be able to use some conversion processors (including ConvertAvroToORC) and then PutHDFS to land the resultant ORC files into the cluster.
03-15-2017
12:41 PM
1 Kudo
According to this, Salesforce requires authentication via OAuth 2.0. Please see this HCC article for a discussion (and related links) on how to achieve it. That question is about the GetHttp processor, but the approach should apply to InvokeHttp as well.
03-13-2017
02:27 AM
1 Kudo
It does convert Avro to ORC, but ironically it does so by converting the Avro to JSON first. We should allow ORC (if prudent, possibly after a ConvertAvroToORC processor) and JSON as valid input formats (perhaps via the mime.type attribute?) to PutHiveStreaming. Do you mind writing up a Jira for this? Please and thank you 🙂
03-10-2017
08:48 PM
Try forming your Database Driver Location value as a URL, such as "file:///C:/path/to/sqljdbc4.jar"; I think folks have had trouble with Windows pathnames. As for a "dynamic" Database Driver Location value: that property accepts NiFi Expression Language. Of course there's no flow file available from which to grab attributes, but you can use the NiFi Variable Registry to set the value of that (and other) properties. This would allow you to change the values in one place, and/or to have different values for different environments (e.g., dev, test, production). Is that what you mean by not hard-coding the location?
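A minimal sketch of the Variable Registry approach (the custom properties file name and the `jdbc.driver.location` variable are illustrative):

```
# nifi.properties
nifi.variable.registry.properties=./conf/custom.properties

# conf/custom.properties
jdbc.driver.location=file:///C:/drivers/sqljdbc4.jar

# DBCPConnectionPool -> Database Driver Location:
${jdbc.driver.location}
```

Each environment (dev, test, production) gets its own custom.properties, so the flow itself never changes.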
03-08-2017
02:44 PM
I'm not familiar enough with Sqoop to know if they have any options that don't involve a max-value column. As you point out, if there's no way from a row to tell if it is "new", then you have to check the whole table.
03-08-2017
02:34 PM
Is there anything else in the logs after that, perhaps a "Caused by" section?