Member since 11-16-2015 · 902 Posts · 664 Kudos Received · 249 Solutions
03-15-2017 12:52 PM · 1 Kudo
For CSV files, if you know the number and type of column values, you can use SplitText (to get one row per flow file) followed by ExtractText, supplying a regular expression to pull the column values out into flow file attributes. Then you can use ReplaceText to build a SQL INSERT statement manually (using NiFi Expression Language to access the attributes). For other formats like Avro, since there is currently no ConvertAvroToSQL processor, you would have to convert them to a supported format first. Work is underway on a generic system of type conversions, so that you could specify Avro as your input format and perhaps "SQL INSERT" as your output format, effectively making the generic processor work like a ConvertAvroToSQL processor.
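As a sketch (the regex, column count, attribute prefix, and table/column names here are hypothetical, not from the original), a three-column CSV row could be wired up like this:

```
ExtractText — one dynamic property; each capture group becomes an attribute
(csv.row.1, csv.row.2, csv.row.3):
  csv.row = ^([^,]+),([^,]+),([^,]+)$

ReplaceText — Replacement Value, using Expression Language to build the INSERT:
  INSERT INTO my_table (id, name, amount)
  VALUES ('${csv.row.1}', '${csv.row.2}', '${csv.row.3}')
```

The resulting flow file content can then be sent to PutSQL to execute the statement.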
03-15-2017 12:46 PM · 1 Kudo
Is your processor in its own NAR, or have you added it to an existing NiFi NAR (such as the nifi-hive-bundle or nifi-hdfs-bundle)? If the former, have you added the nifi-hadoop-libraries NAR as a parent to your NAR? This gives you access to the Hadoop JARs/classes via a parent classloader. To add this NAR as a parent, add the following to the <dependencies> section of your custom processor's NAR module (not the processor module itself):

```xml
<dependency>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-hadoop-libraries-nar</artifactId>
    <type>nar</type>
</dependency>
```

Can you describe your use case a little more? If your files are already in ORC format, you should be able to use PutHDFS to place them onto the Hadoop cluster. If they are in some other format, you might be able to use conversion processors (including ConvertAvroToORC) and then PutHDFS to land the resultant ORC files in the cluster.
03-15-2017 12:41 PM · 1 Kudo
According to this, Salesforce requires authentication via OAuth 2.0. Please see this HCC article for a discussion (and related links) on how to achieve this. That question is about the GetHTTP processor, but the approach should apply to InvokeHTTP as well.
03-13-2017 02:27 AM · 1 Kudo
It does convert Avro to ORC, but ironically it does so by converting the Avro to JSON first. We should allow ORC (if prudent, possibly after a ConvertAvroToORC processor) and JSON as valid input formats (perhaps via the mime.type attribute?) to PutHiveStreaming. Do you mind writing up a Jira for this? Please and thank you 🙂
03-10-2017 08:48 PM
Try forming your Database Driver Location value as a URL, such as "file:///C:/path/to/sqljdbc4.jar"; folks have had trouble with plain Windows pathnames. As for a "dynamic" Database Driver Location value: that property accepts NiFi Expression Language. Of course there is no flow file available from which to grab attributes, but you can use the NiFi Variable Registry to set the value of that (and other) properties. This lets you change the values in one place, and/or keep different values for different environments (e.g., dev, test, and production). Is that what you mean by not hard-coding the location?
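As a sketch of the Variable Registry approach (the file name and property key below are assumptions, not from the original): point nifi.properties at a custom properties file, define the location there once, and reference it via Expression Language.

```
# conf/nifi.properties
nifi.variable.registry.properties=./conf/custom.properties

# conf/custom.properties (hypothetical key)
jdbc.driver.location=file:///C:/path/to/sqljdbc4.jar
```

Then set Database Driver Location to ${jdbc.driver.location}; each environment supplies its own custom.properties.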
03-08-2017 02:44 PM
I'm not familiar enough with Sqoop to know whether it has any options that don't involve a max-value column. As you point out, if there's no way to tell from a row whether it is "new", then you have to check the whole table.
03-08-2017 02:34 PM
Is there anything else in the logs after that, perhaps a "Caused by" section?
03-07-2017 08:29 PM · 2 Kudos
NIFI-3413 describes a GetChangeDataCapture processor, which will generate events that correspond to database updates (e.g., rows added, tables altered). The hope is that it will support MySQL, Oracle, and SQL Server (although the first release of the processor may not). Development of the MySQL CDC support (via its binlog files) is underway.
03-07-2017 04:19 PM
You can use GenerateFlowFile to create 0-byte flow files. Recent versions of NiFi also let you supply dynamic properties to GenerateFlowFile, which means you can add a "filename" attribute set to "$(unknown).sem", where the filename part will be a timestamp/number. If your version of NiFi doesn't support dynamic properties in GenerateFlowFile, you can use UpdateAttribute afterward to do the same. Note that GenerateFlowFile runs continuously on the schedule you pick, so there isn't a concept of being "done" or of creating a "batch". Instead, you could stop the processor manually (or via the REST API) once the requisite number of flow files has been created. To write the files somewhere, use the PutFile processor.
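A minimal sketch of stopping a processor over the REST API (the host, port, processor ID, and revision version here are hypothetical placeholders; recent NiFi versions expose this run-status endpoint):

```shell
# Stop a processor by ID; the revision version must match the server's
# current revision for the component (fetch it first via GET if needed).
curl -X PUT "http://localhost:8080/nifi-api/processors/<processor-id>/run-status" \
  -H "Content-Type: application/json" \
  -d '{"revision": {"version": 0}, "state": "STOPPED"}'
```

A GET on /nifi-api/processors/&lt;processor-id&gt; returns the current revision to use.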
03-07-2017 02:51 PM
ReplaceText uses regular expressions, so if you can write a regex that will successfully match the field names you want, you should be able to use that. I usually avoid that approach for structured text like JSON, especially if I don't know what the input will look like. But if you know the format/structure, you may find that ReplaceText works fine for your needs.
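To illustrate the idea (the field names here are hypothetical), this is what a ReplaceText-style regex rename of a JSON key looks like, sketched with Python's re module:

```python
import re

# Rename a JSON field with a regex, mirroring what ReplaceText does in
# "Regex Replace" mode. Field names are hypothetical examples.
payload = '{"old_name": "value", "other": 1}'

# Match only the quoted key (preserving any whitespace before the colon),
# leaving the value untouched.
renamed = re.sub(r'"old_name"(\s*):', r'"new_name"\1:', payload)

print(renamed)  # {"new_name": "value", "other": 1}
```

As noted above, this is brittle for arbitrary JSON (a value could coincidentally contain the matched text), which is why a structure-aware approach is safer when the input format is unknown.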