Member since: 11-16-2015
Posts: 892
Kudos Received: 649
Solutions: 245
My Accepted Solutions
Views | Posted |
---|---|
5440 | 02-22-2024 12:38 PM |
1362 | 02-02-2023 07:07 AM |
3045 | 12-07-2021 09:19 AM |
4170 | 03-20-2020 12:34 PM |
14032 | 01-27-2020 07:57 AM |
02-20-2019
02:55 PM
1 Kudo
Is this a standard format? I know it's not proper JSON, but it may be some other standard format. If not, you could write a ScriptedReader to parse your records. If all the data looks like the above, though, you could use ReplaceText to replace ( with [ and ) with ], and to remove the "u" prefix from some strings (like the address field). That converts it to proper JSON, so you can then use components such as JsonTreeReader.
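To make the ReplaceText idea concrete, here is a minimal Python sketch of the same substitutions. The sample record is invented for illustration (the data from the question isn't reproduced here), and the approach is deliberately naive: it would also rewrite parentheses, or a word "u" directly before a closing quote, that appear inside string values.

```python
import json
import re

def to_json_text(record_text):
    """Naively convert a Python-repr-like record into JSON text."""
    # Drop the "u" prefix that precedes a quoted string, e.g. u"abc" -> "abc"
    text = re.sub(r'\bu(?=")', "", record_text)
    # Parenthesized tuples become JSON arrays
    return text.replace("(", "[").replace(")", "]")

# Hypothetical sample record, not taken from the original question
sample = '{"id": 7, "address": u"12 Main St", "coords": (1.5, 2.5)}'
parsed = json.loads(to_json_text(sample))
```

In a flow, the same patterns would go into the Search Value and Replacement Value properties of ReplaceText, ahead of a record reader such as JsonTreeReader.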
02-20-2019
02:51 PM
1 Kudo
AFAIK a thread can only be terminated like that manually; perhaps someone right-clicked on the processor and chose Terminate? A terminated thread is really an "interrupted" thread. Once it has been interrupted it should close gracefully, but I don't believe there is any such guarantee. In any case, the processor should continue to run successfully even with terminated threads, although there may be an underlying issue behind why someone terminated the thread to begin with (an infinite timeout, e.g.).
02-15-2019
06:43 PM
It's possible (but would be quite unfortunate) that there was a Thrift change between 1.2.1 and 1.2.2; quite a few cases went into 1.2.2.
02-15-2019
06:35 PM
1 Kudo
Yes, we currently build with JDK 8, so we get Nashorn "for free". This will be the case when building/shipping with Java 9 and 10 as well. However, Nashorn is deprecated in JDK 11 (but we'll allow it as an option as long as the ScriptEngine is still available in the JRE). We could consider adding support for Google V8 (perhaps via this library, although I'm not sure about licensing or native target) or others.
02-14-2019
06:28 PM
1 Kudo
If your ID values can come in out of order, then the ID column is not a good choice for a Maximum Value Column. Usually timestamps or always-increasing values are used as Maximum Value Columns. If you don't have a column of this type, how would you know whether a row was new or not? In the worst case you could keep a duplicate copy of the table as GenerateTableFetch/QueryDatabaseTable knows about it, then do a JOIN against the current table to find the new rows, but that is very resource-intensive and does not scale well at all.
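A small Python model shows why out-of-order IDs are a problem. This is only a simplified sketch of "remember the largest value seen so far", not NiFi's actual implementation; `fetch_new` and the `state` dictionary are hypothetical names.

```python
def fetch_new(rows, state):
    """Return rows whose id exceeds the stored maximum, then advance the maximum."""
    new_rows = [r for r in rows if r["id"] > state["max_id"]]
    if new_rows:
        state["max_id"] = max(r["id"] for r in new_rows)
    return new_rows

state = {"max_id": 0}
table = [{"id": 1}, {"id": 2}, {"id": 5}]
first = fetch_new(table, state)   # picks up ids 1, 2 and 5

# Two rows arrive later: id 3 is below the stored maximum, id 6 is above it
table += [{"id": 3}, {"id": 6}]
second = fetch_new(table, state)  # id 3 is skipped because 3 < 5
```

The late row with id 3 is never returned by any later call, which is exactly the failure mode when the tracked column is not always increasing.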
02-14-2019
03:08 PM
The Hive processors in Apache NiFi work against Apache Hive 1.2.1, not Hive 2.x. Hive 2.x has differences in the Thrift interface from Hive 1.x, so you won't be able to use the NiFi Hive 1.2.x processors against any Hive 2.x instance (Apache or vendor). There may be work someday to support Hive 2.x, but currently only Hive 1 and Hive 3 are supported in NiFi.
02-13-2019
08:08 PM
1 Kudo
For #1 you can use QueryDatabaseTable or GenerateTableFetch -> ExecuteSQL, and set the Maximum Value Column property to your timestamp column. If you schedule the processor to run once a day, it will get all records added since the maximum timestamp observed the last time the processor ran. It doesn't have fidelity based on the timestamp value itself; instead it keeps track of the maximum value it has seen so far, then adds a WHERE clause to the SQL statement to get all rows with a timestamp greater than that maximum. For those rows, it keeps track of the new current maximum value, and so on.

For #2 you'd need a way to know a row was deleted in the source. If you can intercept when the DELETE statement is issued to the source, you could at that time issue a DELETE to the target. Alternatively, if your database has Change Data Capture (CDC) support, you may be able to query the delta tables or something similar. For MySQL we have the CaptureChangeMySQL processor, which reads the binary logs and sends each event downstream in your flow. In that case you'd get an event for the delete, which you can change to a SQL DELETE statement for PutSQL, or (better) use PutDatabaseRecord with the "statement.type" attribute, which you would set to the value of the "cdc.event.type" attribute via an UpdateAttribute processor. The "cdc.event.type" attribute is set by the CaptureChangeMySQL processor.
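The maximum-value tracking described for #1 can be sketched in Python with an in-memory SQLite table. This is an illustration of the idea only, not how the NiFi processors are written; the `events` table, its columns, and `fetch_since` are assumptions made for the example.

```python
import sqlite3

def fetch_since(conn, last_max):
    """Fetch rows newer than last_max and return them with the new maximum."""
    if last_max is None:
        # First run: no maximum stored yet, so take everything
        rows = conn.execute(
            "SELECT id, updated_at FROM events ORDER BY updated_at"
        ).fetchall()
    else:
        # Later runs: only rows past the largest timestamp seen so far
        rows = conn.execute(
            "SELECT id, updated_at FROM events WHERE updated_at > ? ORDER BY updated_at",
            (last_max,),
        ).fetchall()
    new_max = rows[-1][1] if rows else last_max
    return rows, new_max

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, updated_at TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2019-02-12 09:00:00"), (2, "2019-02-12 17:30:00")],
)
day1, max_seen = fetch_since(conn, None)

# A new row shows up before the next scheduled run
conn.execute("INSERT INTO events VALUES (3, '2019-02-13 08:15:00')")
day2, max_seen = fetch_since(conn, max_seen)
```

The second call returns only the row added after the first run, because the stored maximum, not the schedule, decides what counts as new.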
02-11-2019
06:28 PM
In an upcoming release you'll be able to use Hive 1.1 processors, so in your case you'd want to keep what you have (Avro in HDFS) and use PutHive_1_1QL to issue a LOAD DATA or CREATE EXTERNAL TABLE statement so Hive can see your data.
02-11-2019
03:20 PM
1-3: The processors that use a DMC client use the DMC in a very specific manner, so they CRUD cache entries as it applies to their operations. There isn't currently a generic processor that lets you call arbitrary cache API methods; that's what the scripting components are for.
4: We don't have the concept of tables in DMC, only key/value pairs. A table can probably be implemented by namespacing the key, though I'm not sure whether the processors you're using support custom keys.
5: The DMC operation in a cluster is very similar to how it works on a local installation, except there is a DMC server created on each node in the cluster. However, a DMC client still has to choose a single host:port to connect to, and the individual DMC servers are not coordinated across the cluster, meaning if you update one, the others don't get that update; they are fully separate at the moment.
6: AFAIK there is no best practice as far as choosing a DMC server to connect to, other than choosing one on a node that tends to be available most often. You basically get individual, isolated instances to choose from. We have other DMC server implementations that possibly support High Availability and/or Data Durability, such as HBase- or Redis-backed solutions. However, neither of these is included with an Apache NiFi distribution; you'd have to bring your own.
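The key-namespacing idea from item 4 can be sketched as follows. A plain Python dict stands in for the cache here; `NamespacedCache` and the "::" separator are hypothetical and are not part of any DMC client API.

```python
class NamespacedCache:
    """Emulate tables on top of a flat key/value store by prefixing keys."""

    def __init__(self, store, table, sep="::"):
        self.store = store
        self.prefix = table + sep

    def put(self, key, value):
        self.store[self.prefix + key] = value

    def get(self, key, default=None):
        return self.store.get(self.prefix + key, default)

    def keys(self):
        # Only keys that belong to this "table", with the prefix stripped
        return [k[len(self.prefix):] for k in self.store if k.startswith(self.prefix)]

flat = {}  # stand-in for the flat key/value cache
users = NamespacedCache(flat, "users")
orders = NamespacedCache(flat, "orders")
users.put("42", "alice")
orders.put("42", "widget")
```

Both "tables" can hold the key 42 without colliding, since the underlying store only ever sees the prefixed keys.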
01-10-2019
03:28 PM
What does your schema look like? I'm guessing that your "imu_vg_z" field is of some non-nullable type (maybe "bytes"), rather than a nullable type (such as ["null", "bytes"])?
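For reference, the two field definitions being contrasted might look like this, written as Python dicts that mirror the Avro schema JSON. The field name comes from the question, "bytes" is only the guess made above, and `is_nullable` is a hypothetical helper for checking a schema by hand.

```python
def is_nullable(field):
    """True if the Avro field's type is a union that includes "null"."""
    field_type = field["type"]
    return isinstance(field_type, list) and "null" in field_type

# Non-nullable: a null value for this field does not fit the schema
strict_field = {"name": "imu_vg_z", "type": "bytes"}

# Nullable: a union with "null" first, so a null default is allowed
nullable_field = {"name": "imu_vg_z", "type": ["null", "bytes"], "default": None}
```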