Support Questions

Find answers, ask questions, and share your expertise

Can't use GenerateTableFetch with maximum value column timestamp in postgres database

avatar
Contributor

Hey there,

I want to configure a data flow where I load new data as they get saved in a postgres database. For that I want to use the timestamp column in the GenerateTableFetch processor as maximum value column. Sadly the processor is not able to generate the right sql statements.

Unable to execute SQL select query SELECT COUNT(*), MAX(time) time FROM log_table due to ERROR: syntax error at or near "time"

It seems to have a problem with the alias, as would need the 'as' statement in postgres, but it seems to use no alias statement at all and fails because of this. Is there a way to fix that?

Any help would be appreciated.

Best regards,

Markus

1 ACCEPTED SOLUTION

avatar
Master Guru

In this case the problem is that "time" is a reserved word in PostgreSQL, so it needs an explicit "AS" for the alias. If the column were not a reserved word, the processor should work fine (I tested with a timestamp column named "ts" and it works without the AS). I have written NIFI-5419 to handle the improvement of aliases in the DB fetch processors (such as GenerateTableFetch).

A workaround is to use QueryDatabaseTable, you can't distribute the SQL among a cluster, but it won't issue that MAX() query that fails.

View solution in original post

3 REPLIES 3

avatar
Master Guru

In this case the problem is that "time" is a reserved word in PostgreSQL, so it needs an explicit "AS" for the alias. If the column were not a reserved word, the processor should work fine (I tested with a timestamp column named "ts" and it works without the AS). I have written NIFI-5419 to handle the improvement of aliases in the DB fetch processors (such as GenerateTableFetch).

A workaround is to use QueryDatabaseTable, you can't distribute the SQL among a cluster, but it won't issue that MAX() query that fails.

avatar
Contributor

Thanks for your answer. Your mention of using quotes in the ticket gave me the idea of stating the table name as "time" in the processor. With that it seems to work now. I don't know if this leads to other problems though.

Edit: It seems like the where clause doesn't get insert in the statements. I don't know if it needs that though? If I try other kind of columns like numbers it puts a WHERE <column> <= <max_value> in the generated statement.

avatar
Master Guru

Once you put special characters in the table name, column names, etc. then it can cause problems for the fetch processors. Sometimes we get the table/column names from what is provided by the user in the config, and sometimes we get the table/column names from the result set(s). With special characters or other tweaks this can cause a mismatch when the processor tries to determine the last known max value, etc.