About Garyy

Garyy · ‎04-25-2023

@rohit2811 When you have to figure out your own problem, like I did here, it can be painful. Fortunately, I received an email notification for this post, which I had totally forgotten about. What I did was use PutFile to write the CSV flowfiles to the local file system, then call PutSQL to execute your Load Data command to load the local CSV files into your target DB. I think there should be a processor that loads data from flowfiles into the DB directly, but I could not get that to work, and I am not sure whether NiFi has such functionality. If anyone figures out how, please let me know.

Garyy · ‎03-25-2021

@MattWho, by saying my UI session was getting terminated unexpectedly, I meant that I was forced out of my NiFi session some time after I logged in and worked on it, so I had to log in again. I observed it more closely and recorded the behavior: I always got kicked out exactly one hour after my last login. So I searched the configuration folder and found the following in login-identity-providers.xml:

<property name="Authentication Expiration">1 hour</property>

I believe this is the root cause. After updating this value, I should be fine now. Thanks a lot.
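For anyone who wants to check this setting without opening the file by hand, here is a minimal sketch that lists the "Authentication Expiration" value per provider. The sample XML, the provider identifier and the function name are made up for illustration; only the property line itself comes from the post above, and a real login-identity-providers.xml will contain more elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real file has more elements per provider.
SAMPLE = """<loginIdentityProviders>
    <provider>
        <identifier>example-provider</identifier>
        <property name="Authentication Expiration">1 hour</property>
    </provider>
</loginIdentityProviders>"""


def authentication_expirations(xml_text):
    """Map each provider identifier to its Authentication Expiration value."""
    root = ET.fromstring(xml_text)
    result = {}
    for provider in root.iter("provider"):
        name = provider.findtext("identifier", default="(unnamed)")
        for prop in provider.findall("property"):
            if prop.get("name") == "Authentication Expiration":
                result[name] = prop.text
    return result


print(authentication_expirations(SAMPLE))
```

To run it against a real installation, read the file contents from the NiFi conf folder and pass them in as xml_text.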

Garyy · ‎03-22-2021

Thanks a lot @MattWho! Your explanation is very educational. I checked my NiFi settings, and it really is a standalone server: nifi.cluster.is.node=false. So, do you know what else could terminate UI sessions?

Garyy · ‎03-19-2021

@MattWho, do these parameters only apply to clusters? I am using a standalone NiFi server, but my UI sessions frequently get terminated without any message during development. Could something else, such as security settings, also have such an effect? Could you please shed more light on this? Thanks.

Garyy · ‎01-14-2021

Figured out an alternative way. I developed an Oracle PL/SQL function that takes a table name as an argument and produces a series of queries like "SELECT * FROM T1 OFFSET x ROWS FETCH NEXT 10000 ROWS ONLY". The number of queries is based on the number of rows in the table, which is a statistic stored in the catalog table. If the table has 1M rows and I want 100k rows in each batch, it produces 10 queries. I use ExecuteSQLRecord to call this function, which effectively does the job of the NiFi processor GenerateTableFetch. My next processor (e.g. ExecuteSQLRecord again) can now have 10 concurrent tasks working in parallel.
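The batching logic can be sketched outside the database as well. This is only an illustration in Python, not the PL/SQL function itself; the function name and parameters are assumptions, and the row count would come from the catalog statistics mentioned above.

```python
import math


def build_paged_queries(table, row_count, batch_size):
    """Return one OFFSET/FETCH query per batch of rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = math.ceil(row_count / batch_size)
    return [
        f"SELECT * FROM {table} OFFSET {i * batch_size} ROWS "
        f"FETCH NEXT {batch_size} ROWS ONLY"
        for i in range(batches)
    ]


# 1M rows in batches of 100k gives 10 queries, as in the example above.
queries = build_paged_queries("T1", 1_000_000, 100_000)
print(len(queries))
print(queries[0])
```

One thing to keep in mind: without an ORDER BY, Oracle does not guarantee a stable row order between the individual queries, so adding an ordering on a key column is probably safer when the pages are fetched in parallel.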

Garyy · ‎01-13-2021

I use ExecuteSQLRecord to run a query and write the result in CSV format. The table has 10M rows. Although I can split the output into multiple flow files, the query is executed by a single thread and is very slow. Is there a way to partition the query into multiple queries so that the next processor can run multiple concurrent tasks, each one processing one partition? It would be like:

GenerateTableFetch -> ExecuteSQLRecord (with concurrent tasks)

The problem is that GenerateTableFetch only accepts a table name as input. It does not accept customized queries. Please advise if you have a solution. I am new to NiFi, so I would appreciate detailed answers. Thank you in advance.

Garyy · ‎01-13-2021

I figured out a workaround myself and hope it's useful for others. I use the following query to generate another query, which is executed by the next step. The generated query converts Oracle date values to the preferred strings at the global level, so it saves development effort at the column or table level.

SELECT LISTAGG(
         CASE WHEN COLUMN_ID = 1 THEN
           'SELECT ' ||
           CASE WHEN DATA_TYPE IN ('DATE','TIMESTAMP')
                THEN 'TO_CHAR(' || COLUMN_NAME || ',''YYYY-MM-DD HH24:MI:SS'') AS ' || COLUMN_NAME
                ELSE COLUMN_NAME
           END
         ELSE
           CASE WHEN DATA_TYPE IN ('DATE','TIMESTAMP')
                THEN 'TO_CHAR(' || COLUMN_NAME || ',''YYYY-MM-DD HH24:MI:SS'') AS ' || COLUMN_NAME
                ELSE COLUMN_NAME
           END
         END
       , ',') WITHIN GROUP (ORDER BY COLUMN_ID)
       || ' FROM ' || '${db.table.name}' AS MY_RECORD
FROM user_tab_columns
WHERE table_name = '${db.table.name}';
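To make it easier to see what the query above produces, here is a rough Python equivalent of the string it builds. It is only a sketch: the function name and the sample column list are made up, and in the real flow the column metadata comes from user_tab_columns.

```python
# Data types that get wrapped in TO_CHAR, mirroring the IN (...) list above.
DATE_TYPES = {"DATE", "TIMESTAMP"}
ORACLE_FORMAT = "YYYY-MM-DD HH24:MI:SS"


def build_select(table, columns):
    """Build the SELECT text; columns is a list of (name, data_type) in COLUMN_ID order."""
    parts = []
    for name, data_type in columns:
        if data_type in DATE_TYPES:
            parts.append(f"TO_CHAR({name},'{ORACLE_FORMAT}') AS {name}")
        else:
            parts.append(name)
    return "SELECT " + ",".join(parts) + " FROM " + table


# Hypothetical table with one date column.
print(build_select("EMP", [("ID", "NUMBER"), ("HIRED", "DATE")]))
```

For the sample input this gives a statement in which only HIRED is wrapped in TO_CHAR, so the date reaches the CSV writer already as a string and is not shifted to another time zone.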

Garyy · ‎01-07-2021

I tried "java.arg.8=-Duser.timezone=America/New_York". It does not work for me. I posted one question earlier: https://stackoverflow.com/questions/65620632/why-do-executesqlrecord-and-csvrecordsetwriter-updated-the-time-zone-of-datetime

Garyy · ‎01-07-2021

Hello! I am new to NiFi. I hope someone here can advise me about my problem with time zones. I have these processors:

ListDatabaseTables -> GenerateTableFetch -> ExecuteSQLRecord (writing to a CSV file with CSVRecordSetWriter) -> ... ... PutSQL (loading the CSV file into MySQL using the Load Data command)

The source DB is Oracle. CSVRecordSetWriter has the following properties:

Schema Write Strategy -> Do Not Write Schema
Schema Access Strategy -> Inherit Record Schema
Schema Name -> ${schema.name}
Schema Text -> ${avro.schema}
Date Format -> yyyy-MM-dd
Time Format -> HH:mm:ss
Timestamp Format -> yyyy-MM-dd HH:mm:ss

My source DB and the target DB are both in the US Eastern time zone. However, I noticed that the output of ExecuteSQLRecord has its time values converted to UTC (5 hours added). That results in wrong time values in the target DB. There may be ways to convert each date/time column individually, but that would require a huge amount of development effort. Is there a way to handle this issue properly at the global level, or at least at the table level? Please note that the Time Format needs to be acceptable to MySQL Load Data.

Thank you in advance!
Gary
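The 5-hour difference described above can be reproduced in a few lines of Python. This is only an illustration of the shift, not of what NiFi does internally; the sample timestamp is made up, and a fixed UTC-5 offset is assumed (US Eastern in winter, no daylight saving handling).

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-5 offset, i.e. US Eastern standard time.
EASTERN_STANDARD = timezone(timedelta(hours=-5), "EST")

# Hypothetical value as stored in the source DB (local wall-clock time).
local_value = datetime(2021, 1, 7, 9, 30, 0, tzinfo=EASTERN_STANDARD)
as_utc = local_value.astimezone(timezone.utc)

fmt = "%Y-%m-%d %H:%M:%S"
print(local_value.strftime(fmt))  # 2021-01-07 09:30:00
print(as_utc.strftime(fmt))       # 2021-01-07 14:30:00
```

If the second string is what ends up in the CSV file, Load Data stores 14:30 instead of 09:30 in the target DB, which matches the symptom.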

Member Since	‎01-07-2021 12:12 PM
Last Visited	‎04-25-2023 02:11 PM
Posts	9
Kudos received	1
