About cotopaul

cotopaul · ‎09-25-2023

@LKB, you could try encapsulating your column names in single or double quotes or backticks in NiFi, instead of using the square brackets. Something like:

select `06 QQQ`, `11 JJJ`, `12 KKK`, `13 JJJ` from DB_TEST.dbo.FGRID

or like this:

select "06 QQQ", "11 JJJ", "12 KKK", "13 JJJ" from DB_TEST.dbo.FGRID

In SQL, column names and aliases should not start with numeric digits or contain spaces unless they are enclosed in backticks (``) or double quotes (") depending on the database system. Using DBeaver, those backticks are replaced by double quotes.
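As a rough illustration of the quoting styles mentioned above, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in (an assumption, since the original database is not SQLite), and the table contents are made up. Which quoting style your own database accepts depends on the database system and its settings.

```python
import sqlite3


def fetch_quoted_columns():
    """Query columns whose names start with digits and contain spaces."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in table; only two of the columns are recreated here.
    conn.execute('CREATE TABLE FGRID ("06 QQQ" TEXT, "11 JJJ" TEXT)')
    conn.execute("INSERT INTO FGRID VALUES ('a', 'b')")
    # Double quotes are the standard SQL way to delimit an identifier.
    with_double_quotes = conn.execute(
        'SELECT "06 QQQ", "11 JJJ" FROM FGRID'
    ).fetchall()
    # SQLite also accepts backticks for compatibility with MySQL.
    with_backticks = conn.execute(
        "SELECT `06 QQQ`, `11 JJJ` FROM FGRID"
    ).fetchall()
    conn.close()
    return with_double_quotes, with_backticks


double_quoted_rows, backtick_rows = fetch_quoted_columns()
print(double_quoted_rows, backtick_rows)
```

Both queries return the same row here, which is the behaviour you would hope to reproduce from NiFi once the quoting matches what your database expects.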

cotopaul · ‎09-20-2023

@need_help, try replacing MergeContent with MergeRecord. I assume that each error log gets generated in a single flow file. Using MergeRecord you could achieve something similar, but you will need to create two Controller Services: one for CSV reading and one for CSV writing, both of them using Inherit Schema. Next, you can group as many records as you would like and send them to your PutEmail processor. This is how I have used it so far and it works pretty well for my use case.

cotopaul · ‎09-11-2023

@JohnSilver, first of all, I recommend setting your processor to DEBUG, as it might provide you with much more information than what you are seeing right now. In addition, have a look in nifi-app.log, as you might find something there as well. Next, I do not know how Kudu is configured on your side, but in every project I was involved in, it required authentication: the Kerberos properties, which seem to be blank on your side. Even though you might have a basic install of Impala/Kudu, as far as I know, it still requires some sort of authentication.

cotopaul · ‎09-11-2023

@BillyG, set the property "Use Avro Logical Types" to true if you would like to use anything else besides STRINGS :).

cotopaul · ‎08-31-2023

@code_mnkey, what you are trying to achieve is not directly doable with a single command. You would need to build a script which will execute your unzip command, list all the files which have been extracted, create an attribute in your new flowfile and send it down the stream for further processing. The language in which you build the script is mostly up to you, as long as you have everything installed on your NiFi machine. Another way, besides ExecuteStreamCommand, is to use ExecuteScript and build a script integrated with NiFi's logic. A very good example of how you can achieve most of your expected actions can be found here: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922 Make sure that you read all three parts to fully understand how to generate flowfiles, compose attributes and send the newly generated flowfiles down the stream.
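To give an idea of the "list the files" part of such a script, here is a minimal standalone Python sketch. It uses the standard zipfile module instead of calling unzip, builds a small demo archive in memory, and prints the entry names as JSON. The function names and the "zip.entries" key are hypothetical, and wiring the output into a flowfile attribute is left out because it depends on which processor you choose.

```python
import io
import json
import zipfile


def list_zip_entries(zip_source):
    """Return the names of the files inside a zip archive, skipping directories."""
    with zipfile.ZipFile(zip_source) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


def build_demo_zip():
    """Create a small in-memory archive so the sketch runs without real files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("logs/app.log", "first file")
        archive.writestr("data.csv", "a,b\n1,2\n")
    buffer.seek(0)
    return buffer


entries = list_zip_entries(build_demo_zip())
# "zip.entries" is a hypothetical key; a real script would decide how to expose this.
print(json.dumps({"zip.entries": entries}))
```

A script called from ExecuteStreamCommand could print something like this to standard output, while an ExecuteScript version would set the attribute through the session as described in the cookbook linked above.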

cotopaul · ‎08-29-2023

@MukaAddA, I see that you are linking two SUCCESS queues to your PutFile. I assume your PutFile will take the items from both queues in a random order. Try removing one of the success queues and see what happens. Besides that, check how you configured PutFile to handle files with the same name, especially if you are using the name further on in your processing. In addition, set your processor to DEBUG and see what it displays; maybe you will get a hint from there regarding your problem.

cotopaul · ‎08-28-2023

@JohnnyRocks, as @steven-matison said, you should avoid chaining so many ReplaceText processors. I am not quite sure I understood your flow exactly, but something tells me that before reaching ReplaceText, something is not properly configured in your NiFi flow. First of all, when using the classic Java date format, "MM" will always produce a two-digit month, meaning that months 1 to 9 will automatically be padded with a leading zero. "dd" does the same for days. As I see in your post, you said that your CSV reader is configured to read the data as MM/dd/yy, which should be fine, but something is missing here: how do you reach the format of dd/MM/yyyy? What I would personally try to do is convert all those date values to the same format. So instead of all those ReplaceText processors, I would try to insert an UpdateRecord processor, where I would define my RecordReader and my RecordWriter with the desired schemas (make sure that your column is of type int with logical type date). Next, in that processor, I would change the Replacement Value Strategy to "Record Path Value", press + and add a new property. I would call it "/Launch_Date" (pay attention to the leading slash) and I would assign it the value " format( /Launch_Date, "dd/MM/yyyy", "Europe/Bucharest") " (or any other timezone you require; if you require your data in UTC, just remove the comma and the timezone).
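To make the intended conversion concrete, here is a small Python sketch of the same idea outside NiFi: read a date written as MM/dd/yy and write it back as dd/MM/yyyy. The function name and sample values are made up, and Python's %m, %d, %y and %Y are used as rough equivalents of the Java patterns MM, dd, yy and yyyy. Timezone handling is not covered.

```python
from datetime import datetime


def reformat_launch_date(value):
    """Parse a MM/dd/yy style date and return it as dd/MM/yyyy."""
    parsed = datetime.strptime(value, "%m/%d/%y")
    # %d and %m always emit two digits, so 3/7/23 gains its leading zeros.
    return parsed.strftime("%d/%m/%Y")


print(reformat_launch_date("3/7/23"))    # 07/03/2023
print(reformat_launch_date("12/25/23"))  # 25/12/2023
```

One conversion step applied to every record gives a single consistent output format, which is what the UpdateRecord approach above aims for.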

cotopaul · ‎08-28-2023

Well, I am not near a PC to test right now, but my initial thought is that the problem is related to how your raw data is coming into your flow. As I can see, you have both an INT value and a FLOAT value, and not a constant data type: "stake" : 0, "stake" : 0.5, Now, you set your Schema Access Strategy to Inherit Record Schema. This is correct in most cases, but in your case it is not, because your data is not stable. If two files are going into your MergeRecord and one has the value 0 and one has the value 0.5, you will have two different schemas, meaning that the files cannot be merged accordingly. If the first file comes as an INT, your second flowfile (or all the others coming right after) will automatically be converted to an INT value, no matter their value. To avoid this, you will have to change your RecordReader and your RecordWriter from Inherit Record Schema to "Use Schema Text Property" and define your schema manually in the new field (which will appear upon the switch). Make sure that in your schema that field is defined with a data type which accepts fractional data and not just an int value.
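The short Python sketch below illustrates why the two values end up with different types when the schema is inferred, and what an explicit schema with a fractional type could look like. It is only an illustration: the record name "bet" is hypothetical, your real schema will contain your other fields as well, and the printed JSON is meant as a starting point for the schema text, not a finished schema.

```python
import json


def inferred_type(raw_json, field):
    """Return the type a plain JSON parser assigns to a field."""
    return type(json.loads(raw_json)[field]).__name__


first = inferred_type('{"stake": 0}', "stake")
second = inferred_type('{"stake": 0.5}', "stake")
print(first, second)  # int float

# Avro-style schema that declares the field as a double up front.
explicit_schema = {
    "type": "record",
    "name": "bet",  # hypothetical record name
    "fields": [{"name": "stake", "type": "double"}],
}
schema_text = json.dumps(explicit_schema, indent=2)
print(schema_text)
```

With the type fixed as double, both 0 and 0.5 fit the same schema, so the merge no longer depends on which file arrives first.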

cotopaul · ‎08-28-2023

Well, first of all, what does the data look like before entering MergeRecord? Secondly, how did you configure both the Reader and the Writer? You pasted the configuration for MergeRecord, but that has nothing to do with how the data gets transformed.

cotopaul · ‎08-28-2023

@dulanga, as far as I can tell from your previous post, you have around 3GB of RAM available on your NiFi node, but you are assigning much more to your JVM. So, you have:

        total   used    free    shared  buff/cache  available
Mem:    3.8Gi   1.5Gi   2.1Gi   145Mi   269Mi       2.1Gi
Swap:   511Mi   511Mi   0B

But you are assigning much more to your JVM:

# JVM memory settings
java.arg.2=-Xms4096m
java.arg.3=-Xmx8192m

Try correcting your config and assigning values that fit your available memory to your JVM, in the bootstrap.conf file. Here are some best practices: https://community.cloudera.com/t5/Community-Articles/HDF-CFM-NIFI-Best-practices-for-setting-up-a-high/ta-p/244999
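As a quick sanity check of the numbers above, here is a small Python sketch that converts the heap flags to MiB and compares them with the total RAM reported for the node. The helper name is hypothetical, and the 3.8 GiB figure is simply copied from the output above. It only shows the arithmetic and does not read your actual configuration.

```python
import re

# Conversion factors from the JVM size suffix to MiB.
UNIT_TO_MIB = {"k": 1 / 1024, "m": 1, "g": 1024}


def heap_setting_mib(jvm_arg):
    """Convert a -Xms/-Xmx flag such as '-Xmx8192m' into MiB."""
    match = re.fullmatch(r"-Xm[sx](\d+)([kKmMgG])", jvm_arg.strip())
    if match is None:
        raise ValueError(f"unsupported heap flag: {jvm_arg!r}")
    amount, unit = match.groups()
    return int(amount) * UNIT_TO_MIB[unit.lower()]


total_ram_mib = 3.8 * 1024  # total memory from the output above
min_heap_mib = heap_setting_mib("-Xms4096m")
max_heap_mib = heap_setting_mib("-Xmx8192m")
heap_fits = max_heap_mib < total_ram_mib

print(min_heap_mib, max_heap_mib, total_ram_mib, heap_fits)
```

Even the initial heap (-Xms) is larger than the machine's total memory here, so both values need to come down, leaving room for the operating system and anything else running on the node.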

Last Visited	‎03-14-2024 06:37 AM

Member Since	‎01-27-2023 08:25 AM
Posts	229
Kudos received	73

Cloudera Community

Re: About mergecontent question

Re: how can get the content of Json record and val...

Re: DBCP Connection Pool can't connect to "Progres...

Re: terminate kafka connection if publish kafka pr...

Re: Not able to delete an inifinite loop built wit...

Re: How do I reference column names with spaces in...

Re: MergeContent Issue Nifi Apache

Re: PutKudu processor on NiFi - KUDU generic error

Re: ExecuteSQL Processor with Parquet record write...

Re: ExecuteStreamCommand to call 'unzip -l'

Re: Object in the flow stuck between PutFile and E...

Re: Multiple ReplaceText Processors

Re: NiFi MergeRecord change number double- int

Re: NiFi MergeRecord change number double- int

Re: NIFI won't start throws java.util.concurrent.T...