About memad

EtmanY · ‎05-07-2023

As per the OP's response, if the data isn't well distributed along the partitioned column, you will end up with some very large partitions while others are very small. Writing into a single large partition can cause Kudu to fail. If your partitioned column is skewed, aim to redesign your table partitioning. Final note: per Kudu's documentation (Apache Kudu - Apache Kudu Schema Design), "Typically the primary key columns are used as the columns to hash, but as with range partitioning, any subset of the primary key columns can be used."
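A rough way to see whether a candidate column is skewed is to count how many rows would land in each bucket. The sketch below is illustrative only: `bucket_counts` and `skew_ratio` are hypothetical helpers, and the CRC32-modulo bucketing is a stand-in, not Kudu's actual hash function.

```python
import zlib
from collections import Counter


def bucket_counts(keys, num_buckets):
    """Count how many keys fall into each bucket under a stand-in hash."""
    counts = Counter({bucket: 0 for bucket in range(num_buckets)})
    for key in keys:
        # CRC32 is only a placeholder for whatever hash the storage engine uses.
        bucket = zlib.crc32(str(key).encode("utf-8")) % num_buckets
        counts[bucket] += 1
    return dict(counts)


def skew_ratio(counts):
    """Largest bucket divided by the mean bucket size (1.0 means perfectly even)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mean = total / len(counts)
    return max(counts.values()) / mean
```

A ratio far above 1.0 on a sample of real key values suggests the column is a poor choice to partition on by itself.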

MattWho · ‎02-27-2023

@memad ListSFTP does not actually fetch the content of any files from the target SFTP server. You would need a FetchSFTP processor after ListSFTP to do that.

The ListSFTP processor produces FlowFile(s) containing metadata about the listed files on the SFTP server. The downstream FetchSFTP processor then uses this metadata to retrieve the actual content for each FlowFile. The documentation for the ListSFTP processor covers the attributes that are written to the FlowFile(s): https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.19.1/org.apache.nifi.processors.standard.ListSFTP/index.html

This metadata is present on the FlowFile as FlowFile attributes, and you can manipulate it however you like.

If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you, Matt
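The list-then-fetch split can be modelled with a small local-filesystem analogy. This is only a sketch and not NiFi code: `list_files` and `fetch_file` are hypothetical names, a local directory stands in for the SFTP server, and the attribute keys are assumptions modelled on the attributes the linked documentation describes.

```python
import os


def list_files(directory):
    """Listing step: return metadata only, never the file content."""
    listing = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if os.path.isfile(full_path):
            listing.append({
                "path": directory,
                "filename": name,
                "file.size": os.path.getsize(full_path),
            })
    return listing


def fetch_file(attributes):
    """Fetch step: use the listed metadata to read the actual content."""
    target = os.path.join(attributes["path"], attributes["filename"])
    with open(target, "rb") as handle:
        return handle.read()
```

Each dictionary plays the role of one FlowFile's attributes: it can be filtered or rewritten before the fetch step runs, and no content is read until then.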

justenji · ‎02-17-2021

@memad Have a look here at the processor:

To create or check cron syntax, look here: https://www.freeformatter.com/cron-expression-generator-quartz.html

Hope this helps! Bye
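For a twice-a-day schedule, a Quartz cron expression lists both hours in the hours field. Quartz expressions are ordered seconds, minutes, hours, day-of-month, month, day-of-week. The helper below is a hypothetical sketch for building such a string, and the hours used in the example are arbitrary choices.

```python
def twice_daily_quartz_cron(first_hour, second_hour, minute=0):
    """Build a Quartz cron string that fires at two hours every day.

    Field order: seconds minutes hours day-of-month month day-of-week.
    """
    for hour in (first_hour, second_hour):
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
    if not 0 <= minute <= 59:
        raise ValueError(f"minute out of range: {minute}")
    hours = ",".join(str(hour) for hour in sorted({first_hour, second_hour}))
    # '?' means "no specific value" for day-of-week in Quartz syntax.
    return f"0 {minute} {hours} * * ?"
```

For example, `twice_daily_quartz_cron(6, 18)` returns `0 0 6,18 * * ?`, which fires at 06:00 and 18:00 every day.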

MattWho · ‎04-15-2020

@memad If your GetFile processor is consuming files before they have finished writing, there are a few changes that may help:

1. How are files being written into the directory? The default "File Filter" will ignore files that start with a ".". If it is possible to change how files are written to the directory, a file filter will solve your issue. For example, write new files to the directory as ".<filename>" and, upon successful write, rename them to remove the dot (this is how ssh works). You can of course set up any file filter that works for you.

2. Assuming the process that writes files to the directory always updates the timestamp on the file, you can use the "Minimum File Age" property to prevent GetFile from consuming a file until its last modified timestamp has not changed for the configured amount of time. This works in most cases, except when there are long pauses in the write process that exceed the configured Minimum File Age.

Hope this helps, Matt
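Both suggestions above can be sketched from the writer's side. This is a minimal illustration, not NiFi code: `write_then_reveal` and `is_old_enough` are hypothetical helpers, and the atomic rename assumes the hidden and final names are on the same filesystem.

```python
import os
import time


def write_then_reveal(directory, filename, data):
    """Write to a dot-prefixed name, then rename once the write is complete."""
    hidden_path = os.path.join(directory, "." + filename)
    final_path = os.path.join(directory, filename)
    with open(hidden_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the bytes are on disk first
    # The consumer only ever sees the final name, with complete content.
    os.replace(hidden_path, final_path)
    return final_path


def is_old_enough(path, min_age_seconds, now=None):
    """Mimic a minimum-age check: True once mtime is at least min_age_seconds old."""
    current = time.time() if now is None else now
    return current - os.path.getmtime(path) >= min_age_seconds
```

The rename approach does not depend on timing, so it avoids the long-pause problem that the age-based check has.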

memad · ‎11-25-2019

I get errors if the input contains duplicate records.

Last Visited	‎05-11-2023 12:07 PM

Member Since	‎11-20-2019 03:06 AM
Posts	6

Cloudera Community

Re: HELP PLEASE!!!, Failed to write at least 1000 ...

Re: Nifi ListSFTP cash duration

Re: schedule processor twice a day nifi

Re: NIFI getFile processor read file corrupted

Re: Upserts using Nifi ( updates + Inserts)