Member since 02-07-2019
1948 Posts
129 Kudos Received
26 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 259 | 02-01-2024 10:51 PM |
| | 2188 | 01-22-2024 08:42 PM |
| | 872 | 10-18-2023 10:07 PM |
| | 1287 | 07-24-2023 10:27 PM |
| | 2305 | 05-08-2023 12:28 AM |
04-18-2024
10:11 PM
@nagababu, Welcome to our community! To help you get the best possible answer, I have tagged in our Spark experts @RangaReddy who may be able to assist you further. Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
04-18-2024
01:30 PM
@s198 I think step one would be looking more closely into the failures. Are the failures always with the rename of the dot file? PutSFTP writes to a dot file (hidden file) and then, upon write completion, renames the file from .xyz to xyz. You also never shared your complete PutSFTP processor configuration.

1. Did you inspect the SFTP server log for any logging related to the failures you encountered?
2. What is being done with the files once placed on the SFTP server? Is there some other process consuming them from there?
3. Any chance that other process is consuming the dot files (hidden files) before NiFi has a chance to rename them?
4. Do any of the queued FlowFiles have the same "filename" attribute as another FlowFile or a file already present on the target SFTP server? (This is a common issue: a file of the same name still exists on the target when the other is written as a dot file, so the rename fails. Then, by the time of the retry, some process has consumed the duplicate and the rename of the new file succeeds.)

As far as options 3 and 4 go, both introduce some latency in your dataflow. With (3), the processor only gets scheduled once every 30 seconds, so FlowFiles will queue up for up to 30 seconds between executions. The PutSFTP processor has a batch setting for how many FlowFiles will get processed in a single execution. If more FlowFiles are queued than that batch setting, the extras will sit until the next time the processor is scheduled. My concern is that the latency introduced by options 3 and 4 may simply be masking the actual issue needing to be addressed.

With (4), the processor gets scheduled as fast as possible, but when it executes, the thread remains active for 500 ms, working on as many FlowFiles as possible in that single execution. At 500 ms it closes out that thread, and (assuming a run schedule of 0) the processor would immediately be scheduled again. As far as which is better, it is about getting the best throughput with the least amount of latency. Data volumes, sizes, etc. come into play here.

I typically favor option 4 myself, but option 3 may still work for you with a much lower run schedule (30 secs is a lot of latency for a continuous flow).

Please help our community grow. If you found that any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to log in and click "Accept as Solution" on one or more of them that helped.

Thank you, Matt
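As a rough illustration of the write-to-dot-file-then-rename behavior described in the post above, here is a minimal local-filesystem sketch in Python. It is not NiFi or SFTP code: the function name `put_with_dot_rename`, the `before_rename` hook, and the returned messages are all hypothetical, and the "target already exists" check is an assumption that stands in for however the real SFTP server reports a conflicting file.

```python
import os
import tempfile


def put_with_dot_rename(directory, filename, data, before_rename=None):
    """Write data to a hidden ".<filename>" file, then rename it to <filename>.

    Hypothetical helper for illustration only; it mimics the two-step
    transfer on a local directory and returns a short status string.
    """
    temp_path = os.path.join(directory, "." + filename)
    final_path = os.path.join(directory, filename)

    with open(temp_path, "wb") as handle:
        handle.write(data)

    # Optional hook that simulates another process acting between the
    # write and the rename (for example, a consumer picking up dot files).
    if before_rename is not None:
        before_rename(temp_path)

    if not os.path.exists(temp_path):
        return "rename failed: dot file is gone"
    if os.path.exists(final_path):
        # Assumption: a file with the same name on the target blocks the rename.
        return "rename failed: target already exists"

    os.rename(temp_path, final_path)
    return "ok"


def demo():
    """Run the three scenarios raised in the post and collect the outcomes."""
    with tempfile.TemporaryDirectory() as directory:
        return [
            # Normal case: nothing interferes.
            put_with_dot_rename(directory, "a.csv", b"first"),
            # Duplicate "filename": the earlier a.csv has not been consumed yet.
            put_with_dot_rename(directory, "a.csv", b"second"),
            # Another process removes the dot file before the rename.
            put_with_dot_rename(directory, "b.csv", b"third", before_rename=os.remove),
        ]
```

Calling `demo()` walks through a clean transfer, a duplicate filename, and a dot file consumed mid-transfer, which are the situations questions 3 and 4 ask about.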
04-18-2024
06:44 AM
@double_w Can you share some details on which specific components you are using that appear to lose state after the upgrade? The upgrade from 1.13.2 to 1.25.0 is a large leap. Did you test out your dataflow after upgrading to 1.25.0? Was state still working correctly before you then migrated to 2.0.0-M3? I am unaware of any way to migrate local state to ZooKeeper. While I do not have an answer for you here, the more details you share, the more I can look into it as I have time. Thanks, Matt
04-17-2024
07:16 AM
1 Kudo
https://community.cloudera.com/t5/Support-Questions/ExecuteScript-error-ECMAScript-is-missing/m-p/346336
Nashorn was removed in JDK 15, and NiFi 2 uses JDK 21. Keep an old JDK 8 to run.
04-17-2024
02:11 AM
1 Kudo
@gowthamsanjam, Welcome to our community! To help you get the best possible answer, I have tagged in our CDP experts @Rajat_710 @upadhyayk04 @utrivedi who may be able to assist you further. Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
04-16-2024
08:08 PM
The password is correct, but the same problem occurs.
04-15-2024
01:08 AM
1 Kudo
Hi @soumM, can you please check whether both cluster nodes are listed in the /etc/hosts file on each node? We need the full error stack to debug this.
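One way to run the /etc/hosts check suggested above is sketched below in Python. The function name `missing_hosts` and the node names in the usage note are hypothetical placeholders; the parsing assumes the usual hosts-file layout of an IP address followed by one or more names, with `#` starting a comment.

```python
def missing_hosts(hosts_text, required_names):
    """Return the names from required_names that have no entry in hosts_text.

    hosts_text is the content of a hosts file; required_names is a list of
    hostnames (hypothetical cluster node names in the usage note below).
    """
    known = set()
    for line in hosts_text.splitlines():
        # Drop comments, then split into the IP address and its names.
        fields = line.split("#", 1)[0].split()
        # The first field is the IP address; the rest are hostnames/aliases.
        known.update(fields[1:])
    return [name for name in required_names if name not in known]
```

On each node, something like `missing_hosts(open("/etc/hosts").read(), ["node1.example.com", "node2.example.com"])` (with the real node names substituted) should return an empty list if both nodes are present.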
04-14-2024
11:54 PM
1 Kudo
Hello @RamaClouder, I hope you have synced with the Cloudera Sales Team [0] and have no further questions. To sum up the post, Cloudera offers unified governance within the platform on public, private, and hybrid cloud setups. Link [1] has relevant docs and videos for further understanding. If you have any further questions, feel free to post in our community and we shall get back to you accordingly. - Smarak
[0] https://www.cloudera.com/contact-sales.html
[1] https://www.cloudera.com/products/cloudera-data-platform/sdx.html
04-14-2024
10:58 PM
@Richardxu18, as this is an older article, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this article as a reference in your new post.
04-11-2024
09:58 PM
1 Kudo
@upadhyayk04 Did the response assist in resolving your query? If it did, kindly mark the relevant reply as the solution, as it will aid others in locating the answer more easily in the future. However, if you still have concerns, could you please provide the information that @upadhyayk04 has requested?