Member since: 01-27-2023
Posts: 126
Kudos Received: 31
Solutions: 23
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 64 | 05-31-2023 03:01 AM |
| | 151 | 05-22-2023 06:55 AM |
| | 154 | 05-15-2023 05:33 AM |
| | 300 | 05-10-2023 01:57 AM |
| | 100 | 05-09-2023 11:40 PM |
05-04-2023 06:30 AM
You do not install the Cloudera version on your laptop 🙂 You need Cloudera DataFlow for Public Cloud (CDF-PC), meaning that we are talking here about a license and some services. As @steven-matison has already provided the perfect answer to your question, he might also be in the position to further assist you with everything you need to know about Cloudera DataFlow and the Public Cloud. Unfortunately, I am still learning about what Cloudera offers and how, so I am not the best one to answer your question. If you are going to use NiFi for some real data processing, I strongly recommend you have a look at Cloudera DataFlow, as this will solve many issues and headaches 🙂
05-04-2023 12:57 AM
@ushasri, What do you mean by a licensed version of NiFi? In all my experience with NiFi I have never heard of a licensed version, as this is an open-source tool. Are you perhaps talking about the Cloudera version, which is somewhat different because it is part of the Cloudera ecosystem? As for the export: select your entire canvas (or a process group, or a group of multiple processors), create a template out of it, go to NiFi's Menu > Templates and download the newly created template. Afterwards, you can import that template into your new NiFi instance and start playing with it.
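The same export can also be scripted against the NiFi 1.x REST API, which exposes a template download endpoint. A minimal sketch in Python, assuming an unsecured instance and a known template id (the helper names and localhost URL here are my own, for illustration):

```python
import urllib.request


def template_download_url(base_url: str, template_id: str) -> str:
    """Build the NiFi 1.x REST endpoint that returns a template as XML."""
    return f"{base_url.rstrip('/')}/nifi-api/templates/{template_id}/download"


def download_template(base_url: str, template_id: str, dest_path: str) -> None:
    """Fetch the template XML and save it locally (no TLS/auth handling here)."""
    url = template_download_url(base_url, template_id)
    with urllib.request.urlopen(url) as resp, open(dest_path, "wb") as out:
        out.write(resp.read())
```

The saved XML is the same file you would get from the Templates menu, so it can be uploaded to the new instance through the UI or the API.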
05-04-2023 12:44 AM
Assuming that you are running on Linux, you need to find your operating system logs. In most Linux distributions, those logs are found in /var/log. Now, every sysadmin configures each server according to your company's rules and requirements, so I suggest you speak with the team responsible for the Linux server and ask them to provide you with the logs. In those logs, you might find out why you are receiving that error in the first place. Unfortunately, this problem is not really related to NiFi, but to your infrastructure or to how somebody uses your NiFi instance. Somebody is doing something and you need to find out who and what 😞
05-04-2023 12:33 AM
1 Kudo
@danielhg1285, While the solution provided by @SAMSAL seems better for you and more production-ready, you could also try the following. This might work if you are using a stable statement all the time and if you are not restricted to seeing the exact INSERT statement, but rather the values it tried to insert.
- Shortly after RetryFlowFile, you can add an AttributesToJSON processor and manually define all the columns you want to insert in the Attributes List property. Make sure that you use the attribute names from your FlowFile (sql.args.N.value) in the correct order, and that you set Destination = flowfile-content. This way you will generate a JSON file with all the columns and all the values you tried to insert but failed.
- After AttributesToJSON, you can keep your PutFile to save the file locally on your machine, so you can open it whenever and wherever you want 🙂
PS: This is maybe not the best solution, for the following reasons, but it will get you started on your track:
- You will need to know how many columns you have to insert, and each time a new column is added you will have to modify your AttributesToJSON processor.
- You will not get the exact SQL INSERT/UPDATE statement, but a JSON file containing the column-value pairs, which can easily be analyzed by anybody.
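To make the mapping concrete, here is a small Python sketch of what that AttributesToJSON step effectively produces: the positional sql.args.N.value attributes are matched against an ordered column list you maintain yourself (the function name and sample columns are illustrative, not from the original post):

```python
import json


def sql_args_to_json(attributes: dict, columns: list) -> str:
    """Map positional sql.args.N.value FlowFile attributes onto an ordered
    column list -- mimicking the JSON that AttributesToJSON would write
    into the FlowFile content with Destination = flowfile-content."""
    record = {
        col: attributes.get(f"sql.args.{i}.value")
        for i, col in enumerate(columns, start=1)
    }
    return json.dumps(record)
```

Note the caveat from the post: the column list is maintained by hand, so adding a column to the table means updating it here too.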
05-03-2023 10:08 AM
@srv1009, In my overall experience with NiFi, the problem you have reported only happens for one of the following two reasons:
- as you already mentioned, an abrupt NiFi shutdown, which of course might corrupt your file. This can be solved easily: you stop doing such shutdowns 🙂
- the physical disk on which NiFi is located gets corrupted, and it implicitly starts corrupting other files as well. This has an easy solution as well: you test your hardware's health and, if necessary, you replace it.
Nevertheless, the reason why this happens is present either in the NiFi logs or in the server logs 🙂
04-28-2023 06:56 AM
Hi @Amit_barnwal, First of all, why are you using java.arg.2 and java.arg.3 with such a big difference between them? Remember that Xms is the initial memory allocation and Xmx the maximum memory allocation, and both refer to the HEAP memory. In addition, the minimum recommended size for your heap is 4GB. Have a look at the following article, which provides everything you need for configuring a NiFi cluster: https://community.cloudera.com/t5/Community-Articles/HDF-CFM-NIFI-Best-practices-for-setting-up-a-high/ta-p/244999 Now, regarding the fact that your NiFi instance is eating a lot of RAM: you need to know that most processors fall into two categories, RAM-eating processors and CPU-eating processors. If your workflow contains many RAM-eating processors, it is normal to consume a lot of the available RAM. PS: it is not really recommended to assign so much memory to your heap 🙂
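For reference, those two arguments live in NiFi's conf/bootstrap.conf. A common starting point is to set Xms and Xmx to the same value so the heap does not grow and shrink at runtime; the 4g figure below is only an illustration in line with the recommended minimum, not a sizing recommendation for your workload:

```
# conf/bootstrap.conf -- JVM heap settings for NiFi
# Xms (initial) and Xmx (maximum) set to the same value
java.arg.2=-Xms4g
java.arg.3=-Xmx4g
```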
04-28-2023 02:21 AM
@Vas_R, How would you like to encrypt that data, given that you want to do this on each column and each row? Do you have something in mind, or are you looking for something built into NiFi directly? If you want something directly in NiFi, I do not think you will find anything that specific. You could try exporting your relevant data into attributes and using something like the CryptographicHashAttribute processor to apply a hash algorithm to those attributes. Next, you can use an AttributesToCSV/AttributesToJSON processor and generate a new FlowFile with the hashed data. If CSV or JSON is not the best format for you, you can add an extra ConvertRecord and transform your data into whatever format you want. But be careful, as this solution will require many resources if you are playing with large amounts of data. Another solution would be to find an encryption algorithm and implement it in a script. Add an ExecuteStreamCommand processor, which will read the AVRO file, perform the encryption and write out the newly generated AVRO file.
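Conceptually, what CryptographicHashAttribute does to each attribute is a one-way hash. A minimal Python sketch of that per-field transformation (my own illustrative function, operating on a plain dict rather than real AVRO records; note that hashing is irreversible, so this is masking rather than encryption you can decrypt later):

```python
import hashlib


def hash_fields(record: dict) -> dict:
    """Return a copy of the record with every value replaced by its SHA-256
    hex digest -- the same kind of one-way transformation that
    CryptographicHashAttribute applies to FlowFile attributes."""
    return {
        key: hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        for key, value in record.items()
    }
```

If you need to recover the original values later, you would instead implement a proper reversible cipher inside the ExecuteStreamCommand script mentioned above.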
04-28-2023 12:02 AM
@luv4diamonds On my instances where I run the embedded ZooKeeper, I use port 2182 instead of 2181. Maybe you can change it like that and test. In addition, I assume that you have the myid file generated on every node, right? (https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.5.1/nifi-state-management/content/embedded_zookeeper.html)
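For context, the embedded ZooKeeper setup needs a server.N entry per node in conf/zookeeper.properties, plus a myid file on each node containing only that node's number. A sketch with hypothetical hostnames (replace with your own cluster's):

```
# conf/zookeeper.properties -- one server.N line per cluster node
server.1=nifi-node1.example.com:2888:3888
server.2=nifi-node2.example.com:2888:3888
server.3=nifi-node3.example.com:2888:3888

# On each node, ./state/zookeeper/myid must contain only that node's number.
# For example, on nifi-node1:
#   echo 1 > ./state/zookeeper/myid
```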
04-27-2023 10:21 AM
@AntonBV, You could give it a try with PartitionRecord, which will place the result of a RecordPath directly into a FlowFile attribute. I am already using it on AVRO data, so it should work for you as well. PartitionRecord: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.20.0/org.apache.nifi.processors.standard.PartitionRecord/index.html Record Path: https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
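To illustrate the behavior, here is a small Python sketch of what PartitionRecord does conceptually with a RecordPath like /country: records are grouped by the evaluated value, and each group would become its own FlowFile with that value copied into an attribute (the function and field names are illustrative, and plain dicts stand in for AVRO records):

```python
from collections import defaultdict


def partition_records(records: list, field: str) -> dict:
    """Group records by the value of one field -- conceptually what
    PartitionRecord does: each resulting group becomes a separate FlowFile,
    and the grouping value lands in a FlowFile attribute."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record.get(field)].append(record)
    return dict(partitions)
```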
04-27-2023 01:14 AM
Add an UpdateAttribute processor in front of PutHDFS and use the NiFi Expression Language to rename your file from ${filename} to ${filename}.parquet, and then save it into HDFS wherever you want.
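In UpdateAttribute that is a single dynamic property, where the property name is the attribute to overwrite. A sketch of the configuration, assuming the standard filename attribute is what should carry the new extension:

```
# UpdateAttribute -- dynamic property (property name = attribute to set)
filename = ${filename}.parquet
```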