Member since
07-29-2020
558
Posts
307
Kudos Received
167
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 112 | 11-28-2024 06:07 AM |
| | 78 | 11-25-2024 09:21 AM |
| | 213 | 11-22-2024 03:12 AM |
| | 116 | 11-20-2024 09:03 AM |
| | 314 | 10-29-2024 03:05 AM |
12-04-2024
06:51 AM
Hi @DuyChan , Have you tried using DistributedMapCacheClientService and DistributedMapCacheServer instead? I'm not sure how they differ from MapCacheClientService, but they should do the same job. Also be aware that, because of size limitations, NiFi is not published with all packages. If you can't find services or processors that should be part of NiFi, you probably need to download the jar/nar package from the Maven repositories and save it to the NiFi lib folder: https://mvnrepository.com/artifact/org.apache.nifi/nifi-hazelcast-services-api-nar/2.0.0 https://mvnrepository.com/artifact/org.apache.nifi/nifi-distributed-cache-client-service-api If that helps, please accept the solution. Thanks
12-03-2024
06:01 AM
2 Kudos
Hi @SS_Jin , Glad to hear that my post helped. It's really hard to suggest something, especially when I don't have all the details of what you are trying to do, but from what I read I think Fork/Join Enrichment would work better in these scenarios. The MergeRecord approach can be problematic when you are reading multiple sources and multiple CSVs, because the merge behavior can be unpredictable. It also depends on what type of enrichment you are doing and how complex it is. If you have a one-to-one mapping between the records in the DB and the CSV, and you are trying to override some data or add new fields, then you might also consider the LookupRecord processor to simplify your dataflow. That way you don't have to branch to read and then merge the different sources, which might end up saving you some overhead. https://community.cloudera.com/t5/Community-Articles/Data-flow-enrichment-with-NiFi-part-1-LookupRecord-processor/ta-p/246940
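To make the one-to-one lookup idea more concrete, here is a minimal Python sketch of the kind of enrichment a lookup performs. It is only a conceptual illustration, not NiFi code and not how LookupRecord is implemented. The function name `enrich`, the field names `id` and `city`, and the sample values are all hypothetical.

```python
def enrich(records, lookup, key_field, target_field):
    """Return copies of `records` with `target_field` set from `lookup` when the key matches."""
    enriched = []
    for record in records:
        row = dict(record)  # copy so the input records are left untouched
        match = lookup.get(row.get(key_field))
        if match is not None:
            row[target_field] = match  # override existing value or add a new one
        enriched.append(row)
    return enriched


# Hypothetical CSV records and a hypothetical key -> value mapping loaded from the DB.
csv_records = [
    {"id": "1", "city": ""},
    {"id": "2", "city": "old"},
    {"id": "3", "city": "x"},
]
db_lookup = {"1": "Paris", "2": "Rome"}

result = enrich(csv_records, db_lookup, "id", "city")
```

Records with no match (here `id` 3) pass through unchanged, and no separate read-then-merge branch is needed.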
12-03-2024
04:36 AM
1 Kudo
Hi @Mikhai , It's hard to say what is going on without looking at the data itself or seeing the ExcelReader configuration. I know providing the data is not easy, but if you can replicate the issue using dummy data then please share it. Also, please provide more details on how you configured the ExcelReader. For example, are you using a custom schema or inferring the schema? I would try the following:
1- Try to find the table boundary in Excel and delete the empty rows. If you can't, then for the sake of testing copy the table with the rows you need into a new Excel file and see if that works.
2- If ExcelReader works with the 545 rows, then I would provide a custom schema (if you haven't already) and set some of the fields that should always have a value to not allow null. Doing so may stop the ExcelReader from importing empty rows.
I tried to use ExcelReader before but ran into issues when the Excel file had formula columns, because of a bug in the reader itself. I'm not sure if those issues have been addressed, but as a workaround I used the Python extension to develop a custom processor that takes Excel input and converts it into JSON using the Pandas library. This might be an option to consider if you are still having problems with the ExcelReader service, but you have to use NiFi 2.0 in order to use the Python extension. If that helps, please accept the solution. Thanks
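As a rough idea of what the conversion step in such a workaround could look like, here is a minimal sketch. It is not the custom processor mentioned above and it leaves out the NiFi Python extension wrapper entirely. The helper names `is_blank`, `rows_to_json`, and `excel_to_json` are hypothetical, and `excel_to_json` assumes pandas plus an Excel engine such as openpyxl are installed.

```python
import json
import math


def is_blank(value):
    """True for None, empty/whitespace strings, and float NaN (how pandas marks empty cells)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip() == ""


def rows_to_json(rows):
    """Drop rows whose cells are all blank, then serialize the rest as a JSON array."""
    kept = []
    for row in rows:
        if all(is_blank(v) for v in row.values()):
            continue  # skip the fully empty rows that sit below the real table
        kept.append({k: (None if is_blank(v) else v) for k, v in row.items()})
    # default=str is a simplification: values json cannot encode (e.g. timestamps) become strings
    return json.dumps(kept, default=str)


def excel_to_json(path, sheet_name=0):
    """Read one sheet with pandas and convert it with rows_to_json."""
    import pandas as pd  # imported lazily so the helpers above work without pandas

    frame = pd.read_excel(path, sheet_name=sheet_name)
    return rows_to_json(frame.to_dict(orient="records"))
```

Filtering on "all cells blank" keeps rows that are only partially filled, which is usually what you want when the sheet has trailing empty rows.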
12-03-2024
12:33 AM
1 Kudo
Hi @Emery , Unfortunately no, I have not been able to do it, and if you are using Docker Desktop on Windows I don't think it can be done. One way around it is to use an Nginx reverse proxy, but it's not an easy process to follow and I wasn't able to implement it either. If you are ever able to get it working, please do share your findings.
11-30-2024
11:42 AM
1 Kudo
Hi @Vikas-Nifi , I think you can avoid a lot of overhead, such as writing the data to the DB just to do the transformation and assign the fixed widths (unless you need to store the data in the DB). You can use processors like QueryRecord and UpdateRecord to transform the data in bulk instead of one record and one field at a time. In QueryRecord you can use SQL-like functions, based on Apache Calcite SQL syntax, to transform or derive new columns just as if you were writing a MySQL query. In UpdateRecord you can use NiFi RecordPath to traverse fields and apply functions in bulk as well. There is also a FreeFormTextRecordSetWriter service that you can use to produce a custom output format. For example, in my test dataflow I'm using a ConvertRecord processor with a CSVReader and a FreeFormTextRecordSetWriter to produce the desired output. A GenerateFlowFile processor creates the input CSV in the flowfile. For the CSVReader you can use the default configuration. In the Text property of the FreeFormTextRecordSetWriter you can use the column/field names as listed in the input and provided to the reader. You can also use NiFi Expression Language to format and transform the written data as follows:
${DATE:replace('-',''):append(${CARD_TYPE}):padRight(28,' ')}${CUST_NAME:padRight(20,' ')}${PAYMENT_AMOUNT:padRight(10,' ')}${PAYMENT_TYPE:padRight(10,' ')}
This will produce the following output:
20241129Visa                Test1               0.01      Credit Card
20241129Master              Test2               10.0      Credit Card
20241129American Express    Test3               500.0     Credit Card
I know this is not 100% what you need, but it should give you an idea of what you need to do to get the desired output. Hope that helps, and if it does, please accept the solution. Let me know if you have any other questions. Thanks
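If it helps to see what the expression in the post above is doing, here is a small Python sketch that mimics the same steps outside NiFi: strip the dashes from DATE, append CARD_TYPE, then right-pad each piece to its fixed width. This is only an emulation for illustration, not NiFi Expression Language itself. The helper names are hypothetical, and the input values (including the dashed date `2024-11-29`) are assumptions inferred from the output shown in the post.

```python
def pad_right(value, width, fill=" "):
    """Mimic padRight: pad on the right up to `width`; longer values are returned unchanged."""
    return value.ljust(width, fill)


def format_record(rec):
    """Build one fixed-width line using the same widths as the expression (28, 20, 10, 10)."""
    key = rec["DATE"].replace("-", "") + rec["CARD_TYPE"]
    return (
        pad_right(key, 28)
        + pad_right(rec["CUST_NAME"], 20)
        + pad_right(rec["PAYMENT_AMOUNT"], 10)
        + pad_right(rec["PAYMENT_TYPE"], 10)
    )


# Assumed input record; the actual GenerateFlowFile content is not shown in the post.
record = {
    "DATE": "2024-11-29",
    "CARD_TYPE": "Visa",
    "CUST_NAME": "Test1",
    "PAYMENT_AMOUNT": "0.01",
    "PAYMENT_TYPE": "Credit Card",
}

line = format_record(record)
```

Note that "Credit Card" is 11 characters, so a width of 10 leaves it unpadded rather than truncating it. If a column must never exceed its width, the value has to be cut explicitly.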
11-28-2024
06:22 AM
Hi, It seems like you are running out of heap memory when adding new attributes through the EvaluateJsonPath processor. Attributes are stored in the heap, so you should avoid storing large data in flowfile attributes when you have that many flowfiles, otherwise you will run into this kind of issue. Can you please elaborate on what you are trying to accomplish after converting Avro to JSON? To me it doesn't make sense, because you are merging towards the end, which means you might not even get the attribute you are extracting, depending on how you set the Attribute Strategy in the MergeRecord processor.
11-28-2024
06:07 AM
Hi, First, if the data you have posted contains real personal info, I would recommend removing it and using dummy data instead. It's a violation of the community guidelines to post personal information (see point 7 of the community guidelines). In regards to the error: you are getting it because of the property setting Quote Character = " in the CSVReader service. This setting means that when a value contains one of the reserved CSV characters, like the comma (,) used as column separator or the new line (\n) used to separate records, and you don't or can't use the escape character (\), you can surround the whole column value with double quotes at both ends. It also means that nothing else belonging to the same column may follow the closing quote. For more info please refer to: https://csv-loader.com/csv-guide/why-quotation-marks-are-used-in-csv Since the line you have listed has characters after the closing " , you are getting the illegal character error. To resolve it you have two options:
1- Use ReplaceText to replace any double quote " character with \" to escape it. However, this might not be very efficient if you have a large CSV file.
2- The more efficient option is to replace the Quote Character in the CSVReader with something other than " . However, you have to make sure that your data is never going to contain the new character in any of the CSV values. Possible options: $,%,^
If this helps, please accept the solution. Thanks
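The same quoting rule can be reproduced outside NiFi. The sketch below uses Python's standard `csv` module in strict mode as a stand-in, on the assumption that it behaves comparably here. It is not the CSVReader service, and the sample lines and the helper name `parse_line` are made up for illustration.

```python
import csv
import io


def parse_line(line, quotechar='"'):
    """Parse one CSV line strictly, so malformed quoting raises csv.Error."""
    reader = csv.reader(io.StringIO(line), quotechar=quotechar, strict=True)
    return next(reader)


# A value containing the separator is fine when the whole value is quoted.
well_formed = parse_line('1,"Doe, John",NY')

# Characters after the closing quote of the same column are rejected.
try:
    parse_line('1,"Doe" John,NY')
    strict_error = None
except csv.Error as exc:
    strict_error = str(exc)

# Option 2 from the post: choose a quote character that never appears in the data,
# so the double quotes are treated as ordinary characters.
relaxed = parse_line('1,"Doe" John,NY', quotechar="$")
```

With `$` as the quote character the double quotes stay in the value as literal text, so a downstream step may still need to strip them.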
11-27-2024
11:12 AM
Sure. If you come up with a solution different from what I suggested, please do post about it so it can help others who might run into a similar situation. Good luck
11-27-2024
07:18 AM
Hi, It's still not clear to me what exactly is happening and where. The error message mentions a field called ecoTaxValues, which doesn't seem to exist in the provided input. You also mentioned that you are using ConsumeKafka and getting an error there through the reader/writer, but the ConsumeKafka processor doesn't take any reader/writer service. ConsumeKafkaRecord does. Is that what you are using? Please be as specific as you can when describing the problem. If you can't share the information for security reasons, then I would recommend you try to reproduce the issue using sample data and a sample dataflow to make it easier to isolate the error. Also, please share a screenshot or an accurate description of the dataflow starting from where the input comes in, and share the processor configurations as well as any services that are being used.
11-26-2024
02:03 PM
2 Kudos
It seems that the Parquet reader/writer services always try to use an Avro schema, possibly to make sense of the data when passing it along to the target processors (like PutDatabaseRecord), since Parquet is a binary format. The problem is that Avro has limitations on how fields can be named. This is actually reported as a bug in Jira, but it doesn't seem to have been resolved. According to the ticket, Avro field names may only start with the characters [A-Za-z_] . Given this, it seems you have to think of a workaround, since NiFi doesn't provide a solution out of the box. You can check my answer to this post as an option. Basically, you can use Python to read the Parquet content and convert it to another format (CSV, for example), then pass the CSV to PutDatabaseRecord. This should work, as I have tested it. Since you seem to be using NiFi 2.0, you can develop a Python extension processor for this instead of the ExecuteStreamCommand mentioned in the post. Hope that helps. If it does, please accept the solution. Thanks
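To check up front which column names would trip the Avro naming rule, a small helper like the one below can be used. It is a sketch under assumptions, not part of NiFi or of the workaround described above. The rule for the first character comes from the post, while the rule for the remaining characters ([A-Za-z0-9_]) is taken from the Avro specification. The function names and the underscore-based renaming scheme are hypothetical choices.

```python
import re

# First character: letter or underscore; remaining characters: letters, digits, underscore.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name):
    """True when the whole name satisfies the Avro naming rule."""
    return AVRO_NAME.fullmatch(name) is not None


def to_avro_safe(name):
    """Rename a column so it satisfies the rule (hypothetical scheme: use underscores)."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not cleaned or not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned  # prefix names that are empty or start with a digit
    return cleaned


columns = ["1st_col", "order-id", "amount"]
invalid = [c for c in columns if not is_valid_avro_name(c)]
renamed = [to_avro_safe(c) for c in columns]
```

Renaming changes the column names that reach the target table, so any mapping in PutDatabaseRecord would have to use the new names.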