Member since 07-29-2020

574 Posts | 323 Kudos Received | 176 Solutions

        My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
| | 2088 | 12-20-2024 05:49 AM |
| | 2368 | 12-19-2024 08:33 PM |
| | 2134 | 12-19-2024 06:48 AM |
| | 1411 | 12-17-2024 12:56 PM |
| | 2009 | 12-16-2024 04:38 AM |

11-30-2024 11:42 AM (1 Kudo)

Hi @Vikas-Nifi,

I think you can avoid a lot of overhead, such as writing the data to the DB just to do the transformation and assign the fixed widths (unless you need to store the data in the DB). You can use processors like QueryRecord and UpdateRecord to transform the data in bulk instead of one record at a time and one field at a time. In QueryRecord you can use SQL-like functions based on Apache Calcite SQL syntax to transform or derive new columns, just as if you were writing a MySQL query. In UpdateRecord you can use NiFi RecordPath to traverse fields and apply functions in bulk rather than one record at a time. There is also a FreeFormTextRecordSetWriter service that you can use to produce a custom output format.

For example, in the following dataflow I'm using a ConvertRecord processor with a CSVReader and a FreeFormTextRecordSetWriter to produce the desired output. The GenerateFlowFile processor is used to create the input CSV flowfile, and the ConvertRecord processor is configured with those two services. The CSVReader can keep its default configuration.

In the FreeFormTextRecordSetWriter's Text property you can reference the column/field names as listed in the input and provided to the reader. You can also use NiFi Expression Language to apply formatting and transformations to the written data, for example:

${DATE:replace('-',''):append(${CARD_TYPE}):padRight(28,' ')}${CUST_NAME:padRight(20,' ')}${PAYMENT_AMOUNT:padRight(10,' ')}${PAYMENT_TYPE:padRight(10,' ')}

This will produce the following output:

20241129Visa                Test1               0.01      Credit Card
20241129Master              Test2               10.0      Credit Card
20241129American Express    Test3               500.0     Credit Card

I know this is not 100% what you need, but it should give you an idea of what to do to get the desired output.

Hope that helps, and if it does, please accept the solution. Let me know if you have any other questions.

Thanks
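If you want to check the same padding logic outside NiFi, here is a minimal Python sketch (standard library only) that applies the same fixed-width formatting to the sample CSV; the column names are assumed from the example above:

```python
import csv
import io

# Sample input matching the CSV used in the example above (column names assumed).
sample_csv = """DATE,CARD_TYPE,CUST_NAME,PAYMENT_AMOUNT,PAYMENT_TYPE
2024-11-29,Visa,Test1,0.01,Credit Card
2024-11-29,Master,Test2,10.0,Credit Card
2024-11-29,American Express,Test3,500.0,Credit Card
"""

def to_fixed_width(row: dict) -> str:
    # Mirrors the Expression Language above: strip dashes from DATE, append
    # CARD_TYPE, then pad each piece to the widths used in the Text property.
    return ((row["DATE"].replace("-", "") + row["CARD_TYPE"]).ljust(28)
            + row["CUST_NAME"].ljust(20)
            + row["PAYMENT_AMOUNT"].ljust(10)
            + row["PAYMENT_TYPE"].ljust(10))

for record in csv.DictReader(io.StringIO(sample_csv)):
    print(to_fixed_width(record))
```

Each printed line matches the fixed-width layout shown above.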

11-28-2024 06:22 AM

Hi,

It seems like you are running out of heap memory when adding new attributes through the EvaluateJsonPath processor. Attributes are stored on the heap, so you should avoid putting large data into flowfile attributes when you are going to have that many flowfiles, otherwise you can run into exactly this kind of issue.

Can you please elaborate on what you are trying to accomplish after converting Avro to JSON? It doesn't make sense to me, because you are merging towards the end, which means you might not even keep the attribute you are extracting, depending on how you set the Attribute Strategy in the MergeRecord processor.

11-28-2024 06:07 AM

Hi,

First, if the data you posted contains real personal info, I would recommend removing it and using dummy data instead. It is a violation of the community guidelines to post personal information (see point 7 of the community guidelines).

Regarding the error: you are getting it because of the property setting Quote Character = " in the CSVReader service. This setting means that when a value contains one of the reserved CSV characters, such as the comma (,) used as the column separator or the newline (\n) used to separate records, and you don't or can't use the escape character (\), you can surround the whole column value with double quotes at both ends. That also means there must not be any trailing characters after the closing quote in the same column. For more info please refer to: https://csv-loader.com/csv-guide/why-quotation-marks-are-used-in-csv

Since the line you listed has characters after the closing ", you are getting the illegal character error.

To resolve it, you have two options:

1- Use ReplaceText to replace any double quote " character with \" to escape it. However, this might not be very efficient if you have a large CSV file.

2- The more efficient option is to replace the Quote Character in the CSVReader with something other than ", but you have to make sure your data will never contain the new character in any of the CSV values. Possible options: $, %, ^

If this helps, please accept the solution.

Thanks
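For what it's worth, the same two options can be illustrated with Python's standard csv module (this is just a sketch with made-up values, not NiFi's CSVReader):

```python
import csv
import io

# Option 1: keep " as the quote character and escape embedded quotes with \.
escaped = 'id,comment\n1,"He said \\"hi\\", then left"\n'
rows = list(csv.reader(io.StringIO(escaped), quotechar='"',
                       escapechar='\\', doublequote=False))
print(rows[1])  # ['1', 'He said "hi", then left']

# Option 2: use a quote character that never appears in the data (here ^),
# so embedded double quotes are treated as ordinary characters.
alt = 'id,comment\n1,^He said "hi", then left^\n'
rows = list(csv.reader(io.StringIO(alt), quotechar='^'))
print(rows[1])  # ['1', 'He said "hi", then left']
```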

11-27-2024 11:12 AM (1 Kudo)

Sure. If you come up with a solution different from what I suggested, please do post about it so it can help others who might run into a similar situation. Good luck!

11-27-2024 07:18 AM

Hi,

It is still not clear to me what exactly is happening and where. The error message mentions a field called ecoTaxValues, which doesn't seem to exist in the provided input. You also mentioned that you are using ConsumeKafka and getting an error there through the reader/writer, yet the ConsumeKafka processor doesn't take any reader/writer service. ConsumeKafkaRecord does... is that what you are using? Please describe the problem as specifically as you can. If you can't share the information for security reasons, I would recommend trying to reproduce the issue with sample data and a sample dataflow to make it easier to isolate the error. Also, please share a screenshot or an accurate description of the dataflow from where the input originates, along with the processor configurations and any controller services being used.

11-26-2024 02:03 PM (2 Kudos)

It seems like whenever you deal with the Parquet reader/writer services, those services try to use an Avro schema, possibly to make sense of the data when passing it along to the target processors (like PutDatabaseRecord), since Parquet is a binary format. The problem with this is that Avro has limitations on how fields can be named. This is actually reported as a bug in Jira, but it doesn't seem to have been resolved. According to the ticket, Avro field names may only start with the characters [A-Za-z_]. Given that, it seems you have to come up with a workaround, since NiFi doesn't provide a solution out of the box. You can check my answer to this post as an option. Basically, you can use Python to read the Parquet content and convert it to another format (such as CSV), then pass the CSV to PutDatabaseRecord. This should work, as I have tested it. Since you seem to be using NiFi 2.0, you can develop a Python extension processor for this instead of the ExecuteStreamCommand approach mentioned in the post.

Hope that helps. If it does, please accept the solution.

Thanks
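As a rough sketch of that workaround (not the exact script from the linked post), the Parquet-to-CSV conversion with column renaming could look something like this in Python, assuming pandas with a Parquet engine such as pyarrow is available; the function and file names are hypothetical:

```python
import re
import pandas as pd  # needs a Parquet engine such as pyarrow installed

def parquet_to_csv(parquet_path: str, csv_path: str) -> None:
    """Read a Parquet file, rename columns Avro would reject, and write CSV."""
    df = pd.read_parquet(parquet_path)
    # Avro field names must start with [A-Za-z_]; prefix anything else with '_'.
    df.columns = [c if re.match(r"[A-Za-z_]", str(c)) else f"_{c}"
                  for c in df.columns]
    df.to_csv(csv_path, index=False)

# Hypothetical usage, e.g. from an ExecuteStreamCommand script or a
# NiFi 2.0 Python extension processor:
# parquet_to_csv("input.parquet", "output.csv")
```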

11-26-2024 11:37 AM (1 Kudo)

Can you provide more information on your dataflow? Let's say you are using GenerateFlowFile to create the JSON Kafka output; what happens next? How are you enriching the data, and in which processor are you using the JSON reader/writer service that is causing the error? I need to see the full picture here, because when I use the same JSON you provided in a GenerateFlowFile processor and pass it to QueryRecord with the same JSON reader/writer service configuration, it seems to work!

11-26-2024 10:57 AM (1 Kudo)

Hi @PradNiFi1236,

How are you adding the new fields? Your JSON appears to be invalid as provided.

11-26-2024 10:52 AM (1 Kudo)

Hi,

Can you provide more explanation and screenshots of your dataflow and the configuration of each processor and controller service? Also, if you can provide sample data that can be converted to Parquet and reproduces the error, that would be helpful as well.

Thanks

11-25-2024 09:21 AM

Hi,

I don't see a toNumber function in the RecordPath syntax, so I'm not sure where that came from. It would be helpful next time to provide the following information:

1- The input format.
2- A screenshot of the configuration of the processor causing the error.

As for your problem, the easiest and most efficient way I can think of - more efficient than splitting records - is to use the QueryRecord processor. Let's assume you have the following CSV input:

id,date_time
1234,2024-11-24 19:43:17
5678,2024-11-24 01:10:10

You can pass the input to the QueryRecord processor with the query added as a dynamic property. That property exposes a new relationship, named after the property, that you can route to get the desired output. The query syntax is the following:

select id, TIMESTAMPADD(HOUR, -3, date_time) as date_time from flowfile

The trick for this to work is how you configure the CSVReader and CSVRecordSetWriter services so they know how to parse and write the datetime field.

Output through the Result relationship:

id,date_time
1234,2024-11-24 16:43:17
5678,2024-11-23 22:10:10

Hope that helps. If it does, please accept the solution.

Thanks
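For a quick sanity check of the same 3-hour shift outside NiFi, here is a short Python sketch over the sample CSV above (standard library only):

```python
import csv
import io
from datetime import datetime, timedelta

sample = """id,date_time
1234,2024-11-24 19:43:17
5678,2024-11-24 01:10:10
"""

FMT = "%Y-%m-%d %H:%M:%S"  # matches the timestamp format of the sample data

# Subtract 3 hours from each date_time, like TIMESTAMPADD(HOUR, -3, date_time).
for row in csv.DictReader(io.StringIO(sample)):
    shifted = datetime.strptime(row["date_time"], FMT) - timedelta(hours=3)
    print(f"{row['id']},{shifted.strftime(FMT)}")
# Prints:
# 1234,2024-11-24 16:43:17
# 5678,2024-11-23 22:10:10
```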