Support Questions

naga_satish · ‎11-27-2020

I need to cleanse this data and the needs to look like this.

I tried using ReplaceText with these configurations.
search value - ^"(.*)"$ and Replacement Value - $1. But these configurations is not working and the file is routing to failure. not sure what might be the issue.

open to other suggestions. Thanks in advance.

stevenmatison · ‎11-28-2020

@naga_satish I think the solution you will find easiest is to use a CSV record reader and writer. In the reader you tell schema/settings it is pipe delimited, and escaped with quotes. In the writer you choose same schema no quotes. It should be straight forward use case for RecordReaders in UpdateRecord/QueryRecord/LookupRecord or other RecordReader based processors.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create separate topic and feel free to tag me in your post.

Thanks,

Steven

naga_satish · ‎11-30-2020

@stevenmatison I already tried using convert record processor with the configurations that you mentioned. I use InferSchema as schema strategy. this configuration throughs an error. It says index for header "Created_Date__c" is 1 but csv record have only one value. Not sure what is this error.

When I set schema strategy as "use string fields from header" entire data is messed up.

When I set schema strategy as schema text everything is empty, just delimiters and header is present.

stevenmatison · ‎11-30-2020

You will need to define your schema in avro format, drop that into the readers/writers. Here is an example:

{
   "type" : "record",
   "name" : "DailyCSV",
   "fields" : [
      { "name" : "DepartmentName" , "type" : ["string", "null"] },
	  { "name" : "AccountName" , "type" : ["string", "null"] },
	  { "name" : "AccountOwnerId", "type" : ["string", "null"] },
      { "name" : "AdditionalInfo", "type" : [ "null", "string" ] }
]
}

naga_satish · ‎11-30-2020

@stevenmatison That's precisely what I did. Every column in processed file is empty. just header name and delimiters are present.

stephane_davy · ‎11-30-2020

Hello,

Can you show the config of your CSV recordReader processor? Your CSV looks nice, there is no need to replace anything here.

In order to simplify the debug, you can also select "Infer schema" instead of using a Avro schema. It's of course better to work with Avro schema when you go to production

stevenmatison · ‎11-30-2020

As suggested above, update post with your processor, its reader and writier settings. It sounds like you have something misconfigured. If possible show us a screen shot of your flow too.

Cloudera Community

Support Questions

How to extract fields enveloped in quotes in NiFi?