Convert csv field into several fields with mechanism of records

Explorer

The data is in the following format:

1;p1:"123",p2:"234";r1

2;p1:"333",p2:"444",p3:"555";r2

3;p2:"666",p5:"888";r3

p1, p2, p3 and so on can be any of a large variety of params. In the output I'd like a csv structure like this:

1;p1;123;r1

1;p2;234;r1

2;p1;333;r2

2;p2;444;r2

2;p3;555;r2

3;p2;666;r3

3;p5;888;r3

How can I implement this with the standard record processors?

Thanks.

3 REPLIES

Expert Contributor

You could do this in NiFi using the following steps:

  1. Store the "1" and the "r1" in variables using the UpdateAttribute processor
  2. Substring the record after the first semicolon and before the last semicolon - ExtractText processor
  3. Use SplitText to split on every comma. You will now have 1 FlowFile for each param
  4. Re-structure the data as you need - add the "1" to the beginning and the "r1" to the end. This uses the ExtractText processor again

All of the above requires the use of the NiFi Expression Language, documented here: https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
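
To make the intended transformation concrete, the four steps above can be sketched outside NiFi in plain Python. This is only an illustration of the logic, not a NiFi flow or processor configuration. The function name explode and the PARAM pattern are hypothetical, and the sketch assumes that the id and the trailing field contain no semicolons and that param values contain no double quotes. It matches name:"value" pairs with a regular expression rather than splitting on every comma, so a comma inside a quoted value would not break a row.

```python
import re

# Hypothetical pattern for one name:"value" pair, e.g. p1:"123".
# Assumes values never contain a double quote.
PARAM = re.compile(r'(\w+):"([^"]*)"')


def explode(line):
    """Turn one input row into one output row per param."""
    # Steps 1 and 2: keep the leading id and the trailing field, and take
    # the text between the first and the last semicolon as the param list.
    rec_id, rest = line.split(";", 1)
    params, tail = rest.rsplit(";", 1)
    # Steps 3 and 4: emit one row per param, with the id put back at the
    # beginning and the trailing field put back at the end.
    return [
        f"{rec_id};{name};{value};{tail}"
        for name, value in PARAM.findall(params)
    ]


# Sample rows taken from the question.
rows = [
    '1;p1:"123",p2:"234";r1',
    '2;p1:"333",p2:"444",p3:"555";r2',
    '3;p2:"666",p5:"888";r3',
]

for row in rows:
    for out in explode(row):
        print(out)
```

Because the pattern picks up whatever pairs are present, the number of params and their order do not need to be known in advance.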

Explorer

Thanks for the answer. But I got rid of ExtractText because I am dealing with thousands of rows. I had been using it from version 0.6.1 through 1.2. Moreover, I don't know how many params (p1, p2, and so on) there could be, or in what order. So I am considering using UpdateRecord or maybe ForkRecord in some way. Any help is appreciated.

Explorer

Hello. Does anyone have ideas on how to implement this using UpdateRecord?