
Reformat text file through NiFi


Hello guys,

I am trying to process a file through NiFi and reformat its records. I used an awk script through ExecuteStreamCommand, but for some reason not all files are processed by the ExecuteStreamCommand processor: some of the output files have a size of 0.

What is required, e.g.:

File content:

AAAA,XXXX,YYYYY,ZZZZZ

SSSS,EEEE,FFFFF,GGGG

.....

If record number 2 starts with 0[123456789], replace the 0 with 009.

If record number 2 starts with 33300, remove the 333.

If record number 2 starts with 00, print it as is.

Can I achieve this through any other processor?

9792-untitled.png

@Matt

3 Replies

Re: Reformat text file through NiFi

@Yahya Najjar

Notice the red symbol on your "ExecuteStreamCommand" processor. Hover your mouse over that red box in the top right corner and it will show you which error you are running into. Share that error so we can resolve the issue.


Re: Reformat text file through NiFi

Thanks, mqureshi.

I am processing around 100K files, and around 20 of them have the issue: the output size is 0.

9793-11111.png

9794-222222.png


Re: Reformat text file through NiFi

Hello guys,

I'm trying to switch to "ExecuteScript" using Groovy. I took sample code from Matt's blog, but I am still not able to put the above conditions into the code.

-----

File content

AAAA,XXXX,YYYYY,ZZZZZ

SSSS,EEEE,FFFFF,GGGG

.....

If record number 2 starts with 0[123456789], replace the 0 with 009.

If record number 2 starts with 33300, remove the 333.

If record number 2 starts with 00, print it as is.

----------

Please help.

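import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets

def flowFile = session.get()
if (!flowFile) return

// Rewrite the flow file content line by line
flowFile = session.write(flowFile, { inputStream, outputStream ->
    inputStream.eachLine { line ->
        // The sample file above is comma-separated, so split on ',' (the blog sample split on '|')
        def a = line.tokenize(',')
        outputStream.write("${a[0]},${a[1]},${a[2]},${a[3]}\n".toString().getBytes(StandardCharsets.UTF_8))
    }
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)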
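
For reference, here is a minimal sketch of how the three conditions might be applied to each line. It assumes that "record number 2" means the second comma-separated field of every line, which the post does not state explicitly. The helper name `reformatLine` is hypothetical and is not part of NiFi or of the blog sample.

```groovy
// Hypothetical helper: applies the three rules to the second comma-separated field.
// Assumption: "record number 2" in the post means field 2 of each line.
String reformatLine(String line) {
    String[] fields = line.split(',', -1)      // -1 keeps trailing empty fields
    if (fields.length < 2) return line         // no second field, leave the line alone

    String f = fields[1]
    if (f.startsWith('00')) {
        // Rule 3: starts with 00, print as is
    } else if (f.startsWith('33300')) {
        fields[1] = f.substring(3)             // Rule 2: remove the leading 333
    } else if (f ==~ /0[1-9].*/) {
        fields[1] = '009' + f.substring(1)     // Rule 1: replace the leading 0 with 009
    }
    return fields.join(',')
}
```

Inside the `eachLine` closure of the script above, the write call could then become `outputStream.write((reformatLine(line) + '\n').getBytes(StandardCharsets.UTF_8))`. Lines whose second field matches none of the rules pass through unchanged.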
