
Scheduling ExecuteStreamCommand processor with GenerateFlowFile to run every minute

Contributor

I want to use ExecuteStreamCommand to submit a Spark job via the shell, and GenerateFlowFile so that I can detect Spark job failures and route on them with RouteOnAttribute, as suggested by Matt's answer here.

I think the failure detection works, but I can't get the scheduling right.

If I want the whole flow (flow file generation, ExecuteStreamCommand, and routing) to run every 1 minute, should I schedule GenerateFlowFile every 1 minute and leave ExecuteStreamCommand at its default (0 sec) schedule, or should I schedule both?

I tried different combinations, but none worked properly. I think GenerateFlowFile keeps generating flow files, but ExecuteStreamCommand doesn't run multiple times.


Another problem: when I stop the ExecuteStreamCommand processor, it gets stuck. I can't change its configuration and I can't stop or start it again, and it doesn't work again until I restart NiFi.

Please help.

20 REPLIES

@Mahmoud Yusuf

I thought you said it was working with the ExecuteProcess processor?

If so, then you are able to move forward with your flow design, correct?

@Mahmoud Yusuf

Does the spark job generate a lot of output? Maybe you can suppress some of the output?

Contributor

@Wynner It's a Spark job; it takes approximately 1 minute and it terminates when I run it outside NiFi. What's the problem?

Contributor

The command is

spark-submit <path to jar>
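One thing that may be worth trying (a sketch, not a confirmed fix; the function name, jar argument, and log path below are placeholders of mine): ExecuteStreamCommand reads the command's stdout and stderr, and a very chatty spark-submit produces a lot of stream output, so a small wrapper that redirects everything to a log file keeps those streams quiet while preserving the exit code:

```shell
# Hypothetical wrapper around spark-submit; the log path and the
# jar argument ($1) are placeholders.
submit_job() {
    log=/tmp/spark-job.log
    # Send both stdout and stderr to a log file so ExecuteStreamCommand's
    # stream buffers stay empty; the function's exit status is still
    # spark-submit's exit status, so failures remain detectable.
    spark-submit "$1" > "$log" 2>&1
}
```

ExecuteStreamCommand could then invoke the wrapper script and RouteOnAttribute could route on the `execution.status` attribute it writes.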

Contributor

@Wynner I increased the concurrent tasks to 3, but I still have the same problem in a different form: once 3 tasks have completed successfully, it gets stuck.

It seems that NiFi never sees the spark-submit command as terminated, even though the job actually finishes and its output is present in the file system.
Do you know why this could happen?

Contributor

@Wynner
I figured out what the problem is: the Spark job gets stuck at some task when submitted from ExecuteStreamCommand.
When I run the same command from ExecuteProcess, or from the shell myself, the job terminates successfully.

I don't know what the problem with ExecuteStreamCommand is.

https://community.hortonworks.com/questions/135430/spark-job-gets-stuck-when-submitted-from-nifi.htm...

Contributor
@Wynner

Please check my last comment; I figured out more details of the problem, but it still exists.

Contributor

@Wynner

It works with ExecuteProcess, but I don't want to use that processor. I use ExecuteStreamCommand so that I can detect failure with RouteOnAttribute; unfortunately, the job gets stuck with it.

New Contributor

I have the same problem with ExecuteStreamCommand, which launches Python code.

I don't understand whether it is a bug in NiFi 1.3.0 or not. @Mahmoud Yusuf Which version do you have?

I checked the NiFi logs (nifi-app.log, nifi-bootstrap.log and nifi-user.log) but haven't noticed anything strange.

How can we debug the processor's behavior?
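One way to get more detail (an assumption on my part that it will surface anything useful here) is to raise the log level for that one processor class in NiFi's logback configuration, so nifi-app.log shows its DEBUG output:

```xml
<!-- conf/logback.xml: add inside the <configuration> element; NiFi
     picks up logback changes without a restart after a short delay -->
<logger name="org.apache.nifi.processors.standard.ExecuteStreamCommand" level="DEBUG"/>
```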

New Contributor

If I try to put the stuck processor into the RUNNING state with the NiFi REST API, I get this response:

2017-10-27 08:50:43,749 INFO [NiFi Web Server-96840] o.a.n.w.a.c.IllegalStateExceptionMapper java.lang.IllegalStateException: 015e1005-8820-176e-f509-ca592def60b0 cannot be started because it is not stopped. Current state is STOPPING. Returning Conflict response.
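For what it's worth, the processor's state can be inspected before retrying. A minimal sketch (the host and port are placeholders for your instance; the processor id is the one from the log line above):

```shell
# Sketch: fetch a processor entity from the NiFi REST API. The returned
# JSON includes the processor's current run state, e.g. whether it is
# still stuck in the transitional STOPPING state from the exception above.
NIFI=http://localhost:8080
PROC_ID=015e1005-8820-176e-f509-ca592def60b0

processor_json() {
    # GET /nifi-api/processors/{id}
    curl -s "$NIFI/nifi-api/processors/$1"
}
```

While it reports STOPPING, any start request will keep returning the 409 Conflict shown above; the hung command has to exit (or NiFi be restarted) before the processor can be started again.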