Created 08-29-2017 09:11 PM
I want to use ExecuteStreamCommand to submit a Spark job via the shell, with a GenerateFlowFile in front of it, so that I can detect a Spark job failure and handle it with RouteOnAttribute, as suggested by Matt's answer here.
I think it works for detecting failure, but I can't get it scheduled correctly.
If I want the whole flow (generation of the flow file, the ExecuteStreamCommand and the routing) to be executed every 1 minute, should I schedule the GenerateFlowFile every 1 minute and leave the ExecuteStreamCommand at the default (0 sec schedule), or should I schedule both?
I tried different combinations but none worked properly. I think the GenerateFlowFile keeps generating flow files but the ExecuteStreamCommand doesn't run multiple times.
Another problem is that when I stop the ExecuteStreamCommand processor, it gets stuck: I can't change its configuration and I can't stop or start it again. It doesn't work again until I restart NiFi.
Please help.
Created 08-29-2017 09:20 PM
Have you tried setting the Scheduling Strategy to CRON driven for the processors in the flow? That way the processors should be synced when they run.
Created 08-29-2017 09:44 PM
I don't understand exactly what you mean.
In a previous question the answer was:
"You could schedule a GenerateFlowFile at the same rate your ExecuteProcess was scheduled for, and set Ignore STDIN to true in ExecuteStreamCommand. Then the outgoing flow files will have the execution.status attribute set, which you can use with RouteOnAttribute to handle failures (non-zero exit codes, e.g.)"
I want to know how to schedule these 2 processors together so that the result is that the flow is executed every 1 minute.
Created on 08-29-2017 11:12 PM - edited 08-17-2019 05:44 PM
Run the processors like this. The first processor, GenerateFlowFile, should run every minute of every hour.
Then the next processor should run at the first second of every minute of every hour.
And then the last processor should run at the second second of every minute of every hour.
Do you follow?
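For reference, NiFi's CRON driven strategy takes Quartz-style expressions whose fields are seconds, minutes, hours, day of month, month and day of week. Assuming the three processors are GenerateFlowFile, ExecuteStreamCommand and RouteOnAttribute, the staggering described above might look like this in each processor's Run Schedule field (these values are an illustration, not a copy of the original settings):

```
GenerateFlowFile       0 * * * * ?
ExecuteStreamCommand   1 * * * * ?
RouteOnAttribute       2 * * * * ?
```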
Created on 08-30-2017 10:41 AM - edited 08-17-2019 05:44 PM
@Wynner OK, the schedule seems to be working. When the submitted job fails, it works fine and the flow is OK.
But once the job runs without errors, flow files keep being generated every minute while the ExecuteStreamCommand is stuck. I can't even stop or start it; I need to restart NiFi to run it again.
When I try to stop/start the ExecuteStreamCommand it says: "No eligible components are selected. Please select the components to be stopped."
Created on 08-30-2017 11:01 AM - edited 08-17-2019 05:44 PM
@Wynner
Here's what I'm trying to illustrate:
After one successful execution at "ExecuteStreamCommand" it gets stuck (flow files keep being generated but the ExecuteStreamCommand is stuck):
------
If no successful execution happens at all (all executions fail), the schedule works well, as follows (flow files are generated every minute, and the ExecuteStreamCommand executes every minute):
I don't know why it gets stuck in the first case. Please help.
Created 08-30-2017 12:22 PM
The reason you cannot stop the ExecuteStreamCommand processor is that it still has a running thread. How long does it take to run your script outside of NiFi? It seems like the script is not finishing, so the ExecuteStreamCommand processor is just waiting.
Created 08-30-2017 12:51 PM
When you say about a minute, does that mean less than a minute or more than a minute? Why don't you try generating a flow file every 2 minutes and see if that works better? Or is it possible to run the script in parallel? Give the ExecuteStreamCommand processor 2 concurrent tasks instead of one.
Created 08-30-2017 03:30 PM
In my experience, if you aren't making a call to a system level command, then the processor does have an issue sometimes.
Try putting the actual "spark-submit <path to jar>" into a shell script and then call the shell script in the ExecuteStreamCommand processor. I have found that method more reliable.
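A minimal sketch of such a wrapper, assuming spark-submit is on the PATH of the user that runs NiFi. The function name submit_job and the SPARK_SUBMIT_BIN override are made up for this illustration. The wrapper hands back spark-submit's exit code unchanged, which is the value ExecuteStreamCommand records in the execution.status attribute.

```shell
#!/bin/sh
# Hypothetical wrapper around spark-submit for use from ExecuteStreamCommand.

# submit_job <path to jar>
# Runs spark-submit on the given jar and returns its exit code unchanged.
submit_job() {
    jar_path="$1"
    if [ -z "$jar_path" ]; then
        echo "usage: submit_job <path to jar>" >&2
        return 2
    fi
    # SPARK_SUBMIT_BIN is an assumed override for this sketch;
    # when unset, the spark-submit found on PATH is used.
    "${SPARK_SUBMIT_BIN:-spark-submit}" "$jar_path"
}

# Run the job only when arguments were passed to the script.
if [ "$#" -gt 0 ]; then
    submit_job "$@"
    exit $?
fi
```

In the processor, Command Path would then point at this script and Command Arguments would carry the path to the jar.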
Created 09-05-2017 02:33 PM
I'm glad it is working for you now.
Created 09-05-2017 02:55 PM
I thought you said it was working with the ExecuteProcess processor?
If so, then you are able to move forward with the flow design, correct?
Created 09-05-2017 03:27 PM
Does the spark job generate a lot of output? Maybe you can suppress some of the output?
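One way to test that idea is to send everything the job prints to a file, so that almost nothing comes back to NiFi over the process's output streams. This is only a sketch: the helper name run_logged is hypothetical, and whether output volume is actually what makes the job hang has not been confirmed.

```shell
#!/bin/sh
# run_logged <log file> <command> [args...]
# Runs the command with stdout and stderr redirected to the log file and
# returns the command's own exit code, so failures can still be detected.
run_logged() {
    log_file="$1"
    shift
    "$@" >"$log_file" 2>&1
}
```

For example, `run_logged /tmp/spark-job.log spark-submit <path to jar>` (the log path is a placeholder) would keep the job's output on disk while still reporting its exit code.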
Created 08-30-2017 12:38 PM
@Wynner It's a Spark job. It takes approx. 1 minute and it terminates when I run it outside NiFi. What's the problem?
Created 08-30-2017 01:20 PM
The command is
spark-submit <path to jar>
Created 08-30-2017 01:39 PM
@Wynner I increased the concurrent tasks to 3, but I still have the same problem in a different form: once 3 tasks have completed successfully, it gets stuck.
It seems like NiFi doesn't see the 'spark-submit' command as terminated, even though it has finished and its output is present in the file system.
Do you know why this could happen?
Created 09-05-2017 02:30 PM
@Wynner
I figured out what the problem is: the Spark job gets stuck at some task when it is run from "ExecuteStreamCommand".
When I run the same command from "ExecuteProcess", or from the shell myself, the job terminates successfully.
I don't know what the problem with "ExecuteStreamCommand" is.
https://community.hortonworks.com/questions/135430/spark-job-gets-stuck-when-submitted-from-nifi.htm...
Created 09-05-2017 02:39 PM
Please check my last comment. I figured out more details of the problem, but it still exists.
Created 09-05-2017 03:14 PM
It works with ExecuteProcess, but I don't want to use that processor. I use ExecuteStreamCommand so that I can route on attribute to detect failure. Unfortunately the job gets stuck with it.
Created 10-27-2017 08:50 AM
I have the same problem with ExecuteStreamCommand, which launches Python code.
I can't tell whether it is a bug in NiFi 1.3.0 or not. @Mahmoud Yusuf Which version do you have?
I went through the NiFi logs (nifi-app.log, nifi-bootstrap.log and nifi-user.log) but I haven't noticed anything strange.
How can we debug processor behavior?
Created 10-27-2017 08:53 AM
If I try to put the stuck processor into the RUNNING state with the NiFi REST API, I get this response:
2017-10-27 08:50:43,749 INFO [NiFi Web Server-96840] o.a.n.w.a.c.IllegalStateExceptionMapper java.lang.IllegalStateException: 015e1005-8820-176e-f509-ca592def60b0 cannot be started because it is not stopped. Current state is STOPPING. Returning Conflict response.
Created 10-27-2017 12:36 PM
I set the "Ignore STDIN" property to false and it no longer gives any error... I hope this has solved the problem.