NiFi ExecuteProcess does not generate real-time data

Contributor

Hi, I have two questions. I followed the tutorial REALTIME EVENT PROCESSING IN HADOOP WITH NIFI, KAFKA AND STORM and got stuck. Why does ExecuteProcess in NiFi not generate real-time data, even though I can run /root/iot-truck-streaming/stream-simulator/generate.sh from the shell? I used the template from that tutorial and followed its steps.

My other question concerns the template from ANALYZE TRAFFIC PATTERNS WITH APACHE NIFI. I can run it, but I don't see the expected output file in the output folder. I only see a file named "trafficLocs_data_for_simulator.zip_raw=true" there, even after waiting a long time.

Thanks!

1 ACCEPTED SOLUTION


Hello,

Sorry to hear that you're having trouble with our tutorial. I think an error is occurring when ExecuteProcess runs the shell script. However, the processor ignores the stderr stream of the process by default, and currently no bulletin or log message is shown, so it is difficult to investigate what went wrong.

I think there is room for improvement here, and I'm going to look into it further. In the meantime, please enable 'Redirect Error Stream' to capture the error output of the process and see what is happening, as shown in the attached image.

Thanks!

8560-execute-process.png
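
To illustrate why a failing script can look like "no data" when only standard output is read, here is a rough shell analogy. It is not how NiFi is implemented; `failing_script`, `stdout_only`, and `merged_streams` are hypothetical names, and the error text is made up for the example.

```shell
#!/bin/sh
# Hypothetical stand-in for a script that fails: it writes nothing to
# stdout, writes an error to stderr, and exits with a non-zero status.
failing_script() {
    echo "generate.sh: Permission denied" >&2
    return 126
}

# Reader that only looks at stdout: the error message is thrown away,
# so the caller sees an empty result and no hint of what went wrong.
stdout_only() {
    failing_script 2>/dev/null
}

# Reader with stderr merged into stdout (2>&1): the error message now
# arrives on the same stream as the regular output.
merged_streams() {
    failing_script 2>&1
}
```

Calling `stdout_only` prints nothing, while `merged_streams` prints the error line. Enabling 'Redirect Error Stream' should have a comparable effect on what ends up in the FlowFile content.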


17 REPLIES


Hello,

I followed the tutorial once again and found what caused this.

The tutorial instructs you to install NiFi via Ambari, so the NiFi process is started by the 'nifi' user on the sandbox.

On the other hand, it instructs you to clone the 'iot-truck-streaming' project from GitHub into the 'root' home directory (/root).

So, even if we add execute permission to the sh file, the nifi user can't access anything under the /root directory.

Copying the /root/iot-truck-streaming directory to /home/nifi would be a workaround:

# ssh into sandbox with root user, then:
cp -rp /root/iot-truck-streaming /home/nifi

Then, specify:

/home/nifi/iot-truck-streaming/stream-simulator/generate.sh

as 'Command Arguments' on NiFi.
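
To confirm whether a given user can actually reach the script, a small check like the following may help. It is only a sketch: `check_path_access` is a hypothetical helper name, and it tests access for whichever user runs it, so it has to be run as the user that owns the NiFi process.

```shell
#!/bin/sh
# Hypothetical helper: report the topmost path component that the
# current user cannot execute (file) or traverse (directory).
check_path_access() {
    p="$1"
    blocked=""
    while :; do
        # -x means "executable" for a file and "searchable" for a directory.
        # Walking upward, the last failure recorded is the topmost blocker.
        [ -x "$p" ] || blocked="$p"
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p="$(dirname "$p")"
    done
    if [ -n "$blocked" ]; then
        echo "blocked at: $blocked"
        return 1
    fi
    echo "ok: $1"
}

# Example call (path taken from the thread):
# check_path_access /home/nifi/iot-truck-streaming/stream-simulator/generate.sh
```

If sudo is available on the sandbox, starting a shell with `sudo -u nifi sh` and running the function there is one way to test as the nifi user. Note that `cp -rp` preserves ownership and mode bits, so the copy stays owned by root; if the nifi user still cannot read it, changing ownership with `chown -R` is one possible follow-up.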

New Contributor

thanks it worked!!

Expert Contributor

Thanks and it helps. Appreciate your support.

New Contributor

@kkawamura I ran cp -rp /root/iot-truck-streaming /home/nifi and changed the command argument, but after the Storm deployment completes, no data is emitted from the Kafka spout, possibly because NiFi is not ingesting data into Kafka. Please help, I'm stuck on this.

New Contributor

@kkawamura I referred to this link https://github.com/hortonworks-gallery/tutorials/blob/master/2015-09-25-processing-real-time-event-s... for working with Kafka and Storm, and for ingestion into HBase and Hive I followed the steps at http://saptak.in/writing/2015/09/24/ingesting-real-time-streams-hive-hbase


@fazila

Would you share a NiFi UI screenshot of the data ingestion part? Also, please try the step explained in the accepted answer earlier in this thread, "use 'Redirect Error Stream' to capture error output of the process to see what is happening, as shown in the attached image", to see whether generate.sh has an issue.

New Contributor

@kkawamura my NiFi service home path is /home/nifi-0.6.0.1.2.0.1-1 and I have copied iot-streaming into /home/nifi. This is what is shown:

14168-nifi-ui.png

Also, when I gave /root/iot-truck-streaming as the command argument in ExecuteProcess, I was able to see NiFi data in /root/nifi_output/truck_events.

execute-process.png


@fazila

when i gave command arg in execute process /root/iot-truck-streaming, i m able to see nifi data in /root/nifi_output/truck_events

Does this mean ExecuteProcess was able to run the shell command, but the output data was written under the /root/iot-truck-streaming directory?

The generate script doesn't need an argument. I haven't tried passing one, but IIRC it writes the generated data to its standard output; NiFi ExecuteProcess reads from that standard output and creates FlowFiles containing the generated data.

Was no FlowFile generated when you specified only the script path in the 'Command Arguments' processor property?

Also, just in case: since 'Batch Duration' is 10 sec, you need to leave ExecuteProcess running for at least 10 sec to see a generated FlowFile.
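
Outside NiFi, one way to check whether the script writes anything to standard output within a comparable time window is sketched below. `count_stdout_lines` is a hypothetical helper name, and the sketch assumes the `timeout` command from GNU coreutils is installed on the sandbox.

```shell
#!/bin/sh
# Hypothetical helper: run a command for at most N seconds and print
# how many lines it wrote to stdout. stderr is left untouched so that
# any error messages stay visible on the terminal.
count_stdout_lines() {
    seconds="$1"
    shift
    # timeout stops the command once the limit is reached; wc -l counts
    # the stdout lines, and tr strips padding that some wc versions add.
    timeout "$seconds" "$@" | wc -l | tr -d ' '
}

# Example call (path taken from the thread), run as the nifi user:
# count_stdout_lines 15 /home/nifi/iot-truck-streaming/stream-simulator/generate.sh
```

A result of 0 after more than 10 seconds would suggest the script itself produces no standard output for that user, in which case no FlowFile should be expected either.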