Execute a Flow of Shell Scripts in NiFi

Hello,

I need to execute shell scripts to load tables in Hive.

How can I execute the scripts as a flow?

Example:

Execute shell scripts A and B in parallel. After both complete, invoke shell script C.

Thanks.

1 REPLY

PutHiveQL (and PutHive3QL for Hive 3) lets you execute HiveQL statements without needing a shell or external program. However, this use case appears to be more job-oriented (or at least Bulk Synchronous Parallel (BSP) oriented) than flow-oriented, and flow-oriented processing is NiFi's focus.

Having said that, some processors can help with this pattern, namely Wait/Notify, but they currently work in a concurrent model, not a parallel one (i.e. the state is stored locally, not for the whole cluster). I don't believe there is currently a mechanism for merging flow files that have been distributed across a cluster, but your example wouldn't really need that, except to execute A and B truly in parallel (as opposed to concurrently on one node, where they may run in parallel on different cores, but that is not guaranteed).

If you're willing to give up parallelism, you could split ABC into separate statements A, B, and C, send them through a FIFO prioritizer into PutHiveQL, and simply let A, B, and C execute sequentially. Otherwise, I think the solution would get more complicated than you'd want to deal with for this use case. Just my two cents 🙂
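For illustration only, the fork/join pattern from the question (A and B in parallel, then C) can be sketched outside NiFi as a small Python wrapper. This is a rough sketch, not a NiFi feature: the function names `run` and `run_fork_join` are hypothetical, and the commands are placeholders standing in for the real shell scripts. A wrapper like this could, in principle, be launched from NiFi by a processor that runs external commands, such as ExecuteStreamCommand.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run(cmd):
    """Run one command (a list of arguments) and return its exit code."""
    return subprocess.run(cmd).returncode


def run_fork_join(parallel_cmds, final_cmd):
    """Run parallel_cmds at the same time, then final_cmd if all succeeded.

    Returns (exit codes of the parallel commands, exit code of final_cmd),
    where the second element is None if final_cmd was skipped.
    """
    # Each worker thread blocks on its own child process, so the
    # child processes themselves run side by side.
    with ThreadPoolExecutor(max_workers=len(parallel_cmds)) as pool:
        codes = list(pool.map(run, parallel_cmds))
    if any(code != 0 for code in codes):
        # Assumption: C should not run when A or B fails.
        return codes, None
    return codes, run(final_cmd)


if __name__ == "__main__":
    # Placeholder commands; replace with the real scripts A, B, and C.
    a = [sys.executable, "-c", "print('A done')"]
    b = [sys.executable, "-c", "print('B done')"]
    c = [sys.executable, "-c", "print('C done')"]
    print(run_fork_join([a, b], c))
```

The wrapper only coordinates processes on a single machine; it does not distribute A and B across a NiFi cluster.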