Prioritize execution of the .hql file, then the .csv files, in order, in a distributed manner

Contributor

Hi,

We are processing a ZIP file that contains multiple timestamped files (.hql and .csv) in a distributed manner.
We check each file's extension and route .hql files to the PutHiveQL processor and .csv files to the PutHDFS processor.


The ZIP file contains the files below, which must be extracted and processed in timestamp order (the timestamp part of each name is, for example, t1, or a system timestamp).

table_info.zip

table_info_t1.hql
table_info_t1_1.csv
table_info_t1_2.csv
table_info_t2.hql
table_info_t2_1.csv
table_info_t2_2.csv
table_info_latest.hql
table_info_latest.csv

 

 

 

Please find the NiFi flow and the RouteOnAttribute properties below:

NiFiDistributedFiles.PNG

 

 

RouteOnFileExtension.PNG

Is there any way to make the flow wait until PutHiveQL has executed, and only then signal PutHDFS to run, for each timestamp's files, one timestamp at a time, in order?

Can we group the files for each timestamp together, process the .hql file first, and then put the .csv files into HDFS?
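
To make the grouping idea concrete, here is a minimal Python sketch of what I mean by a "group". It assumes the names always follow the pattern shown in the list above (table name, then t1/t2/latest, then an optional part number); the regex and the function names `group_key` and `plan` are made up for illustration and are not part of NiFi.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred from the file list above:
#   <table>_<timestamp>[_<part>].<hql|csv>
NAME_RE = re.compile(
    r"^(?P<table>.+?)_(?P<stamp>t\d+|latest)(?:_(?P<part>\d+))?\.(?P<ext>hql|csv)$"
)

def group_key(filename):
    """Return the shared key for all files of one timestamp, e.g. 'table_info_t1'."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename}")
    return f"{m['table']}_{m['stamp']}"

def plan(filenames):
    """Group files by timestamp key; inside each group the .hql file comes first."""
    groups = defaultdict(list)
    for name in filenames:
        groups[group_key(name)].append(name)
    # False sorts before True, so names ending in .hql sort ahead of the .csv files.
    return {
        key: sorted(names, key=lambda n: (not n.endswith(".hql"), n))
        for key, names in groups.items()
    }
```

In the flow itself, a key like this could probably be computed into a FlowFile attribute and used to tie each .hql file to its .csv files, but that part is my assumption.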

 

 

@Nifi

1 ACCEPTED SOLUTION

Contributor

We solved this with the Wait and Notify processors. We route each .hql file to PutHiveQL, which in turn routes its success/failure relationships to Notify. The Wait processor holds the .csv files and releases them to PutHDFS once the signal from the Notify processor arrives.
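
For anyone who wants the idea without the flow screenshot, here is a rough Python analogy of the hold-and-release behaviour. It is not NiFi code and does not reproduce how Wait and Notify are implemented; the class `WaitNotifySketch` and the signal id `table_info_t1` are hypothetical names used only to show the ordering guarantee.

```python
from collections import defaultdict

class WaitNotifySketch:
    """Toy model: .csv files are held until their group's .hql has been signalled."""

    def __init__(self):
        self.signals = set()              # signal ids that have been notified
        self.waiting = defaultdict(list)  # signal id -> .csv files still held
        self.released = []                # .csv files allowed on to the HDFS step

    def wait(self, signal_id, csv_name):
        # Plays the role of Wait: pass the file on only if the signal already exists.
        if signal_id in self.signals:
            self.released.append(csv_name)
        else:
            self.waiting[signal_id].append(csv_name)

    def notify(self, signal_id):
        # Plays the role of Notify: called after the .hql step has finished.
        self.signals.add(signal_id)
        self.released.extend(self.waiting.pop(signal_id, []))
```

Each .csv file calls `wait` with the id of its timestamp group, and the .hql branch calls `notify` with the same id once the Hive step is done, so no .csv file is released before its .hql file has run.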

