
PutHiveQL Data Provenance Expire

I have a ReplaceText -> PutHiveQL flow. Once a flow file gets to PutHiveQL, nothing happens. The Data Provenance event type shows as EXPIRE. Please advise on why this is happening. Thanks for your time.

@Jodie Tan

EXPIRE Indicates a provenance event for the conclusion of an object’s life due to the object not being processed in a timely manner.

If your flowfile was routed to the retry relationship, Data Provenance will show it with type EXPIRE: the object was not processed in a timely manner, so it was routed to retry.

Please refer to the link below for more details about all Data Provenance event types:

https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data_provenance

Example:-

I have a ReplaceText processor that produces a simple insert statement.

Case1:-

insert into default.existed_table values(1,'a','b');

In this case the insert statement runs successfully and the Data Provenance type is SEND.

Case2:-

If you run the same insert statement against a table that does not exist, the flowfile is routed to the retry relationship and the Data Provenance type is EXPIRE.

insert into default.non_exist_table values(1,'a','b');

[Screenshot: 62933-puthiveql.png]

The provenance list in the screenshot shows both event types: EXPIRE (the flowfile was routed to the retry relationship) and SEND (the query executed successfully).

If this answer helped resolve your issue, click the Accept button below to accept it. That helps other community users find the solution quickly for this kind of issue.

@Jodie Tan A flow file that the NiFi flow cannot process is marked as EXPIRED. Expiration can be configured on a connection (the FlowFile Expiration setting), so that flow files are automatically marked as EXPIRED after X amount of time if they are not processed.
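To make the expiration rule concrete, here is a minimal sketch in Python of how a time-based expiration check could be reasoned about. It is a simplified model, not NiFi's actual code: the function name `is_expired` and its parameters are hypothetical. The one detail taken from NiFi itself is that the default expiration of "0 sec" means flow files never expire.

```python
from datetime import datetime, timedelta


def is_expired(entered_flow_at, now, expiration):
    """Simplified model: would a flow file of this age be expired?

    entered_flow_at: when the flow file entered the flow (hypothetical input)
    now:             the time at which the connection is checked
    expiration:      the configured expiration period as a timedelta
    """
    # A zero period mirrors NiFi's default "0 sec": never expire.
    if expiration == timedelta(0):
        return False
    # Otherwise the flow file expires once it is older than the period.
    return now - entered_flow_at > expiration


entered = datetime(2018, 1, 1, 12, 0, 0)
checked = entered + timedelta(minutes=90)

print(is_expired(entered, checked, timedelta(hours=1)))   # 90 min old, 1 hour limit
print(is_expired(entered, checked, timedelta(hours=2)))   # 90 min old, 2 hour limit
print(is_expired(entered, checked, timedelta(0)))         # expiration disabled
```

If flow files waiting in front of PutHiveQL (for example, looping on retry) sit longer than the period configured on that connection, this is the kind of check that would turn them into EXPIRE events.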

You mentioned that the flow files reach the PutHiveQL processor but nothing happens after that. There can be multiple reasons for this; here are a couple of them.

1. If you have a multi-node NiFi cluster, make sure your PutHiveQL processor is set to run on "All Nodes". You can verify this by right-clicking the processor -> Configure -> Scheduling tab. Problems can arise if your flow files are generated on, say, machine B but PutHiveQL runs only on your Primary Node, which is, say, machine A.

2. Is your connection working fine? Can you see anything in the "Bulletin"? The "Bulletin" is the error log displayed when you hover over the red rectangular icon at the top right of the processor, if one is present. One possibility is that the HQL execution is failing and you have routed the "retry" relationship back to the processor.

PS - Can you please share some more details on the "fate" of the flow files arriving at your PutHiveQL processor? That may help with further debugging, if needed.

Thanks!
