I have a Hive query (submitted from NiFi) that performs an insert, and the Tez view in Ambari has reported it as running for more than 48 hours. It was preventing other operations from occurring, so I wanted to kill the query so the backlog could be processed.
I killed the associated YARN application. The Tez view in Ambari still shows the status as Running. Clicking through to the Application ID shows the status as Killed and the Final Status as Undefined. Clicking through to the DAG shows the status as Killed, and the only listed Vertex is also Killed. Total Tasks for the DAG is 1.
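For reference, the kill step described above can be sketched from the command line as follows. This is a minimal sketch, assuming the standard `yarn application` CLI is on the PATH. The helper names `yarn_command` and `run` and the application ID are hypothetical placeholders, not values from this cluster. By default it only prints the commands instead of executing them.

```python
import shlex
import subprocess


def yarn_command(action, app_id=None):
    """Build a `yarn application` command line as an argument list."""
    if action == "list":
        # List only the applications YARN currently considers running.
        return ["yarn", "application", "-list", "-appStates", "RUNNING"]
    if action in ("status", "kill"):
        if not app_id:
            raise ValueError("app_id is required for " + action)
        return ["yarn", "application", "-" + action, app_id]
    raise ValueError("unknown action: " + action)


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if dry_run:
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Hypothetical placeholder; substitute the Application ID shown in Ambari.
    app_id = "application_0000000000000_0001"
    run(yarn_command("list"))
    run(yarn_command("kill", app_id))
    run(yarn_command("status", app_id))
```

Checking the status afterwards only confirms what YARN itself reports for the application; it says nothing about what the Tez view displays.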
However, despite all this, the query is still marked as running in Tez, and other queries are blocked behind it. Even non-modifying queries such as a simple SELECT COUNT(*) are unable to run. This has persisted across restarts of YARN, Hive, Tez, and even the entire machine.
Is there some way to kill off the operations that Tez is showing? Killing the YARN application doesn't seem to be sufficient.
This board does not seem to be very helpful to anyone experiencing issues with the Hortonworks tutorials. On a few occasions I have received meaningful assistance from others, but most of the issues raised receive only a cursory response if any.
I believe this is related to the following bug (Link), which was fixed in specific versions of Tez.
If you are running Interactive Query on HDInsight in Azure, it still uses Tez 0.7.0 (not fixed).
I did, however, find this blog post that explains how to at least stop the queries from running. As far as I can tell, Tez will still show them as running, but it does free up your resources again.
Edit: the follow-up link is probably helpful 🙂 Fix