We have noticed an ongoing, intermittent issue where Impala queries receive this type of message:
Query Status: Couldn't open transport for <hostname>:22000 (SSL_connect: Connection reset by peer)
We are running CDH 5.7.3, and the impalad version is 2.5.0.
When we see this, I look at the web UI for the impalad host, and I usually see a query that is in the "CREATED" state but is not running. Typically these queries are from days before.
I also notice that the Last Event indicates something like "Ready to start 47 remote fragments". I try to cancel the query (especially if it is 2 or 3 days old), but I cannot cancel it and get this message: Error: Query not yet running
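To make the manual check concrete, here is a rough sketch of how such stale queries could be flagged. It assumes the query list from the web UI has already been copied into Python dicts. The field names (`query_id`, `state`, `start_time`), the sample values, and the one-day threshold are all hypothetical and are not taken from Impala's actual output.

```python
from datetime import datetime, timedelta


def find_stale_queries(queries, now, max_age=timedelta(days=1)):
    """Return queries still in the CREATED state that started more than max_age ago."""
    stale = []
    for q in queries:
        age = now - q["start_time"]
        if q["state"] == "CREATED" and age > max_age:
            stale.append(q)
    return stale


# Hypothetical sample data; the record shape is an assumption for illustration.
now = datetime(2016, 11, 10, 12, 0, 0)
queries = [
    {"query_id": "a1", "state": "CREATED", "start_time": datetime(2016, 11, 7, 9, 30, 0)},
    {"query_id": "b2", "state": "RUNNING", "start_time": datetime(2016, 11, 7, 9, 30, 0)},
    {"query_id": "c3", "state": "CREATED", "start_time": now - timedelta(minutes=5)},
]

for q in find_stale_queries(queries, now):
    print(q["query_id"], "has been in CREATED since", q["start_time"])
```

Only queries that are both in "CREATED" and older than the threshold are reported, so a query that was submitted a few minutes ago and is legitimately waiting to start is not flagged.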
It seems the only way to clear the query is to restart the impalad node. That seems like a bad way to resolve this issue.
Has anyone faced this issue before, and does anyone have any thoughts or suggestions?