Following the documentation here to set up and close a connection from Python in CDSW via impala.dbapi, I've found that the connection seems to remain open: I'm still able to use it for a query after I've run
connection.close()
What needs to be done to actually close the connection? The cluster admin insists that even when the session closes, the connections remain open and take up resources.
Here's the test:
# Python 2
from impala.dbapi import connect
import pandas as pd

# Set up the Impala connection in conn_imp
conn_imp = connect(host='phxhadoopp08.swift.com', port=21050, auth_mechanism='GSSAPI')

# Read SQL through conn_imp
df = pd.read_sql('SELECT * FROM prod_ba.dart_pred_hst LIMIT 10', conn_imp)
print df.shape

# Close the connection
conn_imp.close()

# Read SQL through conn_imp again to see whether it is really closed
df = pd.read_sql('SELECT * FROM prod_ba.dart_pred_hst LIMIT 10', conn_imp)
print df.shape
And here's a screenshot of the results:
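
For comparison, this is the behaviour I expected after close(), based on the DB-API spec (PEP 249), which says any operation on a closed connection should raise an Error. The sketch below uses the standard-library sqlite3 module purely as a stand-in (an assumption for illustration: it is not Impala and shows nothing about how impala.dbapi handles its own transport), and the helper name query_after_close_fails is hypothetical:

```python
from __future__ import print_function  # keeps the sketch valid on Python 2 and 3

import sqlite3
from contextlib import closing


def query_after_close_fails(conn):
    """Close `conn`, then return True if a further query raises a DB-API error."""
    conn.close()
    try:
        conn.cursor().execute('SELECT 1')
    except sqlite3.Error:
        # PEP 249: a closed connection should be unusable from this point on
        return True
    return False


# Stand-in connection (assumption): an in-memory SQLite database, not Impala
conn = sqlite3.connect(':memory:')

# closing() guarantees the cursor is closed even if the query raises
with closing(conn.cursor()) as cur:
    cur.execute('SELECT 1')
    rows = cur.fetchall()
print(rows)

closed_properly = query_after_close_fails(conn)
print(closed_properly)
```

With sqlite3 the second query fails once the connection is closed. In my test above, the same kind of query against conn_imp still succeeds after conn_imp.close(), which is what I don't understand.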