- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Connections to CDP COD via python phoenixdb very slow
Created ‎05-12-2023 03:21 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I've been testing the phoenixdb library in Python with CDP Operational DataStore cluster. Following the example here:
I can make a connection and return a single row but the connection is very slow to establish and execute. I'm running phoenixdb from the python interpreter on the COD cluster gateway host. The timestamped python execution is below.
Python 3.6.8 (default, Nov 16 2020, 16:55:22)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
>>> import phoenixdb
>>> from datetime import datetime
>>> opts = {}
>>> opts["authentication"] = "BASIC"
>>> opts["serialization"] = "PROTOBUF"
>>> opts["avatica_user"] = "workload-username"
>>> opts["avatica_password"] = "workload-password"
>>> jbdc_connection_url = "https://cod--<clusterID>-gateway0.<envID>.<custID>.cloudera.site/cod--<clusterID>/cdp-proxy-api/avatica/"
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:54:45
>>> conn = phoenixdb.connect(jbdc_connection_url, autocommit=True, **opts)
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:54:51
>>> sql_command = "SELECT * FROM DBNAME.TABLENAME WHERE CUSTID = ?"
>>> data = ("xxxxxxxx", )
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:54:51
>>> cursor = conn.cursor()
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:54:51
>>> cursor.execute(sql_command, data)
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:55:00
>>> results_list = cursor.fetchall()
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:55:00
>>> conn.close()
>>> print(datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
2023-05-12 09:55:06
>>> results_list
[['xxxxxxxx', '...', '...', ...]]
You can see that it takes 6 seconds to establish the connection to Phoenix. It takes 9 second to execute the SQL command and another 6 seconds to close the connection.
Any thoughts on why this might be the case and if I can make changes to address this? Thanks.
Created ‎05-17-2023 08:23 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @stephen_obrien As discussed via Cloudera case, there is a performance bottleneck when connecting via knox ( tracked in internal jira ) than directly from phoenix-sqlline from the edge node.
You can test the same when the runtime version 7.2.17 is released.
Created ‎05-17-2023 08:23 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @stephen_obrien As discussed via Cloudera case, there is a performance bottleneck when connecting via knox ( tracked in internal jira ) than directly from phoenix-sqlline from the edge node.
You can test the same when the runtime version 7.2.17 is released.
