Member since
11-25-2019
5
Posts
3
Kudos Received
0
Solutions
12-10-2020
07:03 AM
Tim, We have solved this by the combined efforts of Cloudera Support, Microsoft Support, you and us. GET_COLUMNS takes 4 arguments: catalogName = IMPALA schemaName = DB_NAME tableName = TABLE_NAME columnName = …. The way we had configured the external table in MS SQL Polybase 2019, only submitted the IMPALA (default) and TABLE_NAME. Ss soon as we entered the full string: IMPALA.DB_NAME.TABLE_NAME…. the run time reduced with 90 pct. Thanks you so much for your contribution. Best regards Henrik
... View more
11-04-2020
07:54 AM
Tim, Thank you so much. We have been migrated from CDH 5 to CDH 6 (since CDH 5 is anyway closed down). We will try investigate using your hints.
... View more
10-15-2020
07:46 AM
I managed to get some more logging details. To me this looks like a real query, not reading from any cache. I1015 15:49:44.158821 24416 impala-hs2-server.cc:591] GetColumns(): request=TGetColumnsReq { 01: sessionHandle (struct) = TSessionHandle { 01: sessionId (struct) = THandleIdentifier { 01: guid (string) = "\xcd\xc8\xb1\xe8\xf0\xa3H[\x82\xf1\"u\xe0~SP", 02: secret (string) = "\xfd\x81y\xa1\xf3*A\xb0\xae\x1e\a\xd7\f\xa3\xf0I", }, }, 04: tableName (string) = "mi\\_report\\_workflow\\_fact", 05: columnName (string) = "%", } I1015 15:49:44.638825 24416 impala-server.cc:981] Query f94c41fd14e503dd:10ba472900000000 has timeout of 5m I1015 15:49:44.638913 24416 impala-hs2-server.cc:604] GetColumns(): return_val=TGetColumnsResp { 01: status (struct) = TStatus { 01: statusCode (i32) = 0, }, 02: operationHandle (struct) = TOperationHandle { 01: operationId (struct) = THandleIdentifier { 01: guid (string) = "\xdd\x03\xe5\x14\xfdAL\xf9\x00\x00\x00\x00)G\xba\x10", 02: secret (string) = "\xdd\x03\xe5\x14\xfdAL\xf9\x00\x00\x00\x00)G\xba\x10", }, 02: operationType (i32) = 6, 03: hasResultSet (bool) = true, }, } I1015 15:49:44.640828 24416 impala-hs2-server.cc:699] CloseOperation(): query_id=f94c41fd14e503dd:10ba472900000000 I1015 15:49:44.640846 24416 impala-server.cc:992] UnregisterQuery(): query_id=f94c41fd14e503dd:10ba472900000000 I1015 15:49:44.640856 24416 impala-server.cc:1075] Cancel(): query_id=f94c41fd14e503dd:10ba472900000000
... View more
10-14-2020
08:24 AM
Tim, Thanks! There's no INVALIDATE METADATA. I have the profile, but it does not say so much: Query (id=fb4068c90a80e5dd:33fefd200000000):
Summary:
Session ID: 4944df8a0226e9cf:2659b7a91151868c
Session Type: HIVESERVER2
HiveServer2 Protocol Version: V6
Start Time: 2020-10-14 16:48:25.334065000
End Time: 2020-10-14 16:48:25.840527000
Query Type: DDL
Query State: FINISHED
Query Status: OK
Impala Version: impalad version 2.11.0-cdh5.14.0 RELEASE (build d68206561bce6b26762d62c01a78e6cd27aa7690)
User: SVC9125
Connected User: SVC-SQLs-000499@QAONEADR.LOCAL
Delegated User: SVC9125
Network Address: 10.96.148.140:51260
Default Db: discovery_gfccmia
Sql Statement: GET_COLUMNS
Coordinator: ap-hdpdn10t.oneadr.net:22000
Query Timeline: 507.235ms
- First row fetched: 504.979ms (504.979ms)
- Unregister query: 506.469ms (1.490ms) I am not sure about if we have any non-standard configurations. I am a business user and will try to get this from the sys admin. I noticed we have the load_catalog_in_background set to recommended FALSE. But maybe this recommendation don't work for ODBC connections?
... View more
10-12-2020
12:40 AM
On the queries page: Queries: This page lists all running queries, plus any completed queries that are archived in memory. The size of that archive is controlled with the --query_log_size command line parameter I see GET_COLUMNS which should gather some metadata (tables and column names) from a relational store. Puzzling is that it runs for every query launched through the Cloudera Impala ODBC driver, but not when using the HUE interface. It takes around 500 ms for each query, and since thes metadata only changes daily, it would be nice to be able to cache the information. On the other hand, when running the same query through HUE Impala, it takes around 100 times more time (500 ms goes to 50 seconds) so these metadata definitely have a positive effect.
... View more
Labels:
- Labels:
-
Cloudera Data Platform (CDP)