About dsss

dsss · ‎02-01-2017

Thanks @thewayofthinkin. I'm aware of the cancel button, but in a production environment, where a massive number of queries can be in the EXCEPTION state, canceling queries manually is not the right solution. I wonder whether it can be fixed by configuration or a script. Anyway, it looks like a bug to me. Has anyone experienced this issue? If so, I would like to hear what actions you took. Thanks, Dror

dsss · ‎01-26-2017

Thank you @saranvisa @mathieu.d

dsss · ‎01-26-2017

Hey,
I'm investigating memory usage for queries and noticed the following parameters (user_name refers to the query initiator):
- admission-controller.agg-mem-reserved.root.user_name
- admission-controller.local-backend-mem-reserved.root.user_name
- admission-controller.local-backend-mem-usage.root.user_name
- admission-controller.local-mem-admitted.root.user_name
Can someone explain these parameters? The explanation in the Impala documentation is not enough.
In addition, it would be great if someone could guide me on how to monitor the actual memory usage of queries. For example: for each node, the size in bytes of the query fragment, and the total memory usage across the whole cluster for a specific query.
Many thanks, Dror

dsss · ‎01-25-2017

Hey all, I'm using Impala 2.2.0, CDH 5.4.3. Sometimes queries that were canceled by the user still appear in the daemon web UI under the "Queries in flight" tab. These queries are in the EXCEPTION state, and when I look at their profile, I can see that the status is "Cancelled". Sounds great, BUT when I look deeper at the memory usage for the specific query, it looks like the cancel command didn't free the memory. With massive queries, this issue harms concurrency, and the coordinator host becomes a bottleneck (as you can see in the attached graph). Any ideas? Thanks, Dror

dsss · ‎01-19-2017

Hey all,
Our production environment uses Impala 2.2.0, CDH 5.4.3. Does Impala have a function to transpose columns to rows? Currently, in order to do so, I need to run separate queries, each selecting one specific column, and union them. Because the source table is huge, this solution is not good at all. Any ideas?
Here's a simple example:

-- Source table
SELECT * FROM ( SELECT 'a' AS a, 'b' AS b, 'c' AS c, 'd' AS d ) tmp

-- Desired output
SELECT a AS selected_value FROM ( SELECT 'a' AS a, 'b' AS b, 'c' AS c, 'd' AS d ) tmp
UNION ALL
SELECT b AS selected_value FROM ( SELECT 'a' AS a, 'b' AS b, 'c' AS c, 'd' AS d ) tmp
UNION ALL
SELECT c AS selected_value FROM ( SELECT 'a' AS a, 'b' AS b, 'c' AS c, 'd' AS d ) tmp
UNION ALL
SELECT d AS selected_value FROM ( SELECT 'a' AS a, 'b' AS b, 'c' AS c, 'd' AS d ) tmp

Thanks! Dror
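A possible single-pass alternative, sketched here under the assumption that CROSS JOIN and CASE behave on Impala 2.2.0 as they do in standard SQL: join the source to a small helper view with one row per column to transpose, then pick the column with CASE. The helper view idx and its column n are hypothetical names, and the snippet has not been run against this cluster.

```sql
-- Unpivot sketch: the source appears once and is multiplied by a 4-row helper view.
-- idx and n are illustrative names, not part of the original example.
SELECT
  CASE idx.n
    WHEN 1 THEN tmp.a
    WHEN 2 THEN tmp.b
    WHEN 3 THEN tmp.c
    WHEN 4 THEN tmp.d
  END AS selected_value
FROM (
  SELECT 'a' AS a, 'b' AS b, 'c' AS c, 'd' AS d
) tmp
CROSS JOIN (
  SELECT 1 AS n
  UNION ALL SELECT 2
  UNION ALL SELECT 3
  UNION ALL SELECT 4
) idx;
```

The source table appears once in the FROM clause instead of once per column; whether that yields a real speedup on a huge table would need to be checked in the query profile.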

dsss · ‎01-19-2017

Many thanks Henry! Last question: when using joins, does the runtime filter work only on INNER JOIN? When I checked the profile output after performing a LEFT JOIN, I didn't see any mention of a runtime filter. Thanks

dsss · ‎01-17-2017

Hey,
Thanks for the quick response. How can I check the actual number of partitions that were read? I read the profile output for this query, but there is too much information there. Here's the output of the explain statement for the "explain 1" step. It seems like the runtime filter is enabled.

Estimated Per-Host Requirements: Memory=2.03GB VCores=2
WARNING: The following tables are missing relevant table and/or column statistics.
adb.test_1, adb.test_prt

04:EXCHANGE [UNPARTITIONED]
|
02:HASH JOIN [LEFT SEMI JOIN, BROADCAST]
|  hash predicates: t.day = day
|  runtime filters: RF000 <- day
|
|--03:EXCHANGE [BROADCAST]
|  |
|  01:SCAN HDFS [adb.test_1]
|     partitions=1/1 files=1 size=9B
|
00:SCAN HDFS [adb.test_prt t]
   partitions=3/3 files=3 size=12B
   runtime filters: RF000 -> t.day

Thanks!

dsss · ‎01-17-2017

Hey all,
I'm using CDH 5.9, Impala 2.7. I'm examining the runtime filter feature, but it's not working as I expected. Here's an example of my case:

-- Table 1 - Partitioned by day
CREATE TABLE adb.test_prt (ID string) PARTITIONED BY (day INT);
INSERT INTO adb.test_prt PARTITION(day)
SELECT 'a' AS ID, 20170102 AS day UNION ALL
SELECT 'b' AS ID, 20170102 AS day UNION ALL
SELECT 'c' AS ID, 20170102 AS day UNION ALL
SELECT 'd' AS ID, 20170103 AS day UNION ALL
SELECT 'd' AS ID, 20170105 AS day UNION ALL
SELECT 'g' AS ID, 20170105 AS day;
SELECT * FROM adb.test_prt;
SHOW PARTITIONS adb.test_prt;

-- Table 2 - raw data
CREATE TABLE adb.test_1 (day INT);
INSERT INTO adb.test_1 SELECT 20170102 AS day;
SELECT * FROM adb.test_1;

--###################################
-- explain 1
EXPLAIN SELECT * FROM adb.test_prt t WHERE t.day IN (SELECT day FROM adb.test_1);
output: partitions=3/3 files=3 size=12B

-- explain 2
EXPLAIN SELECT * FROM adb.test_prt t WHERE t.day IN (20170102);
output: partitions=1/3 files=1 size=6B

I don't understand why there is a difference between the outputs. Table adb.test_1 has only one value, which matches a specific partition in adb.test_prt. I expected the runtime filter to figure this out. What am I missing?

Another question: does this feature support joins as well, rather than only the WHERE clause? Here's an example:

EXPLAIN SELECT * FROM adb.test_prt t INNER JOIN adb.test_1 a ON t.day = a.day;
output: partitions=3/3 files=3 size=12B runtime filters: RF000 -> t.day

Thanks! Dror
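EXPLAIN shows the plan produced before execution, so partitions=3/3 there reflects static planning only; runtime filters are applied while the query runs. A minimal sketch for inspecting the runtime behavior from impala-shell, assuming the RUNTIME_FILTER_MODE and RUNTIME_FILTER_WAIT_TIME_MS query options of Impala 2.5 and later; the wait value is illustrative, not a recommendation:

```sql
-- Query options (assumed to be available on this Impala 2.7 cluster).
SET RUNTIME_FILTER_MODE=GLOBAL;
SET RUNTIME_FILTER_WAIT_TIME_MS=10000;  -- illustrative value only

-- Run the query itself, not only EXPLAIN, so that the filter is actually built and applied.
SELECT * FROM adb.test_prt t WHERE t.day IN (SELECT day FROM adb.test_1);

-- impala-shell commands that report on the most recently executed query.
SUMMARY;
PROFILE;
```

The scan node section for adb.test_prt in the PROFILE output is the place to look for what was actually read, as opposed to the static partitions=3/3 estimate in the plan.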

Last Visited	‎04-05-2017 03:00 AM

Member Since	‎11-07-2016 12:40 AM
Posts	9

Cloudera Community

Re: queries in flight issue

Re: Transpose columns to rows

Impala query memory usage

queries in flight issue

Transpose columns to rows

Re: Impala runtime filter not working as expected

Re: Impala runtime filter not working as expected

Impala runtime filter not working as expected