About Tim Armstrong

Tim Armstrong · ‎02-05-2018

Gzip decompression will definitely use more CPU than Snappy decompression, so I'd usually expect Gzip to give you worse performance, unless your query is limited by disk I/O (in which case smaller is better) or isn't limited by scan performance at all.

Tim Armstrong · ‎01-17-2018

@spurusothaman usually we go through a couple of steps to troubleshoot issues like this. The two most likely solutions are:
1. Give the query more memory by increasing mem_limit or reducing the number of concurrent queries.
2. Adjust the SQL by rewriting the query or adding hints to get a different query plan that avoids having so many duplicate values on the right side of the join.
Depending on the exact scenario, the solution might be 1, 2, or both. straight_join is only useful if you use it to force a plan with a different join order. If you want input on whether you have a bad plan and what a better join order might be, please provide a query profile.

Tim Armstrong · ‎01-11-2018

That is a good suggestion, I went ahead and did it.

Tim Armstrong · ‎01-11-2018

I was able to reproduce it myself on several versions of CDH. I filed a bug report to track it: https://issues.apache.org/jira/browse/IMPALA-6389 . Thank you very much for letting us know about this.

Tim Armstrong · ‎01-11-2018

@1stSolo thanks for the info, I'll look into it further. It definitely looks like a bug causing an Impala crash so I want to get to the bottom of it. Your workaround of using a different terminator should work.

Tim Armstrong · ‎01-10-2018

Thanks for letting us know about this and the clear steps. I wasn't able to reproduce the exact behaviour on my development version of Impala. What version of Impala are you seeing this in so that I can try to reproduce what you're seeing?

Tim Armstrong · ‎01-10-2018

Hi @Plop564, Thanks for the quality question. 1) It's also used by a lot of Impala builtin functions (we implement many with the exact same interface as UDFs). 2) It could be a bug in a UDF/UDAF. It's also possible that there's a bug in Impala where Impala isn't cleaning up the local allocations during some operation. E.g. in older versions if you ordered by a UDF that allocated memory, then a lot of that memory was only cleaned up at the end of the sort. It's possible that you could be allocating a lot of memory with FunctionContextImpl::Allocate() and Free() as well. The memory allocated in that way counts against the same limits as AllocateLocal() for the most part.

Tim Armstrong · ‎01-05-2018

@chophouse I wonder if the example I posted here would solve your problem: http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Why-not-from-unixtime-function-handles-an-unix-timestamp-in/m-p/63182#M3969

Tim Armstrong · ‎01-05-2018

I believe the original reason that it took seconds was for compatibility with MySQL's similarly-named function. You can convert a millisecond unix timestamp into an Impala timestamp more efficiently using integer division/mod and the "interval" operator as follows:

[localhost:21000] > select cast(1513895588243 div 1000 as timestamp) + interval (1513895588243 % 1000) milliseconds;
+------------------------------------------------------------------------------------------+
| cast(1513895588243 div 1000 as timestamp) + interval (1513895588243 % 1000) milliseconds |
+------------------------------------------------------------------------------------------+
| 2017-12-21 22:33:08.243000000                                                            |
+------------------------------------------------------------------------------------------+
Fetched 1 row(s) in 0.01s
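To sanity-check the same arithmetic outside Impala, here is a minimal Python sketch of the div/mod split. It is not part of the original answer: the function name `millis_to_timestamp` is hypothetical, and it assumes a non-negative input interpreted as UTC.

```python
from datetime import datetime, timedelta, timezone


def millis_to_timestamp(ms: int) -> datetime:
    """Convert a non-negative millisecond Unix timestamp to a UTC datetime.

    Hypothetical helper that mirrors the `div 1000` / `% 1000` split
    used in the query above; it is not an Impala API.
    """
    # Whole seconds and leftover milliseconds, like `div` and `%`.
    seconds, millis = divmod(ms, 1000)
    # Build the timestamp from whole seconds, then add the remainder.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base + timedelta(milliseconds=millis)


ts = millis_to_timestamp(1513895588243)
print(ts.strftime("%Y-%m-%d %H:%M:%S.%f"))  # 2017-12-21 22:33:08.243000
```

Working with integers keeps the millisecond part exact, which is the same reason the query above avoids dividing by 1000 as a floating-point value.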

Tim Armstrong · ‎12-14-2017

@Clouderaks I think it depends on the nature of the failures - if they're critical queries or just users messing around.

Member Since	‎07-29-2015 04:07 PM
Last Visited	‎02-11-2021 06:07 PM
Posts	535
Kudos received	141

Cloudera Community

Re: Impala Queries which were previously working a...

Re: Impala queries are not distributing to all the...

Re: impala - `recover partitions` points to old da...

Re: impala catalog server JVM

Re: Impala - On-demand metadata

Re: Recommended file size for Impala Parquet files...

Re: Memory limit exceeded cannot perform hash join

Re: Impala: "Cancelled due to unreachable impalad(...

Re: Impala: "Cancelled due to unreachable impalad(...

Re: Impala: "Cancelled due to unreachable impalad(...

Re: Impala: "Cancelled due to unreachable impalad(...

Re: FunctionContextImpl::AllocateLocal's allocatio...

Re: Get timestamp in milliseconds in Impala

Re: Why not from_unixtime() function handles an un...

Re: IMPALAD_QUERY_MONITORING_STATUS has become bad