Member since 10-16-2013
307 Posts
77 Kudos Received
59 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 11248 | 04-17-2018 04:59 PM
 | 6191 | 04-11-2018 10:07 PM
 | 3559 | 03-02-2018 09:13 AM
 | 22289 | 03-01-2018 09:22 AM
 | 2653 | 02-27-2018 08:06 AM
04-03-2017
04:44 PM
We are facing the same issue: we cannot compute table stats (we run several large tables). Is upgrading to Impala 2.8 the only fix? We are running Cloudera 5.9; will upgrading to Impala 2.8 cause any issues? I would think computing table stats on large tables is a common workflow for most clients. Is it possible to get a patch for this on Impala 2.7? Thanks
03-21-2017
12:08 AM
Alex, thank you again. The subquery approach has been recommended to our team as a long-term solution. However, as a short-term solution that avoids regression impact, we have chosen a view with limited partitions. If I remember correctly, in MySQL the data of `table A` can be limited by the `ON` clause before joining, so that the candidates for the join are reduced. Thank you for your valuable comment. Gatsby
02-23-2017
06:47 PM
Thomas, you have a legitimate request and concern. First, there is no perfectly fool-proof solution, because the resource consumption depends partly on what happens at runtime, and not all memory consumption is tracked by Impala (but most is). We are constantly making improvements in this area, though.

1. I'd recommend fixing num_scanner_threads for your queries. A different number of scanner threads can result in different memory consumption from run to run (and it also depends on what else is going on in the system at the time).
2. The operators of a query do not run one by one. Some of them run concurrently (e.g. join builds may execute concurrently), so just looking at the highest peak in the exec summary is not enough. Taking the sum of the peaks over all operators is a safer bet, but tends to overestimate the actual consumption.

Hope this helps!
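To make the second point concrete, here is a minimal sketch of the two estimates in Python. The operator names and peak values are made up for illustration and are not taken from any real exec summary; the helper names are hypothetical as well.

```python
# Hypothetical per-operator peak memory in MiB, as one might read it
# from the "Peak Mem" column of an exec summary. All values are invented.
peak_mem_mib = {
    "00:SCAN HDFS": 512,
    "01:SCAN HDFS": 256,
    "02:HASH JOIN": 1024,
    "03:AGGREGATE": 128,
}


def highest_peak(peaks):
    """Optimistic estimate: assumes operators never overlap in time."""
    return max(peaks.values())


def sum_of_peaks(peaks):
    """Conservative estimate: assumes every operator peaks at the same time."""
    return sum(peaks.values())


optimistic = highest_peak(peak_mem_mib)
conservative = sum_of_peaks(peak_mem_mib)
print(f"highest single peak: {optimistic} MiB")
print(f"sum of all peaks:    {conservative} MiB")
```

Because some operators do run concurrently, the real consumption will usually sit somewhere between the two figures, which is why the sum is the safer number to plan with.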
02-22-2017
01:43 PM
I just saw this thread after commenting on the Jira. Would "conv()" be a suitable workaround here?

select conv('100010', 2, 10);
+-----------------------+
| conv('100010', 2, 10) |
+-----------------------+
| 34                    |
+-----------------------+
Fetched 1 row(s) in 0.24s

More information on conv() can be found in the Impala documentation. Edit: To make things complete, the Jira is IMPALA-4968.
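For anyone who wants to cross-check the result outside Impala, the same base conversion can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Impala's implementation; the Python function `conv` below is a hypothetical helper that handles non-negative inputs only.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def conv(num: str, from_base: int, to_base: int) -> str:
    """Convert a non-negative number string from one base to another."""
    if not 2 <= to_base <= 36:
        raise ValueError("to_base must be between 2 and 36")
    value = int(num, from_base)  # raises ValueError on invalid digits
    if value < 0:
        raise ValueError("negative inputs are not handled in this sketch")
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, rem = divmod(value, to_base)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


print(conv("100010", 2, 10))  # binary 100010 = 32 + 2
```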
02-02-2017
10:07 PM
@gaurang would you be open to sharing your CREATE TABLEs, CREATE VIEW and the query that has slow planning time? No need for the data, just that should be sufficient for us to understand better what's going on. Like Lars said, you are probably hitting IMPALA-4242 which explains the slow equivalence class computation, but I'd also like to understand the slow single-node planning time. Thanks!
01-31-2017
04:36 PM
FYI, `COMPUTE STATS` can run on a first-level partition. https://issues.cloudera.org/browse/IMPALA-1570
01-26-2017
02:13 PM
Thanks again, and please be aware the incorrect text is also found here: https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_perf_hdfs_caching.html "When data is requested to be pinned in memory, that process happens in the background without blocking access to the data while the caching is in progress. Loading the data from disk could take some time. Impala reads each HDFS data block from memory if it has been pinned already, or from disk if it has not been pinned yet. When files are added to a table or partition whose contents are cached, Impala automatically detects those changes and performs a REFRESH automatically once the relevant data is cached."
01-17-2017
05:42 PM
Yes. Use a spark-hbase-connector.
01-17-2017
05:23 PM
In Impala, a table is created with the `CREATE TABLE` statement. The `PARTITIONED BY` clause partitions the data files based on the values of one or more specified columns.