Member since: 10-16-2013
Posts: 307
Kudos Received: 77
Solutions: 59
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11157 | 04-17-2018 04:59 PM |
| | 6115 | 04-11-2018 10:07 PM |
| | 3523 | 03-02-2018 09:13 AM |
| | 22121 | 03-01-2018 09:22 AM |
| | 2633 | 02-27-2018 08:06 AM |
09-07-2016
05:58 PM
It looks like your table metadata is in a strange state. How did you create the table exactly? Did you alter the table (e.g. add/remove columns)?
09-07-2016
04:58 PM
Just realized the docs do not yet have info on manually setting column stats. Here's the JIRA that enabled that feature; you can find the syntax in the comments: https://issues.cloudera.org/browse/IMPALA-3369 Feel free to ask questions if you are considering this workaround 🙂
09-07-2016
04:54 PM
Sorry this is causing you so much pain. One workaround is to update the table and column stats manually. The main idea is this: you can cut down the time for computing stats significantly by manually computing and setting the stats for only those columns that actually need them. The columns used in predicates (including join predicates) should have stats. The column stats can be updated relatively infrequently. Setting the column stats manually is a relatively new feature. The table stats can also be computed and set manually, so, for example, if you've just added a new partition you can do a count(*) on that partition and set the #rows manually. You can find more info in the docs here: http://www.cloudera.com/documentation/enterprise/5-7-x/topics/impala_perf_stats.html#perf_stats
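As a rough sketch of what this workaround could look like: the table `sales`, its partition column `year`, the column `customer_id`, and all numeric values below are hypothetical placeholders, and the SET COLUMN STATS statement assumes an Impala release that includes IMPALA-3369. Please verify the exact syntax against your version before relying on it.

```sql
-- Hypothetical table/column names and placeholder values throughout.

-- 1. Count the rows in the newly added partition.
SELECT COUNT(*) FROM sales WHERE year = 2016;

-- 2. Set the row count for that partition manually,
--    using the number returned by step 1.
ALTER TABLE sales PARTITION (year = 2016)
  SET TBLPROPERTIES ('numRows' = '1000000',
                     'STATS_GENERATED_VIA_STATS_TASK' = 'true');

-- 3. Estimate the number of distinct values for a column
--    that is used in predicates or join conditions.
SELECT NDV(customer_id) FROM sales;

-- 4. Set the column stats manually, using the estimate from step 3
--    (assumes a release that includes IMPALA-3369).
ALTER TABLE sales SET COLUMN STATS customer_id
  ('numDVs' = '50000', 'numNulls' = '0');

-- 5. Check what the planner will see.
SHOW TABLE STATS sales;
SHOW COLUMN STATS sales;
```

Only the columns that appear in predicates need steps 3 and 4, which is where the time savings over a full stats computation would come from.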
08-26-2016
07:06 AM
If you want to delete a whole database you can: drop database <dbname> cascade; That will drop the database and all its tables.
08-01-2016
05:41 PM
Did you enable short-circuit reads via these configurations? http://www.cloudera.com/documentation/enterprise/latest/topics/admin_hdfs_short_circuit_reads.html
07-29-2016
12:09 PM
1 Kudo
Filed the JIRA: https://issues.cloudera.org/browse/IMPALA-3938 Thanks a lot for your detailed report and easy reproduction!
07-29-2016
11:57 AM
Still investigating and working on the JIRA. In the meantime, I think the query that you mean is this:
select t.id, l.pos as location_number, m.key, m.value from mytable t join t.locations l join l.item m order by id, l.pos;
The bug is that the queries returning wrong results are not semantically correct, but Impala runs them anyway and gives "arbitrary" results.
07-26-2016
01:57 PM
Hi Thomas, in your example query the table alias 'a' has a 'pos' pseudo-column that refers to the element position within the ARRAY, so I think it's indeed what you want. However, I think there is a bug here that may lead to confusion. The query you wrote "should" be illegal because you should not be able to access a.key and a.value without referencing the nested map in the FROM clause. I will follow up with a JIRA, stay tuned.
Can you try running this and see if you get the expected results?
select c.id, a.pos from mytable c left join c.`location` a order by c.id, a.pos;
Alternatively, try this query if you also want to explode the items within the nested MAP:
select c.id, a.pos, a.key, a.value from mytable c left join c.`location` a, a.item order by c.id, a.pos;
07-15-2016
09:29 PM
Please take a look at this thread for a response: http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Comma-delimited-string-to-individual-rows/m-p/41402#M1781
07-15-2016
09:12 AM
2 Kudos
Hi! Good question. Today, Impala is not aware of the heterogeneity and will split the work evenly among all available nodes - regardless of how much cpu/memory those nodes have.