Member since
10-16-2013
307
Posts
77
Kudos Received
59
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11279 | 04-17-2018 04:59 PM |
| | 6223 | 04-11-2018 10:07 PM |
| | 3572 | 03-02-2018 09:13 AM |
| | 22344 | 03-01-2018 09:22 AM |
| | 2672 | 02-27-2018 08:06 AM |
06-14-2017
06:01 PM
Hi Shannon, Impala does not yet support array or map subscripts. What you can do is something like this:
select a.value from mytable t, t.myarray a where a.pos in (0,1)
But this will give you the "values" elements as rows and not in different columns.
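If the elements are needed side by side, one option is to pivot the returned rows on the client. The following is a minimal Python sketch, assuming the query is extended to also return a row identifier and a.pos; the id column, the sample rows, and the helper name pivot_by_pos are hypothetical and not part of Impala.

```python
from collections import defaultdict


def pivot_by_pos(rows, positions=(0, 1)):
    """Turn (row_id, pos, value) tuples into one record per row_id.

    Each requested position becomes its own column; positions that are
    missing for a given row_id are filled with None.
    """
    by_id = defaultdict(dict)
    for row_id, pos, value in rows:
        if pos in positions:
            by_id[row_id][pos] = value
    return {
        row_id: tuple(found.get(p) for p in positions)
        for row_id, found in by_id.items()
    }


# Hypothetical result set: (id, pos, value) for each array element.
rows = [(1, 0, "a"), (1, 1, "b"), (2, 0, "c")]
print(pivot_by_pos(rows))
```

Rows whose array is shorter than the requested positions simply get None in the missing columns.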
06-12-2017
09:25 PM
I believe we've found and fixed the root cause of the spinning thread here: IMPALA-5056
06-12-2017
10:49 AM
I'm inclined to say that yes, there will be a difference. One or two example queries showing the alternatives would help me give you a more accurate response.
06-09-2017
07:08 AM
You need to give your "equipment.location.name" an alias that is a valid column name. For example, create view ... select equipment.location.name as name ....
05-03-2017
06:08 PM
1 Kudo
Some thoughts on your question:
- Hive is more flexible in terms of the data formats it can scan.
- You may find Hive to be more feature-rich in terms of SQL language support and built-in functions.
- Hive will most likely complete your query even if there are node failures, which makes it suitable for long-running jobs; this is true for both Hive on MR and Hive on Spark.
- If Impala can run your ETL, then it will probably be faster.
- Impala will fail/abort a query if a node goes down during query execution.
- The last point may make Impala less suitable for long-running jobs, but there is also a shorter failure window because queries are faster, so Impala may very well suit your ETL needs if you can tolerate the failure behavior.

You may also find this article interesting: https://vision.cloudera.com/sql-on-apache-hadoop-choosing-the-right-tool-for-the-right-job/
05-02-2017
10:12 PM
That's an interesting scenario. Unfortunately, there is no elegant way to do what you ask in Impala's SQL. The query you really want to write is this one:
select * from (select *, parts from example ex order by value desc limit 2) v, v.parts
The issue is that Impala currently does not support returning complex types in the select-list of an inline view: https://issues.apache.org/jira/browse/IMPALA-2777
Note that '*' in Impala expands to all scalar-typed columns, so you need to list 'parts' explicitly.
04-27-2017
06:23 PM
1 Kudo
I believe you can append query options to the JDBC connection string like this: jdbc:impala://your_impalad.com:21050/default;UseNativeQuery=1;SET RUNTIME_FILTER_MODE=OFF; Alternatively, you should be able to run "SET RUNTIME_FILTER_MODE=OFF" as a query from JDBC to alter the default query options of that session.
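The second approach, issuing SET as an ordinary statement, can be sketched as follows. This uses Python with a DB-API 2.0 style cursor rather than JDBC, purely for illustration; the helper apply_query_options and the RecordingCursor stand-in are hypothetical names, and the sketch assumes the client passes SET statements through to Impala unchanged.

```python
def apply_query_options(cursor, options):
    """Issue one SET statement per option on an already open cursor."""
    for name, value in options.items():
        # Option names and values are expected to come from trusted code,
        # not from user input, since they are interpolated directly.
        cursor.execute(f"SET {name}={value}")


class RecordingCursor:
    """Stand-in cursor that records statements instead of running them."""

    def __init__(self):
        self.statements = []

    def execute(self, statement):
        self.statements.append(statement)


cursor = RecordingCursor()
apply_query_options(cursor, {"RUNTIME_FILTER_MODE": "OFF"})
print(cursor.statements)
```

With a real connection, the same helper would be called once after opening the session and before running the affected queries.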
04-24-2017
02:09 PM
My apologies for this unsightly issue and error message. You are running into an issue with a relatively new feature: runtime filters. We have identified and fixed the issue in later versions, see: https://issues.apache.org/jira/browse/IMPALA-4076 Please first check that the relevant tables have stats. As a workaround, you may either disable runtime filters: SET RUNTIME_FILTER_MODE=OFF; or you can increase the number of allowed filters per query: SET MAX_NUM_RUNTIME_FILTERS=100; The issue only occurs for complex queries where the number of runtime filters exceeds the per-query runtime filter budget (we sort all runtime filters and try to pick the best top-N, which is where this issue is happening).
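To make the budget idea concrete, here is a rough Python sketch of top-N selection. It is not Impala's planner code: the score attached to each candidate filter is a hypothetical stand-in for whatever ranking criterion the planner actually uses, and the names pick_filters, rf0, rf1, and rf2 are invented for illustration.

```python
def pick_filters(candidates, max_filters):
    """Keep the max_filters highest-scoring candidates.

    candidates is a list of (name, score) pairs; a higher score is
    assumed to mean a more useful filter.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return ranked[:max_filters]


# Three candidate filters competing for a budget of two.
candidates = [("rf0", 0.2), ("rf1", 0.9), ("rf2", 0.5)]
print(pick_filters(candidates, 2))
```

Raising the budget (the role MAX_NUM_RUNTIME_FILTERS plays in the workaround) means no candidate has to be dropped, so the selection step has nothing to cut.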
04-20-2017
11:14 PM
1 Kudo
Thanks for investigating. We've confirmed internally that the issue is related to Avro with many columns. 900 columns is somewhat wide. Thanks for reporting! We'll continue to look into this issue.