Member since: 10-16-2013
Posts: 307
Kudos Received: 77
Solutions: 59
My Accepted Solutions
Views | Posted |
---|---|
11136 | 04-17-2018 04:59 PM |
6108 | 04-11-2018 10:07 PM |
3519 | 03-02-2018 09:13 AM |
22097 | 03-01-2018 09:22 AM |
2615 | 02-27-2018 08:06 AM |
02-11-2015
11:13 PM
First, the query is rejected because Impala currently has no efficient way to process that join. Impala only implements hash joins, and disjunctive conditions (with OR) are not hashable. That said, since Impala 2.0 we do support executing such queries, albeit inefficiently. They are run via a CROSS JOIN + filter, so I'd advise against running such queries on very large tables. Earlier Impala versions cannot run your query. Hope this helps!
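To illustrate what "CROSS JOIN + filter" means, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. It is not Impala, and the tables `a` and `b` and their columns are made up for the example. It only shows that a join with an OR condition returns the same rows as a cross join followed by a filter on that condition.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in; this is not Impala.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, x INTEGER, y INTEGER);
    CREATE TABLE b (id INTEGER, x INTEGER, y INTEGER);
    INSERT INTO a VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300);
    INSERT INTO b VALUES (7, 10, 999), (8, 99, 200), (9, 55, 555);
""")

# Join with a disjunctive (OR) condition: there is no single hash key.
or_join = """
    SELECT a.id, b.id FROM a JOIN b ON a.x = b.x OR a.y = b.y
    ORDER BY a.id, b.id
"""

# Same result expressed as every pair of rows (CROSS JOIN) plus a filter.
cross_filter = """
    SELECT a.id, b.id FROM a CROSS JOIN b WHERE a.x = b.x OR a.y = b.y
    ORDER BY a.id, b.id
"""

rows_or = conn.execute(or_join).fetchall()
rows_cross = conn.execute(cross_filter).fetchall()
print(rows_or == rows_cross)
```

The cross join considers every pair of rows from both tables before filtering, which is why the cost grows with the product of the table sizes.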
02-11-2015
10:54 PM
1 Kudo
The error sounds like the ODBC connection was successful, but the Impala query failed to parse. A common issue is sending a SQL string that is terminated by a semicolon. Impala cannot parse such a query. Remove the trailing semicolon to make it work. Future versions of Impala will also accept a trailing semicolon. Let me know if this solves the issue.
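If the SQL string is built in client code, the semicolon can be stripped before the statement is sent. A minimal sketch, assuming a Python client; the helper name `strip_trailing_semicolon` is made up for this example and is not part of any driver API.

```python
def strip_trailing_semicolon(sql: str) -> str:
    """Remove trailing semicolons and whitespace from one SQL statement.

    Hypothetical helper: only the end of the string is touched, so a
    semicolon inside a string literal earlier in the statement is kept.
    """
    return sql.rstrip("; \t\r\n")


print(strip_trailing_semicolon("SELECT count(*) FROM t;\n"))
```

The cleaned string is what you would then pass to your ODBC cursor's execute call.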
01-30-2015
04:51 PM
1 Kudo
Hi MickeyMouse, my understanding is that you are looking to answer k-nearest neighbour (kNN) queries, i.e., given a query lat/long, find the k nearest lat/long in the dataset. You'd be able to answer such queries in Impala by transforming the kNN query into a series of range queries (keep increasing the range until you've found at least k answers). One way to make the range queries efficient could be to partition your table based on lat and/or long. Of course, there are many distinct lat/long values, so you'd probably need to create buckets in the space of lat/long values (e.g. a grid structure). Then you'd need to transform the original lat/long values given in the query into the grid space. This way Impala's partition pruning will kick in and you'd be restricted to searching data in those grid cells that overlap with the specified range. Just a high-level idea, hope it makes sense. I'm afraid there is no easy and efficient way to directly answer kNN queries in Impala today. Alex
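The idea can be sketched in plain Python, with a dictionary keyed by grid cell standing in for a table partitioned by grid cell. Everything here is an assumption made for illustration: the 1-degree cell size, the function names, and the use of planar Euclidean distance on raw lat/long (real distances on a sphere need a different formula). The final widening step is an addition of this sketch: a square range can miss a point that is closer than the k-th one found, so the range is extended once to the k-th distance.

```python
import math
from collections import defaultdict

CELL_DEG = 1.0  # hypothetical grid cell size, in degrees


def cell_of(lat, lon, cell_deg=CELL_DEG):
    """Map a lat/long pair to its (row, col) grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_grid(points, cell_deg=CELL_DEG):
    """Bucket points by grid cell; stands in for a partitioned table."""
    grid = defaultdict(list)
    for lat, lon in points:
        grid[cell_of(lat, lon, cell_deg)].append((lat, lon))
    return grid


def range_query(grid, lat, lon, r, cell_deg=CELL_DEG):
    """Return points in the square [lat-r, lat+r] x [lon-r, lon+r].

    Only cells overlapping the square are read, which mimics pruning.
    """
    lo_row, lo_col = cell_of(lat - r, lon - r, cell_deg)
    hi_row, hi_col = cell_of(lat + r, lon + r, cell_deg)
    hits = []
    for row in range(lo_row, hi_row + 1):
        for col in range(lo_col, hi_col + 1):
            for p in grid.get((row, col), ()):
                if abs(p[0] - lat) <= r and abs(p[1] - lon) <= r:
                    hits.append(p)
    return hits


def knn(grid, lat, lon, k, start_r=0.5, max_r=360.0, cell_deg=CELL_DEG):
    """Find the k nearest points by repeatedly doubling the range."""
    def dist(p):
        return math.hypot(p[0] - lat, p[1] - lon)

    r = start_r
    hits = range_query(grid, lat, lon, r, cell_deg)
    while len(hits) < k and r < max_r:
        r *= 2
        hits = range_query(grid, lat, lon, r, cell_deg)
    hits.sort(key=dist)
    # Points near the square's corners can be farther away than points
    # just outside its edges, so widen once to the k-th distance.
    if len(hits) >= k and dist(hits[k - 1]) > r:
        hits = range_query(grid, lat, lon, dist(hits[k - 1]), cell_deg)
        hits.sort(key=dist)
    return hits[:k]


points = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (-3.0, 2.0), (0.4, -0.45)]
grid = build_grid(points)
print(knn(grid, 0.0, 0.0, k=3))
```

In Impala, each call to `range_query` would correspond to one query with predicates on the grid-cell partition columns plus the exact lat/long bounds.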
12-29-2014
01:26 PM
I'm not sure I follow your question. What exactly is "dual"? Can you provide more information?
10-17-2014
07:12 PM
1 Kudo
Hi Sreeman, thanks for following up in such great detail! I completely agree that DDL commands should never leave a table in a state where it cannot even be dropped. As you pointed out, this seems to be a Hive issue. I share your concern regarding the behavior of ALTER TABLE; however, there are legitimate uses for it. For example, if a user accidentally set the wrong format, she can change it later. You can also alter partitions of the same table to have different file formats. Last, users may have the opposite expectation from yours, and having ALTER rewrite all data into a new format against a user's expectation is arguably worse than later realizing that ALTER only changes the table metadata. Your suggestion of issuing a warning sounds like a good compromise, thanks!
10-16-2014
05:10 PM
1 Kudo
I wasn't able to repro the issue on 1.4 with these steps:
1. hive> create table t (i int);
2. hive> insert into t select 1 from someexistingtable;
3. impala> invalidate metadata t;
4. impala> alter table t set fileformat parquet;
5. impala> describe t;
Could you detail the steps you used to produce the issue on your end so I can investigate further? Thanks!
10-16-2014
03:26 PM
Thanks for following up. I'm looking into the problem since it looks like a bug. Still, I want to make sure to address your ultimate goal: my understanding is that you created a text table in Hive, and then used ALTER TABLE <tbl> SET FILEFORMAT PARQUET in Impala. Then "DESCRIBE <tbl>" fails in Impala. What's the intention behind the ALTER command? The alteration will only change the table metadata, and not the data itself, i.e., the data is still stored in text, but Impala will attempt to interpret the text files as Parquet and fail.
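To make the metadata-versus-data point concrete, here is a small Python sketch. It relies on one fact about the Parquet format: a regular Parquet file begins and ends with the 4-byte magic `PAR1`. The function name and the sample file are made up for the example, and this is not how Impala itself validates files; it only shows that a text file's bytes do not become Parquet when the table's declared format changes.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"  # marker at the start and end of a Parquet file


def looks_like_parquet(path):
    """Return True if the file carries the Parquet magic at both ends.

    Hypothetical helper: a rough check, not a full format validation.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 12:  # too small for header, footer length, footer magic
            return False
        f.seek(0)
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


# A data file as a text table would store it: one integer per line.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\n2\n3\n4\n5\n6\n7\n")
    text_file = tmp.name

is_parquet = looks_like_parquet(text_file)
print(is_parquet)
os.remove(text_file)
```

Changing the declared file format leaves these bytes untouched, so a Parquet reader pointed at the file has nothing valid to parse.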
10-15-2014
08:51 AM
What version of Impala are you running?
04-09-2014
11:23 PM
Hive and Impala only support non-materialized VIEWs, which don't contain any actual data. So when you write a query such as
select * from customsink_lat
both Hive and Impala replace 'customsink_lat' with the view-definition SQL, like this:
select v.* from (SELECT id, clmndate,clmnvalue FROM customsink LATERAL VIEW explode(props) cl AS clmndate,clmnvalue) v
Impala doesn't support LATERAL VIEW or complex types such as MAP, ARRAY, etc., so Impala isn't able to execute the view-definition SQL. My apologies for the inconvenience.
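The behavior of a non-materialized view can be demonstrated with Python's built-in sqlite3 module, used here only as a stand-in engine (it is neither Hive nor Impala). The table is a simplified, made-up version of `customsink` without the MAP column, and `customsink_v` is a hypothetical view name. The sketch shows that the view holds no data of its own and that querying it matches querying its definition inlined as a subquery.

```python
import sqlite3

# In-memory SQLite database as a stand-in; simplified schema, no MAP column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customsink (id INTEGER, clmnvalue TEXT);
    CREATE VIEW customsink_v AS SELECT id, clmnvalue FROM customsink;
""")

# The row is inserted after the view was created; the view still sees it
# because only the definition is stored, not a copy of the data.
conn.execute("INSERT INTO customsink VALUES (1, 'a')")

via_view = conn.execute("SELECT * FROM customsink_v").fetchall()

# What the engine effectively runs: the view name replaced by its SQL.
inlined = conn.execute(
    "SELECT v.* FROM (SELECT id, clmnvalue FROM customsink) v"
).fetchall()

print(via_view == inlined)
```

Because the view is expanded at query time, an engine that cannot execute the view's defining SQL cannot query the view either.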
10-16-2013
12:05 PM
1 Kudo
Hi Sergey, the Hive metastore uses DataNucleus for object-relational mapping. DataNucleus has a bunch of known issues, in particular, related to concurrency (e.g., HIVE-3826, HIVE-5457, HIVE-5181). From your description, I got the impression that there may be concurrent things happening while you invalidate metadata. I know it's not ideal, but I'd encourage you to avoid that, and also have a look at those JIRAs to see if you think they might apply. I'm happy to investigate further, but for now, I'm not exactly sure what the problem is. If it is indeed a concurrency problem, I believe simply retrying invalidate may do the trick (again not ideal, sorry).
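A retry along those lines could look like the sketch below. It is a generic helper, not an Impala or Hive API: `retry`, its parameters, and the `flaky_invalidate` stand-in are all made up for illustration. In real use, the action would be whatever call your client uses to run the invalidate statement.

```python
import time


def retry(action, attempts=3, delay_s=1.0, sleep=time.sleep):
    """Call action() until it succeeds or attempts are exhausted.

    Hypothetical helper: retries on any exception, waits delay_s seconds
    between tries, and re-raises the last error if every try fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay_s)
    raise last_error


# Stand-in for a statement that fails twice, then succeeds.
calls = {"count": 0}


def flaky_invalidate():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated metastore error")
    return "ok"


result = retry(flaky_invalidate, attempts=5, delay_s=0.0)
print(result, calls["count"])
```

Catching every exception is a simplification; in practice you would retry only on the error you actually observe.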