Member since
10-16-2013
307
Posts
77
Kudos Received
59
Solutions
My Accepted Solutions
Views | Posted
---|---
11118 | 04-17-2018 04:59 PM
6100 | 04-11-2018 10:07 PM
3517 | 03-02-2018 09:13 AM
22078 | 03-01-2018 09:22 AM
2608 | 02-27-2018 08:06 AM
02-11-2015
11:13 PM
First, the reason the query is rejected is that Impala currently has no efficient way to process that join. Impala only implements hash joins, and disjunctive conditions (joined with OR) are not hashable. That said, since Impala 2.0 we do support executing such queries, albeit inefficiently: they are run as a CROSS JOIN plus a filter, so I'd advise against running such queries on very large tables. Earlier Impala versions cannot run your query. Hope this helps!
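To illustrate the rewrite described above, here is a sketch with hypothetical tables t1 and t2 (the table and column names are just for illustration):

```sql
-- The OR in the join condition prevents a hash join:
SELECT *
FROM t1 JOIN t2
  ON t1.a = t2.a OR t1.b = t2.b;

-- Since Impala 2.0 this is executed roughly as a cross join plus filter,
-- i.e. every row of t1 is paired with every row of t2 before filtering:
SELECT *
FROM t1 CROSS JOIN t2
WHERE t1.a = t2.a OR t1.b = t2.b;
```

The cross join materializes |t1| × |t2| row pairs, which is why this plan is only practical on small inputs.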
02-11-2015
10:54 PM
1 Kudo
The error suggests the ODBC connection was successful, but the Impala query failed to parse. A common cause is sending a SQL string terminated by a semicolon; Impala cannot parse such a query. Remove the trailing semicolon to make it work. Future versions of Impala will also accept a trailing semicolon. Let me know if this solves the issue.
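Concretely, the difference is only the final character of the statement string sent over ODBC (table name is hypothetical):

```sql
-- Rejected by the Impala versions discussed here (trailing semicolon):
SELECT COUNT(*) FROM my_table;

-- Accepted:
SELECT COUNT(*) FROM my_table
```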
01-30-2015
04:51 PM
1 Kudo
Hi MickeyMouse, my understanding is that you are looking to answer k-nearest neighbour (kNN) queries, i.e., given a query lat/long, find the k nearest lat/long points in the dataset. You could answer such queries in Impala by transforming the kNN query into a series of range queries (keep increasing the range until you've found at least k answers). One way to make the range queries efficient could be to partition your table based on lat and/or long. Of course, there are many distinct lat/long values, so you'd probably need to create buckets in the space of lat/long values (e.g., a grid structure). Then you'd need to transform the original lat/long values given in the query into the grid space. This way Impala's partition pruning will kick in, and the search will be restricted to the data in those grid cells that overlap with the specified range. Just a high-level idea, hope it makes sense. I'm afraid there is no easy and efficient way to directly answer kNN queries in Impala today. Alex
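A rough sketch of the grid-bucketing idea, assuming a hypothetical points(id, lat, lon) table and a 0.1-degree grid (all names and the cell size are illustrative, not from the original thread):

```sql
-- Bucket coordinates into integer grid cells and partition on them,
-- so Impala's partition pruning can skip irrelevant cells:
CREATE TABLE points_grid (id BIGINT, lat DOUBLE, lon DOUBLE)
PARTITIONED BY (lat_cell INT, lon_cell INT) STORED AS PARQUET;

INSERT INTO points_grid PARTITION (lat_cell, lon_cell)
SELECT id, lat, lon,
       CAST(FLOOR(lat * 10) AS INT),
       CAST(FLOOR(lon * 10) AS INT)
FROM points;

-- One range query around a query point (q_lat, q_lon are placeholders);
-- widen the +/- 0.1 window and re-run until at least k rows come back:
SELECT id, lat, lon
FROM points_grid
WHERE lat_cell BETWEEN CAST(FLOOR((q_lat - 0.1) * 10) AS INT)
                   AND CAST(FLOOR((q_lat + 0.1) * 10) AS INT)
  AND lon_cell BETWEEN CAST(FLOOR((q_lon - 0.1) * 10) AS INT)
                   AND CAST(FLOOR((q_lon + 0.1) * 10) AS INT);
```

The client then sorts the returned candidates by true distance to the query point and keeps the k closest.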
12-29-2014
01:26 PM
I'm not sure I follow your question. What exactly is "dual"? Can you provide more information?
10-17-2014
07:12 PM
1 Kudo
Hi Sreeman, thanks for following up in such great detail! I completely agree that DDL commands should never leave a table in a state where it cannot even be dropped. As you pointed out, this seems to be a Hive issue. I share your concern regarding the behavior of ALTER TABLE; however, there are legitimate uses for it. For example, if a user accidentally set the wrong format, she can change it later. You can also alter partitions of the same table to have different file formats. Lastly, users may have the opposite expectation to yours, and having ALTER rewrite all data into a new format against a user's expectation is arguably worse than later realizing that ALTER only changes the table metadata. Your suggestion of issuing a warning sounds like a good compromise, thanks!
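The legitimate uses mentioned above look roughly like this (the sales table and its year partition column are hypothetical):

```sql
-- Change the declared format of a single partition; other partitions
-- of the same table keep their own formats:
ALTER TABLE sales PARTITION (year = 2014) SET FILEFORMAT PARQUET;

-- Fix an accidentally wrong table-level format. Note that only the
-- metadata changes; existing data files are NOT rewritten:
ALTER TABLE sales SET FILEFORMAT PARQUET;
```

If the files on disk don't actually match the new format, any data already in the table (or partition) must be rewritten separately, e.g. via INSERT into a new table.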
10-16-2014
05:10 PM
1 Kudo
I wasn't able to repro the issue on 1.4 with these steps:
1. hive> create table t (i int);
2. hive> insert into t select 1 from someexistingtable;
3. impala> invalidate metadata t;
4. impala> alter table t set fileformat parquet;
5. impala> describe t;
Could you detail the steps used to produce the issue on your end so I can investigate further? Thanks!
10-16-2014
03:26 PM
Thanks for following up; I'm looking into the problem since it looks like a bug. Still, I want to make sure to address your ultimate goal: my understanding is that you created a text table in Hive, and then used ALTER TABLE <tbl> SET FILEFORMAT PARQUET in Impala. Then DESCRIBE <tbl> fails in Impala. What's the intention behind the ALTER command? The alteration will only change the table metadata, and not the data itself, i.e., the data is still stored as text, but Impala will attempt to interpret the text files as Parquet and fail.
10-15-2014
08:51 AM
What version of Impala are you running?
04-09-2014
11:23 PM
Hive and Impala only support non-materialized VIEWs that don't contain any actual data. So when you write a query

select * from customsink_lat

both Hive and Impala replace 'customsink_lat' with the view-definition SQL, like this:

select v.* from (SELECT id, clmndate, clmnvalue FROM customsink LATERAL VIEW explode(props) cl AS clmndate, clmnvalue) v

Impala doesn't support LATERAL VIEW or complex types such as MAP, ARRAY, etc., so Impala isn't able to execute the view-definition SQL. My apologies for the inconvenience.
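One possible workaround, sketched under the assumption that the data does not change too often: materialize the exploded rows as a plain table in Hive, so Impala never sees the LATERAL VIEW (the table name customsink_lat_mat is made up for this example):

```sql
-- Run in Hive, which does support LATERAL VIEW and the MAP type:
CREATE TABLE customsink_lat_mat STORED AS PARQUET AS
SELECT id, clmndate, clmnvalue
FROM customsink LATERAL VIEW explode(props) cl AS clmndate, clmnvalue;

-- Then from Impala, the result is an ordinary Parquet table:
INVALIDATE METADATA customsink_lat_mat;
SELECT * FROM customsink_lat_mat;
```

The trade-off is that the materialized table must be refreshed whenever customsink changes.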
10-16-2013
12:05 PM
1 Kudo
Hi Sergey, the Hive metastore uses DataNucleus for object-relational mapping. DataNucleus has a bunch of known issues, in particular related to concurrency (e.g., HIVE-3826, HIVE-5457, HIVE-5181). From your description, I got the impression that there may be concurrent operations happening while you invalidate metadata. I know it's not ideal, but I'd encourage you to avoid that, and also have a look at those JIRAs to see if you think they might apply. I'm happy to investigate further, but for now, I'm not exactly sure what the problem is. If it is indeed a concurrency problem, I believe simply retrying invalidate may do the trick (again, not ideal, sorry).