Member since: 10-16-2013
Posts: 307
Kudos Received: 77
Solutions: 59
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 12403 | 04-17-2018 04:59 PM |
| | 7642 | 04-11-2018 10:07 PM |
| | 4390 | 03-02-2018 09:13 AM |
| | 24599 | 03-01-2018 09:22 AM |
| | 3381 | 02-27-2018 08:06 AM |
12-30-2015
03:39 PM
1 Kudo
Hi! Your scenario should work. Did you run "invalidate metadata <table>" in Impala after computing the stats in Hive? Also, Impala only deals with column stats at the table level, so if you compute the column stats for a specific partition in Hive, those stats will not show up in Impala.
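To make the order of operations concrete, here is a minimal sketch that only builds the statement strings; it does not connect to Hive or Impala. The helper name `stats_refresh_statements` and the table name `sales` are hypothetical. The Hive statement deliberately has no PARTITION clause, since Impala only picks up table-level column stats.

```python
def stats_refresh_statements(table: str) -> dict:
    """Return the statements to run, grouped by the engine that runs them.

    Hypothetical helper for illustration; it only formats strings.
    """
    return {
        # Step 1, in Hive: compute column stats for the whole table.
        "hive": [f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR COLUMNS"],
        # Step 2, in Impala: reload the table metadata, then inspect the stats.
        "impala": [
            f"INVALIDATE METADATA {table}",
            f"SHOW COLUMN STATS {table}",
        ],
    }


if __name__ == "__main__":
    for engine, statements in stats_refresh_statements("sales").items():
        for stmt in statements:
            print(f"[{engine}] {stmt};")
```

Run the Hive statement first, then the Impala statements, for example by pasting them into the respective shells.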
12-29-2015
03:34 PM
Impala does not have control of the physical locations of the HDFS blocks underlying Impala tables. The tables in Impala are backed by files on HDFS, and those files are chopped into blocks and distributed according to your HDFS configuration, but for all practical purposes the blocks are distributed round-robin among the data nodes (grossly simplified). Impala queries typically run on all data nodes that store data relevant to answering a particular query, so given a fixed amount of data, you can indirectly control Impala's degree of (inter-node) parallelism by changing the HDFS block size. More blocks == more parallelism. If you are interested in learning about Impala, you may also find the CIDR paper useful: http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
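As a rough illustration of the block-size arithmetic, the sketch below estimates how many HDFS blocks a single file occupies for a few block sizes. The function name and the 10 GiB file size are made up for the example, and the block count is only an upper bound on the parallelism you might see, since actual scheduling depends on the cluster.

```python
MIB = 1024 * 1024


def estimated_block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Estimate the number of HDFS blocks needed for one file.

    Hypothetical helper: ceiling division of file size by block size.
    """
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if file_size_bytes < 0:
        raise ValueError("file size must not be negative")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return (file_size_bytes + block_size_bytes - 1) // block_size_bytes


if __name__ == "__main__":
    file_size = 10 * 1024 * MIB  # an assumed 10 GiB data file
    for block_mib in (256, 128, 64):
        blocks = estimated_block_count(file_size, block_mib * MIB)
        print(f"{block_mib} MiB blocks: {blocks} blocks")
```

Halving the block size doubles the number of blocks for the same data, which is the lever described above.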
12-16-2015
09:50 PM
I'm afraid that Impala currently does not support writing Avro data. Even though you can enable experimental support, we strongly advise against that, and instead recommend using another tool such as Hive or Kite to do the conversion. My apologies for the inconvenience.
10-26-2015
12:01 PM
Hi Saravana, thanks for your report! Looks like a new issue. Would you mind filing a JIRA for it so we can assign and track it? As a possible workaround, could you try running the query with "use native query" enabled in the JDBC driver? With that option the driver sends the query to Impala verbatim (otherwise the driver may sometimes make changes to the SQL). http://www.cloudera.com/content/cloudera/en/documentation/connectors/latest/PDF/Cloudera-JDBC-Driver-for-Impala-Install-Guide.pdf Alex
10-04-2015
05:40 PM
Thanks for the notice, it could be an oversight with the docs. I'll look into it.
10-02-2015
05:53 AM
Impala currently does not support truncating an individual partition, so that syntax error you get from the shell is expected. I am not sure why the statement appears to work from Hue. The TRUNCATE TABLE ... PARTITION syntax is not supported, so I don't see how this would work. Perhaps the error is somehow not properly shown/propagated to Hue?
08-10-2015
03:58 PM
Hi Tom, due to other complications, I'm afraid that patch didn't make it into CDH 5.4.4, but we will include it in CDH 5.4.5 which is tentatively scheduled for the beginning of September. Thanks for your patience, and my apologies that the fix did not make it into CDH 5.4.4. Alex
07-29-2015
11:47 AM
Since the differences between the two systems are due to their implementation, I'd say you have the following options:
1. Use a different type, e.g., STRING. When converting from STRING to TIMESTAMP you will encounter the same issues though.
2. Change your ingestion pipeline to enforce a timestamp range that is valid in both systems. This assumes that a date with year 0000 would be considered "garbage" by your application.
3. Live with the fact that a NULL timestamp could mean it is out of range.
May I ask what exactly is causing the headache? The fact that both systems return different results, or the fact that for your application the year 0000 is a meaningful date?
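For option 2, a range check in the ingestion pipeline could look roughly like the sketch below. The bounds (years 1400 through 9999) are an assumption about the range both systems accept, so adjust them to what your versions actually support; the function name and the `YYYY-MM-DD ...` input format are also hypothetical. The year is parsed from the string directly because Python's `datetime` cannot represent year 0000.

```python
import re

# Assumed bounds for a year that both systems can represent.
MIN_YEAR = 1400
MAX_YEAR = 9999

# Matches the leading "YYYY-MM-DD" of a timestamp string.
_DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})")


def year_in_shared_range(timestamp: str) -> bool:
    """Return True if the timestamp's year lies within the assumed shared range.

    Strings that do not start with YYYY-MM-DD are treated as out of range.
    Only the year is checked; month and day are not validated here.
    """
    match = _DATE_PREFIX.match(timestamp.strip())
    if match is None:
        return False
    year = int(match.group(1))
    return MIN_YEAR <= year <= MAX_YEAR


if __name__ == "__main__":
    for ts in ("2015-07-29 11:47:00", "0000-01-01 00:00:00", "1399-12-31 23:59:59"):
        status = "keep" if year_in_shared_range(ts) else "reject"
        print(f"{ts}: {status}")
```

Rows that fail the check can be rejected or routed to a quarantine location, depending on whether your application treats such dates as garbage.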
07-24-2015
12:15 AM
Do you really need dates before the year 1400 or after 10000? Impala supports a different date range than Hive because of how timestamps are handled internally (Impala uses Boost, Hive uses the Java built-ins).
07-17-2015
12:11 AM
Thanks for the update. I can reproduce the issue, but only when the target partition is empty. As soon as I add some data, compute incremental stats works as expected. So I'm still thinking you are hitting an edge case with an empty partition?