Member since: 10-16-2013
Posts: 307
Kudos Received: 77
Solutions: 59
My Accepted Solutions
Views | Posted |
---|---|
11279 | 04-17-2018 04:59 PM |
6223 | 04-11-2018 10:07 PM |
3574 | 03-02-2018 09:13 AM |
22349 | 03-01-2018 09:22 AM |
2672 | 02-27-2018 08:06 AM |
01-06-2016 08:49 AM
thx 🙂
01-06-2016 12:20 AM
Alex, Thank you very much. 🙂 -- Moonwon (Gatsby) Lee gatsbylee.com "Life isn't about waiting for the storm to pass, it's about learning to dance in the rain."
01-04-2016 04:53 PM
Thx. 🙂
12-29-2015 03:34 PM
Impala does not control the physical locations of the HDFS blocks underlying Impala tables. Tables in Impala are backed by files on HDFS, and those files are chopped into blocks and distributed according to your HDFS configuration. For all practical purposes, the blocks are distributed round-robin among the data nodes (grossly simplified). Impala queries typically run on all data nodes that store data relevant to answering a particular query, so given a fixed amount of data, you can indirectly control Impala's degree of (inter-node) parallelism by changing the HDFS block size. More blocks == more parallelism.
If you are interested in learning about Impala, you may also find the CIDR paper useful: http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper28.pdf
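To make the "more blocks == more parallelism" point concrete, here is a rough back-of-the-envelope sketch. The helper name `num_blocks` and the sizes are made up for illustration; it only counts how many blocks a single file is split into, and it ignores everything else that affects real parallelism (number of files, file format, replica placement, scheduling).

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def num_blocks(file_size_bytes: int, block_size_bytes: int) -> int:
    """Number of HDFS blocks one file of the given size is split into.

    Hypothetical helper for illustration only; it is not an Impala or HDFS API.
    """
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if file_size_bytes <= 0:
        return 0
    # Ceiling division: the last block may be only partially filled.
    return -(-file_size_bytes // block_size_bytes)


# Same 10 GiB of data, two different block sizes (example values only).
blocks_at_128 = num_blocks(10 * GIB, 128 * MIB)
blocks_at_64 = num_blocks(10 * GIB, 64 * MIB)
print(blocks_at_128, blocks_at_64)
```

Halving the block size doubles the number of blocks for the same data, which gives more units of work that can be spread across data nodes.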
12-17-2015 02:58 AM
That's what I suspected the answer would be after talking with a coworker, re-reading a Cloudera blog post, and experimenting with Hive. I'll have to look into Kite.
10-30-2015 01:35 AM
The useNativeQuery option is a workaround for this problem. I filed this as a bug in Jira: IMPALA-2609.
10-04-2015 05:40 PM
Thanks for the notice. It could be an oversight in the docs. I'll look into it.
08-10-2015 03:58 PM
Hi Tom, due to other complications, I'm afraid that patch didn't make it into CDH 5.4.4, but we will include it in CDH 5.4.5, which is tentatively scheduled for the beginning of September. Thanks for your patience, and my apologies that the fix did not make it into CDH 5.4.4.
Alex
07-29-2015 11:47 AM
Since the differences between the two systems are due to their implementations, I'd say you have the following options:
1. Use a different type, e.g., STRING. When converting from STRING to TIMESTAMP you will encounter the same issues, though.
2. Change your ingestion pipeline to enforce a timestamp range that is valid in both systems. This assumes that a date with year 0000 would be considered "garbage" by your application.
3. Live with the fact that a NULL timestamp could mean it is out of range.
May I ask what exactly is causing the headache? The fact that both systems return different results, or the fact that for your application the year 0000 is a meaningful date?
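For option 2, a range check in the ingestion pipeline might look roughly like the sketch below. The function name `clean_timestamp`, the input format, and the bounds are assumptions for illustration; the lower bound of 1400-01-01 follows Impala's documented TIMESTAMP range, but you should set both bounds to whatever range your two systems actually share.

```python
from datetime import datetime
from typing import Optional

# Assumed bounds: replace with the range that is valid in both of your systems.
MIN_TS = datetime(1400, 1, 1)
MAX_TS = datetime(9999, 12, 31, 23, 59, 59, 999999)


def clean_timestamp(raw: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> Optional[datetime]:
    """Return the parsed timestamp, or None if it is malformed or out of range.

    Hypothetical helper for an ingestion step; not part of Impala or Hive.
    """
    try:
        ts = datetime.strptime(raw, fmt)
    except ValueError:
        # Malformed input ends up here, and so does year 0000, because
        # Python's datetime cannot represent years below 1.
        return None
    return ts if MIN_TS <= ts <= MAX_TS else None


print(clean_timestamp("2015-07-29 11:47:00"))
print(clean_timestamp("0000-01-01 00:00:00"))
print(clean_timestamp("1399-12-31 23:59:59"))
```

Rows for which this returns None can be rejected or routed to a quarantine location, so that both systems only ever see timestamps they agree on.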
07-17-2015 12:11 AM
Thanks for the update. I can reproduce the issue, but only when the target partition is empty. As soon as I add some data, compute incremental stats works as expected. So I'm still thinking you are hitting an edge case with an empty partition?