Member since 09-14-2016
116 Posts
2 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
5397 | 08-01-2017 08:04 AM |
08-11-2017 10:46 AM
But won't the total number of files be too many? I had smaller files before, at 512M each, and it was slow as well. We only have 5 nodes now, so wouldn't too many files not help? Thanks, Shannon
08-11-2017 09:52 AM
Thanks Lars, I'm trying to understand how it applies here. Should I try to increase the block size to 4G and keep each file under 4G (3-4G)? Shannon
08-11-2017 07:07 AM
I ran compute stats. I did not use any compression, as I read it will slow things down. Shannon
08-11-2017 06:04 AM
I took a quick look at bucketing. I don't have one column that is used the most, and this table will join other tables later. As for controlling the number of files: since I have an idea of the data size, I can control the file size when doing insert-select by setting hive.merge.size.per.task and hive.merge.smallfiles.avgsize. My questions:
1. In this case, how big can I set the block size: 1G, 2G? Will it hurt if it is set too big?
2. How big should the files be? Right now I set it to 4G; should I increase it?
3. I have 5 nodes. Given that, is it better to have at least 5 files per partition?
Thanks, Shannon
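The file-size arithmetic behind these questions can be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, the table is assumed to be exactly 500 GiB spread evenly over 19 partitions, and setting both Hive merge properties to the same byte value is an assumption, not a recommendation from the thread.

```python
GIB = 1024 ** 3

def files_per_partition(total_bytes, partitions, target_file_bytes):
    """Estimate files per partition, assuming data is spread evenly (an assumption)."""
    partition_bytes = total_bytes // partitions
    # Integer ceiling division: a partial file still counts as one file.
    return -(-partition_bytes // target_file_bytes)

def merge_settings(target_file_bytes):
    """Build Hive SET statements for the two merge properties named in the post.

    Both properties take a size in bytes. Using the same value for both
    is an illustrative assumption.
    """
    return [
        f"SET hive.merge.size.per.task={target_file_bytes};",
        f"SET hive.merge.smallfiles.avgsize={target_file_bytes};",
    ]

if __name__ == "__main__":
    total = 500 * GIB  # the post says "500+G", so the real figure is somewhat larger
    for size_gib in (1, 2, 4):
        n = files_per_partition(total, 19, size_gib * GIB)
        print(f"{size_gib} GiB files -> about {n} files per partition")
    print("\n".join(merge_settings(4 * GIB)))
```

With these assumed numbers, a 4 GiB target gives about 7 files per partition, which is in line with the 7-8 files per partition described in the original question.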
08-11-2017 05:00 AM
Java Heap Size of Catalog Server in Bytes: I set this to 4G. Java Heap Size of Impala Llama ApplicationMaster in Bytes: does this have anything to do with Impala queries?
08-11-2017 04:57 AM
Thanks, I will take a look at bucketing. Do you mean running compute stats? Yes, I did that.
08-10-2017 08:08 PM
When I run a simple select with where id = 'xxx', I see there are 1111 jobs/tasks(?). Is it because my block size is 512M and the total is 500G, so there are 1000+ blocks to scan? If so, should I increase my block size?
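The block-count estimate in this question can be checked with a quick calculation. This is a minimal sketch assuming exactly 500 GiB of data packed into full blocks; the function name is hypothetical, and whether the 1111 figure actually corresponds to HDFS blocks is the poster's guess, not something confirmed here.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def block_count(total_bytes, block_bytes):
    """Lower bound on the number of blocks, assuming data fills whole blocks."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_bytes // block_bytes)

if __name__ == "__main__":
    total = 500 * GIB  # the table is described as "500+G", so real counts are higher
    for block_mib in (512, 1024, 2048, 4096):
        n = block_count(total, block_mib * MIB)
        print(f"{block_mib} MiB blocks -> at least {n} blocks")
```

Under these assumptions, 512 MiB blocks give at least 1000 blocks, and each doubling of the block size halves that count. The real number is higher because the last block of every file is usually only partly full.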
08-10-2017 07:47 PM
Also, the block size is 512M.
08-10-2017 07:34 PM
1 Kudo
Hi, in the Impala config I am setting the mem limit to 24G. There is also a Java heap size, and I don't know whether it is used by Impala queries. What is its relation to the mem limit, and what should I set? If I have, say, 32G of physical memory, how big can I set the Java heap? Thanks, Shannon
Labels: Apache Impala
08-10-2017 07:32 PM
Hi, I have a big table partitioned by year/month, total size 500+G, with 19 partitions so far. Each partition currently has about 7-8 files, each about 3-4G (138 files in total). I have five nodes running Impala, and a simple select is very slow. Should I reduce the number of files in each partition and increase the size of each file? Thanks, Shannon
Labels: Apache Impala