Member since
10-30-2018
3
Posts
1
Kudos Received
0
Solutions
05-09-2019
02:09 AM
Quoted from the documentation on using Avro files at https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_avro_usage.html#topic_26_2

"""
Hive (…) To enable Snappy compression on output [avro] files, run the following before writing to the table:

SET hive.exec.compress.output=true;
SET avro.output.codec=snappy;
"""

Please try this out. You're missing only the second property mentioned here, which appears to be specific to Avro serialization in Hive. Avro's default compression is deflate, which explains the behaviour you observe without it.
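One way to confirm which codec an output file actually uses is to read the `avro.codec` entry from the Avro container file header. The following is a minimal sketch, not a Cloudera or Hive tool: the function names are made up for illustration, and it assumes the file has already been copied from HDFS to local disk. It relies only on the header layout in the Avro specification (the magic bytes `Obj\x01` followed by a string-to-bytes metadata map).

```python
import io


def _read_long(stream):
    """Decode one zigzag-encoded variable-length long, as used by Avro."""
    shift = 0
    value = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of Avro header")
        byte = raw[0]
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (value >> 1) ^ -(value & 1)


def _read_bytes(stream):
    """Read a length-prefixed byte string."""
    return stream.read(_read_long(stream))


def read_avro_codec(stream):
    """Return the codec named in an Avro container file header.

    Hypothetical helper. If the header has no avro.codec entry,
    the Avro specification says the codec is "null" (uncompressed).
    """
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = _read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # negative count is followed by the block size in bytes
            _read_long(stream)
            count = -count
        for _ in range(count):
            key = _read_bytes(stream).decode("utf-8")
            metadata[key] = _read_bytes(stream)
    return metadata.get("avro.codec", b"null").decode("utf-8")


if __name__ == "__main__":
    # Hand-built header with a single metadata entry, for demonstration only.
    sample = b"Obj\x01" + b"\x02" + b"\x14avro.codec" + b"\x0csnappy" + b"\x00"
    print(read_avro_codec(io.BytesIO(sample)))  # prints "snappy"
```

For a real file, pass an object opened in binary mode, for example `read_avro_codec(open(path, "rb"))`, where `path` is wherever you copied one of the table's output files.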
04-05-2019
08:46 AM
1 Kudo
Hi Esteban,

Our recommendation is to place [username] pools under a parent pool, such as root.users.[username]. This allows you to control the overall usage of [username] queues relative to other pools under the root. This way, if you have 3 root-level pools:

root.production
root.adhoc
root.users

...you can define appropriate weights among these 3 pools.

If you configured root.[username] instead, each user pool will be added at the root level with a default weight of 1. So, for example, if your initial configuration was:

Weight  Pool
10      root.production
10      root.adhoc
...     root.[username] placement rule

In the beginning, root.production and root.adhoc will each have 50% of cluster resources. When the first user runs a job, their subpool is created with a default weight of 1.

Weight  Pool
10      root.production
10      root.adhoc
1       root.user1

Now imagine that 50 users run a job and 50 new pools are created at the root level, each with a weight of 1. All of a sudden you have weighted resources heavily towards the user pools:

Weight  Pool
10      root.production
10      root.adhoc
50      root.[username] generated pools (with 50 users)

So in summary, you may prefer to do something that would limit all users, no matter how many, to a fixed ratio of cluster resources:

Weight  Pool
10      root.production
10      root.adhoc
10      root.users -> root.users.[username] placement rule

Thanks,
Nick
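To put numbers on the two layouts, here is a small sketch of the arithmetic. It assumes each root-level pool's share is simply its weight divided by the sum of the weights, which holds only when every pool has enough demand to use its share; the function names are illustrative and not part of any scheduler API.

```python
def root_shares(weights):
    """Map each root-level pool to weight / total weight.

    Simplified model: assumes every pool is busy enough to use its share.
    """
    total = sum(weights.values())
    return {pool: weight / total for pool, weight in weights.items()}


def with_root_user_pools(base, n_users):
    """Add n_users pools directly under root, each with the default weight of 1."""
    pools = dict(base)
    for i in range(1, n_users + 1):
        pools[f"root.user{i}"] = 1
    return pools


base = {"root.production": 10, "root.adhoc": 10}

# Layout 1: root.[username] placement rule, 50 active users.
flat = root_shares(with_root_user_pools(base, 50))
users_total = sum(s for pool, s in flat.items() if pool.startswith("root.user"))
print(f"production: {flat['root.production']:.1%}, all users: {users_total:.1%}")

# Layout 2: a single root.users parent with weight 10, any number of users.
nested = root_shares({**base, "root.users": 10})
print(f"production: {nested['root.production']:.1%}, all users: {nested['root.users']:.1%}")
```

Under this model, 50 root-level user pools take 50/70 of the cluster and leave root.production with 10/70, while the root.users parent keeps all three pools at one third each regardless of how many users submit jobs.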