Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3365 | 05-03-2017 05:13 PM
 | 2796 | 05-02-2017 08:38 AM
 | 3076 | 05-02-2017 08:13 AM
 | 3006 | 04-10-2017 10:51 PM
 | 1517 | 03-28-2017 02:27 AM
01-05-2016
05:17 PM
@Pedro Gandola the local indexes are in technical preview, and as with all TP releases there is no support from HWX until the feature is production ready. If you do find a solution, please post it here for the benefit of the community.
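If you do experiment with the feature anyway, a local index is declared with the LOCAL keyword in Phoenix DDL. A minimal sketch (the table and column names below are hypothetical, not from your schema):

```sql
-- Hypothetical table; a local index co-locates index data with the table's regions,
-- trading some read-time work for cheaper writes (useful for write-heavy workloads).
CREATE TABLE web_stat (
    host   VARCHAR NOT NULL PRIMARY KEY,
    domain VARCHAR,
    hits   BIGINT
);

-- The LOCAL keyword is what distinguishes this from a global index.
CREATE LOCAL INDEX web_stat_domain_idx ON web_stat (domain);
```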
01-05-2016
04:41 PM
Additionally, did you see the warning about using local indexes in Phoenix while the feature is in technical preview? "The local indexing feature is a technical preview and considered under development. Do not use this feature in your production systems. If you have questions regarding this feature, contact Support by logging a case on our Hortonworks Support Portal."
01-05-2016
04:36 PM
1 Kudo
@Pedro Gandola do you have HBase Master High Availability enabled? We recommend running at least two masters at the same time. We also recommend using an Ambari rolling restart rather than a stop-the-world restart of the whole cluster. With HA enabled, you can have one HBase master down and still maintain availability. You can also restart RegionServers one at a time, or set a time trigger so they restart in staggered batches. The days of stopping everything to change a configuration in hbase-site are long gone; you don't need to stop the whole cluster.
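If the cluster isn't managed by Ambari, standby masters are typically listed in conf/backup-masters; a minimal sketch (the hostname below is hypothetical):

```
# conf/backup-masters -- one hostname per line; start-hbase.sh starts a standby
# HMaster on each host listed here. The masters coordinate through ZooKeeper,
# and a standby takes over automatically if the active master goes down.
master2.example.com
```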
01-05-2016
03:56 PM
Your requirements look good according to this doc: http://hortonworks.com/wp-content/uploads/2015/10/Hortonworks-Hive-ODBC-Driver-User-Guide.pdf I cannot comment on your version of Mac OS; I'm wondering if 10.11 is not yet supported, as the requirements only say 10.6 or later. Please paste the error you're getting and we can escalate it up the chain.
01-05-2016
03:49 PM
You can enable HCatalog commands in Pig programmatically; use the steps in the following article: https://community.hortonworks.com/questions/1954/hcatbin-is-not-defined-define-it-to-be-your-hcat-s.html
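As a quick orientation (not a substitute for the article above), the usual pattern is to launch Pig with the -useHCatalog flag so the HCatalog jars are on the classpath, then load Hive tables through HCatLoader; the table name below is hypothetical:

```
-- my_script.pig: read a Hive table through HCatalog (table name is hypothetical)
-- launch with:  pig -useHCatalog my_script.pig
A = LOAD 'default.my_table' USING org.apache.hive.hcatalog.pig.HCatLoader();
DUMP A;
```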
01-05-2016
03:29 PM
1 Kudo
I'm experimenting with Groovy scripts as custom UDFs in Hive, and I noticed that I can't use the same syntax in beeline as in the hive shell for executing custom UDFs. Is this a supported feature with different syntax, or is it not supported altogether? The following works as-is in the hive shell; in beeline the compile statement throws an error:

compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import groovy.json.JsonSlurper \;
import org.apache.hadoop.io.Text \;
public class JsonExtract extends UDF {
  public int evaluate(Text a) {
    def jsonSlurper = new JsonSlurper() \;
    def obj = jsonSlurper.parseText(a.toString()) \;
    return obj.val1 \;
  }
}` AS GROOVY NAMED json_extract.groovy;

hive> CREATE TEMPORARY FUNCTION json_extract as 'JsonExtract';
hive> select json_extract('{"val1": 2}') from date_dim limit 1;
OK
2
Labels:
- Apache Hive
01-05-2016
02:10 PM
Furthermore, there's a great how-to guide for setting up the Capacity Scheduler: https://community.hortonworks.com/articles/238/configuring-yarn-capacity-scheduler-with-ambari.html
01-05-2016
02:06 PM
These are best practices for ordering policies rather than migration per se; you may find more information by reading the Capacity Scheduler documentation: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-... section 10.3, "Best Practices for Ordering Policies":
- Ordering policies are configured on a per-queue basis, with the default ordering policy set to FIFO. Fairness is usually best for on-demand, interactive, or exploratory workloads, while FIFO can be more efficient for predictable, recurring batch processing. You should segregate these different types of workloads into queues configured with the appropriate ordering policy.
- In queues supporting both large and small applications, large applications can potentially "starve" (not receive sufficient resources). To avoid this scenario, use different queues for large and small jobs, or use size-based weighting to reduce the natural tendency of the ordering logic to favor smaller applications.
- Use the yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent property to restrict the number of concurrent applications running in a queue, so that too many applications do not run simultaneously. Limits on each queue are directly proportional to queue capacities and user limits. The property is specified as a float, for example 0.5 = 50%; the default is 10%. It can be set for all queues with yarn.scheduler.capacity.maximum-am-resource-percent and overridden on a per-queue basis with yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent (see the config sketch below).
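For illustration, a minimal capacity-scheduler.xml sketch of the AM-resource limit described above; the queue name "batch" is hypothetical:

```xml
<!-- Cluster-wide default: ApplicationMasters may use at most 20% of each queue's resources -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.2</value>
</property>

<!-- Per-queue override for a hypothetical "batch" queue under root -->
<property>
  <name>yarn.scheduler.capacity.root.batch.maximum-am-resource-percent</name>
  <value>0.5</value>
</property>
```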
01-05-2016
01:57 PM
Here's some information for you regarding job preemption; there are some parameters you can tune now, and others are on the roadmap: http://hortonworks.com/blog/better-slas-via-resour... This is how you would turn on job preemption in HDP 2.3: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-...
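At a high level, Capacity Scheduler preemption is switched on through the ResourceManager's scheduler monitor in yarn-site.xml; a minimal sketch (see the HDP doc linked above for the full set of tuning knobs):

```xml
<!-- Enable the scheduler monitor that drives preemption decisions -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>

<!-- Use the Capacity Scheduler's proportional preemption policy -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
```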
01-05-2016
02:21 AM
Please accept the answer to close the thread.