- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Possible to Utilize CBO in the Hive view?
- Labels:
-
Apache Ambari
-
Apache Hive
Created 01-05-2016 05:03 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm following a tutorial on the Hortonworks site, http://hortonworks.com/hadoop-tutorial/superchargi...
I'm wondering if it is possible to get the CBO working. I've tried setting the 4 params and getting the table stats, but for some reason I'm getting an error which looks like:
org.apache.ambari.view.hive.client.HiveErrorStatusException: H110 Unable to submit statement. Error while processing statement: Cannot modify set hive.cbo.enable at runtime. It is not in list of params that are allowed to be modified at runtime [ERROR_STATUS]
I get the error across 3 different params. How do I get the CBO working?
Created 01-06-2016 07:21 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The error message is being thrown by Hive SQLStandardAuth authorization mode which disallows hive.cbo.enable to be set in the HS2 client session. As @Andrew Grande mentioned CBO should be enabled by default. For CBO to get exercised make sure stats are generated for the table in question.
ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS; ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS FOR COLUMNS;
Created on 01-05-2016 05:56 PM - edited 08-19-2019 05:20 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In recent HDP versions you don't have to explicitly enable it, it should be on by default already. You can check via Ambari -> Hive -> Configs. It will have the CBO switch like below:
Created 01-06-2016 03:15 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks Andrew! I did not know that. I probably should have done some digging myself to figure that out. However I'm still having issue when I run an EXPLAIN and it says my query is not optimized by CBO. See my comment to @Deepesh's response
Created 01-05-2016 07:21 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The error message seems to indicate that you cannot simply enable CBO for your session or connection (i.e. with a set statement). Rather, it should be enabled for the HiveServer2 instance "cluster wide", and not per user session. This could be due to how the Calcite optimizer is initialized or executed.
Created 01-06-2016 07:21 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The error message is being thrown by Hive SQLStandardAuth authorization mode which disallows hive.cbo.enable to be set in the HS2 client session. As @Andrew Grande mentioned CBO should be enabled by default. For CBO to get exercised make sure stats are generated for the table in question.
ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS; ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS FOR COLUMNS;
Created 01-06-2016 02:26 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
@Andrew Grande, Thank you I saw that it was enabled in my settings
After running both ANALYZE queries, I attempt to execute an EXPLAIN on the query I'm trying to run.
-- Ran these first analyze table hvac_orc compute statistics; analyze table hvac_orc compute statistics for columns; -- Then ran this one EXPLAIN select buildingid, max(targettemp-actualtemp) from hvac_orc group by buildingid;
However when I do this and read the results, it still says CBO is not on, and I see this specific line in the results:
Statistics:Num rows: 4001 Data size: 116289 Basic stats: COMPLETE Column stats: NONE
Created 01-06-2016 02:30 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It looks like the column stats aren't being saved.
Created 01-06-2016 03:48 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Some queries won't engage CBO, simply because there's nothing for it to optimize in a plan. Are you observing slow performance?
Created 01-07-2016 01:37 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Performance really isn't slow when executing the query.
This is interesting. I figured that because the query had utilized the CBO in the tutorial I linked in the original question that it would still work now. I guess my thinking is incorrect?