Support Questions

Find answers, ask questions, and share your expertise

Possible to Utilize CBO in the Hive view?

avatar
Expert Contributor

I'm following a tutorial on the Hortonworks site, http://hortonworks.com/hadoop-tutorial/superchargi...

I'm wondering if it is possible to get the CBO working. I've tried setting the 4 params and getting the table stats, but for some reason I'm getting an error which looks like:

   org.apache.ambari.view.hive.client.HiveErrorStatusException: H110 Unable to submit statement. Error while processing statement: Cannot modify set hive.cbo.enable at runtime. It is not in list of params that are allowed to be modified at runtime [ERROR_STATUS]

I get the error across 3 different params. How do I get the CBO working?

1 ACCEPTED SOLUTION

avatar

The error message is being thrown by Hive SQLStandardAuth authorization mode which disallows hive.cbo.enable to be set in the HS2 client session. As @Andrew Grande mentioned CBO should be enabled by default. For CBO to get exercised make sure stats are generated for the table in question.

ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS;
ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS FOR COLUMNS;

View solution in original post

8 REPLIES 8

avatar

In recent HDP versions you don't have to explicitly enable it, it should be on by default already. You can check via Ambari -> Hive -> Configs. It will have the CBO switch like below:

1198-screenshot.png

avatar
Expert Contributor

Thanks Andrew! I did not know that. I probably should have done some digging myself to figure that out. However I'm still having issue when I run an EXPLAIN and it says my query is not optimized by CBO. See my comment to @Deepesh's response

avatar

The error message seems to indicate that you cannot simply enable CBO for your session or connection (i.e. with a set statement). Rather, it should be enabled for the HiveServer2 instance "cluster wide", and not per user session. This could be due to how the Calcite optimizer is initialized or executed.

avatar

The error message is being thrown by Hive SQLStandardAuth authorization mode which disallows hive.cbo.enable to be set in the HS2 client session. As @Andrew Grande mentioned CBO should be enabled by default. For CBO to get exercised make sure stats are generated for the table in question.

ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS;
ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS FOR COLUMNS;

avatar
Expert Contributor

@Andrew Grande, Thank you I saw that it was enabled in my settings

After running both ANALYZE queries, I attempt to execute an EXPLAIN on the query I'm trying to run.

-- Ran these first
analyze table hvac_orc compute statistics;
analyze table hvac_orc compute statistics for columns;

-- Then ran this one
EXPLAIN select buildingid, max(targettemp-actualtemp)  from hvac_orc group  by buildingid;

However when I do this and read the results, it still says CBO is not on, and I see this specific line in the results:

Statistics:Num rows: 4001 Data size: 116289 Basic stats: COMPLETE Column stats: NONE

avatar
Expert Contributor

It looks like the column stats aren't being saved.

avatar

Some queries won't engage CBO, simply because there's nothing for it to optimize in a plan. Are you observing slow performance?

avatar
Expert Contributor

Performance really isn't slow when executing the query.

This is interesting. I figured that because the query had utilized the CBO in the tutorial I linked in the original question that it would still work now. I guess my thinking is incorrect?