Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Possible to Utilize CBO in the Hive view?

Solved Go to solution

Possible to Utilize CBO in the Hive view?

Rising Star

I'm following a tutorial on the Hortonworks site, http://hortonworks.com/hadoop-tutorial/superchargi...

I'm wondering if it is possible to get the CBO working. I've tried setting the 4 params and getting the table stats, but for some reason I'm getting an error which looks like:

   org.apache.ambari.view.hive.client.HiveErrorStatusException: H110 Unable to submit statement. Error while processing statement: Cannot modify set hive.cbo.enable at runtime. It is not in list of params that are allowed to be modified at runtime [ERROR_STATUS]

I get the error across 3 different params. How do I get the CBO working?

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted

Re: Possible to Utilize CBO in the Hive view?

Master Collaborator

The error message is being thrown by Hive SQLStandardAuth authorization mode which disallows hive.cbo.enable to be set in the HS2 client session. As @Andrew Grande mentioned CBO should be enabled by default. For CBO to get exercised make sure stats are generated for the table in question.

ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS;
ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS FOR COLUMNS;

View solution in original post

8 REPLIES 8
Highlighted

Re: Possible to Utilize CBO in the Hive view?

In recent HDP versions you don't have to explicitly enable it, it should be on by default already. You can check via Ambari -> Hive -> Configs. It will have the CBO switch like below:

1198-screenshot.png

Highlighted

Re: Possible to Utilize CBO in the Hive view?

Rising Star

Thanks Andrew! I did not know that. I probably should have done some digging myself to figure that out. However I'm still having issue when I run an EXPLAIN and it says my query is not optimized by CBO. See my comment to @Deepesh's response

Highlighted

Re: Possible to Utilize CBO in the Hive view?

The error message seems to indicate that you cannot simply enable CBO for your session or connection (i.e. with a set statement). Rather, it should be enabled for the HiveServer2 instance "cluster wide", and not per user session. This could be due to how the Calcite optimizer is initialized or executed.

Highlighted

Re: Possible to Utilize CBO in the Hive view?

Master Collaborator

The error message is being thrown by Hive SQLStandardAuth authorization mode which disallows hive.cbo.enable to be set in the HS2 client session. As @Andrew Grande mentioned CBO should be enabled by default. For CBO to get exercised make sure stats are generated for the table in question.

ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS;
ANALYZE TABLE <table> [partion(key)] COMPUTE STATISTICS FOR COLUMNS;

View solution in original post

Highlighted

Re: Possible to Utilize CBO in the Hive view?

Rising Star

@Andrew Grande, Thank you I saw that it was enabled in my settings

After running both ANALYZE queries, I attempt to execute an EXPLAIN on the query I'm trying to run.

-- Ran these first
analyze table hvac_orc compute statistics;
analyze table hvac_orc compute statistics for columns;

-- Then ran this one
EXPLAIN select buildingid, max(targettemp-actualtemp)  from hvac_orc group  by buildingid;

However when I do this and read the results, it still says CBO is not on, and I see this specific line in the results:

Statistics:Num rows: 4001 Data size: 116289 Basic stats: COMPLETE Column stats: NONE
Highlighted

Re: Possible to Utilize CBO in the Hive view?

Rising Star

It looks like the column stats aren't being saved.

Highlighted

Re: Possible to Utilize CBO in the Hive view?

Some queries won't engage CBO, simply because there's nothing for it to optimize in a plan. Are you observing slow performance?

Highlighted

Re: Possible to Utilize CBO in the Hive view?

Rising Star

Performance really isn't slow when executing the query.

This is interesting. I figured that because the query had utilized the CBO in the tutorial I linked in the original question that it would still work now. I guess my thinking is incorrect?

Don't have an account?
Coming from Hortonworks? Activate your account here