Created on 12-20-2017 06:14 AM - edited 09-16-2022 05:39 AM
Hello,
Please advise: is it possible to upgrade Hive to 1.2.0 or higher on a Cloudera cluster?
Thanks,
Maxim
Created 12-29-2017 03:54 AM
Hi, everybody!
I'm new to Cloudera, but I'm quite surprised that Cloudera's Hadoop distribution doesn't support versions of Hive later than 1.1.0.
Many changes have been made since that version that affect performance, SQL support (UNION instead of only UNION ALL), and so on.
Is there something that can be used instead of Hive to store and manipulate data the SQL way?
I'm asking because I can't believe that Cloudera can't include the latest version of Hive, so I assume some other solution is used for this purpose.
Best regards, Daniil.
Created 01-02-2018 02:51 AM
Created 01-08-2018 11:29 PM
Hi, EricL.
I've read about the things you wrote earlier, and I hope that all the innovations and optimizations made by the Hive developers are applied in the Hive distributed with Cloudera.
But still.
select * from clickstream_csv union select * FROM clickstream_bad LIMIT 100;
returns
Error while compiling statement: FAILED: ParseException line 3:0 missing ALL at 'select' near '<EOF>'
So the UNION statement definitely cannot be used.
This makes me doubt that the changes made by the Hive developers since version 1.1 have been included.
My current Cloudera distribution is 5.12.1.
With best regards, Daniil.
Created 01-09-2018 08:13 AM
Daniil,
Hive 1.1 (CDH 5.4+) only offers UNION ALL (bag union), in which duplicate rows are not eliminated. Starting with Hive 1.2, the UNION DISTINCT feature was introduced, and if no UNION type is explicitly specified, the default UNION operation is DISTINCT. However, the introduction of this new UNION DISTINCT capability brought some other subtle changes to how UNION ALL works. We are unable to introduce those changes into CDH 5 because of the risk of affecting existing workloads. It will be available in CDH 6.
In CDH 5, the only supported UNION is UNION ALL. If it fulfills your business requirements, please include the ALL keyword.
select * from clickstream_csv UNION ALL select * FROM clickstream_bad LIMIT 100;
You may then pass the result through a DISTINCT clause to achieve the same effect as UNION DISTINCT.
select distinct(salary) from ( select salary from sample_07 union ALL select salary from sample_08) z;
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union
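For anyone who wants to see the equivalence concretely, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for Hive, which is an assumption made purely so the example runs without a cluster. The rows are made up; only the table and column names are taken from the query above. It is an illustration of bag union versus set union, not a statement about how Hive executes either query.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample_07 (salary INTEGER);
    CREATE TABLE sample_08 (salary INTEGER);
    -- Made-up rows; 200 appears in both tables and twice in sample_07.
    INSERT INTO sample_07 VALUES (100), (200), (200);
    INSERT INTO sample_08 VALUES (200), (300);
""")

def salaries(sql):
    """Run a query and return the single result column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Bag union: duplicates are kept.
bag_union = salaries(
    "SELECT salary FROM sample_07 UNION ALL SELECT salary FROM sample_08"
)

# Workaround: DISTINCT applied on top of UNION ALL.
workaround = salaries(
    "SELECT DISTINCT salary FROM ("
    " SELECT salary FROM sample_07 UNION ALL SELECT salary FROM sample_08"
    ") z"
)

# Set union, i.e. plain UNION where the engine supports it.
set_union = salaries(
    "SELECT salary FROM sample_07 UNION SELECT salary FROM sample_08"
)

print(bag_union)
print(workaround)
print(set_union)
```

The last two lists should be identical, while the first one still contains the repeated 200 values.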
Thanks.
Created 01-09-2018 10:27 PM
Hello, David.
Thanks for your response.
I knew the difference between UNION and UNION ALL, and how to eliminate duplicates by combining UNION ALL with DISTINCT.
What's new to me is that you are going to use a newer version in CDH 6. That's good. Looking forward to it.
Can I check the CDH release roadmap somewhere?
Thanks.
Created 01-10-2018 06:04 AM
Daniil,
We do not have a publicly available roadmap for CDH 6 yet. And while nothing is final until it is final, I think it's safe to say that we will be upgrading to at least Hive 1.2, which includes this requested feature.
Thanks.
Created 01-02-2018 02:44 AM
Created 05-09-2018 01:12 AM
Need Hive 2 for ACID
Created on 06-21-2018 12:46 PM - edited 06-21-2018 12:55 PM
I have a similar problem using my data modeling tool (erwin). CDH 5.15 is unable to process the queries generated by erwin, which supports Hive 2.1. When I attempt to reverse engineer my MySQL metastore, the following errors are returned:
--This file is published using to trace the exected SQL
--ERwin RE SQL Trace for Hive 2.1.x started on 2018-06-21 10:23:42
--[2018-06-21 10:23:42] Model DBMSMetastoreMySQLVersion
--[2018-06-21 10:23:43] Database error: [Cloudera][SQLEngine] (31740) Table or view not found: HIVE..VERSION
--[2018-06-21 10:23:43] Database error: [Cloudera][SQLEngine] (31740) Table or view not found: HIVE..VERSION
--[2018-06-21 10:23:43] Entity ObjectsMetastoreMySQL
--[2018-06-21 10:23:43] Entity DatabaseFilterMetastoreMySQL
--[2018-06-21 10:23:43] Entity TableFilterMetastoreMySQL
--[2018-06-21 10:23:44] Database error: [Cloudera][SQLEngine] (31740) Table or view not found: HIVE..TBLS
--[2018-06-21 10:23:44] Database error: [Cloudera][SQLEngine] (31740) Table or view not found: HIVE..TBLS
--[2018-06-21 10:23:44] Model DBMSMetastoreMySQLVersion
--[2018-06-21 10:23:45] Database error: [Cloudera][SQLEngine] (31740) Table or view not found: HIVE..VERSION
--[2018-06-21 10:23:45] Database error: [Cloudera][SQLEngine] (31740) Table or view not found: HIVE..VERSION
--ERwin RE SQL Trace for Hive 2.1.x ended on 2018-06-21 10:23:45
When will Cloudera support the required queries so that I can successfully reverse and forward engineer to my MySQL metastore database using erwin?
***NEIL***