- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Error Configuring MySQL for Hive Metastore: column A0.SCHEMA_VERSION_V2 does not exist
- Labels:
-
Apache Hive
-
Cloudera Manager
Created on ‎01-11-2018 03:57 AM - edited ‎09-16-2022 05:43 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am trying to configure a pseudo cluster with both CM 5.13 and CHD 5.13 installed in the cluster nodes (docker containers), but am having some problems getting the Hive service work.
Specifically the issue I have is that the Hive Metastore is not using my local MySQL database, but is instead defaulting to a non existent Postgresql database .
Here are the steps I took:
1) I configured Mysql server and a database for the metastore by following the official steps here:
And as described in the last step, I added the following properties to /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml:
(...) xml that was already there (...) <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:mysql://localhost/metastore</value> <description>the URL of the MySQL database</description> </property> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>com.mysql.jdbc.Driver</value> </property> <property> <name>javax.jdo.option.ConnectionUserName</name> <value>hive</value> </property> <property> <name>javax.jdo.option.ConnectionPassword</name> <value>mypassword</value> </property> <property> <name>datanucleus.autoCreateSchema</name> <value>false</value> </property> <property> <name>datanucleus.fixedDatastore</name> <value>true</value> </property> <property> <name>datanucleus.autoStartMechanism</name> <value>SchemaTable</value> </property> <property> <name>hive.metastore.schema.verification</name> <value>true</value> </property>
And finally I restarted the CMS.
2) After this I tried to start the Hive Metastore from the CM but it did not work, and after inspecting the stderr log I found the following error:
+ exec /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/bin/hive --config /var/run/cloudera-scm-agent/process/188-hive-HIVEMETASTORE --service metastore -p 9083~ javax.jdo.JDOException: Exception thrown when executing query at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596) .... org.postgresql.util.PSQLException: ERROR: column A0.SCHEMA_VERSION_V2 does not exist
And indeed, after checking
/var/run/cloudera-scm-agent/process/188-hive-HIVEMETASTORE/hive-site.xml
The first line of the file was:
<property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:postgresql://headnode1.cluster:7432/hive</value> </property>
The error "column A0.SCHEMA_VERSION_V2 does not exist", from what I could tell, is related to the metastore database schema not being correct, but in my Mysql metastore database I used:
SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;
So I believe the problem here is the fact that the CSM is ignoring the config file in /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml
Any ideas on this issue?
Thank you very much.
Created ‎01-11-2018 01:05 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello @mm25,
/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml is merely a stock file that is not used to start a service. When you manage a cluster with Cloudera Manager, Cloudera Manager will initiate the start of the service. It will use what you have configured in Cloudera Manager to generate all files necessary to start the process. This includes hive-site.xml.
So, in short, the answer to why Hive is trying to use postgres is that you have it configured to do so in the configuration stored in Cloudera Manager. To affect change you will need to change the configuration within Cloudera Manager's configuration.
- go to Clusters --> Hive
- Click on Configuration (subtab)
This will display your configuration.
- On the left side, Click on Hive Metastore Database to view all database-related configuraiton options.
- You can update the configuration and Save/Restart Hive to have the changes take effect.
If you want to view the generated configuration files, you can view them in a Role's Processes subtab or view them at the commandline in /var/run/cloudera-scm-agent/process
Each time a process is started by Cloudera Manager, a new process directory is created under that dir.
I generally use ls -lrt |grep hive or something of the sort to see the most recent process directory created for a service's roles.
Hope that does the trick!
Created ‎01-11-2018 01:05 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello @mm25,
/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml is merely a stock file that is not used to start a service. When you manage a cluster with Cloudera Manager, Cloudera Manager will initiate the start of the service. It will use what you have configured in Cloudera Manager to generate all files necessary to start the process. This includes hive-site.xml.
So, in short, the answer to why Hive is trying to use postgres is that you have it configured to do so in the configuration stored in Cloudera Manager. To affect change you will need to change the configuration within Cloudera Manager's configuration.
- go to Clusters --> Hive
- Click on Configuration (subtab)
This will display your configuration.
- On the left side, Click on Hive Metastore Database to view all database-related configuraiton options.
- You can update the configuration and Save/Restart Hive to have the changes take effect.
If you want to view the generated configuration files, you can view them in a Role's Processes subtab or view them at the commandline in /var/run/cloudera-scm-agent/process
Each time a process is started by Cloudera Manager, a new process directory is created under that dir.
I generally use ls -lrt |grep hive or something of the sort to see the most recent process directory created for a service's roles.
Hope that does the trick!
Created ‎01-17-2018 09:59 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @bgooley,
Sorry about the late reply. But yup, that was exactly the issue, once I inserted the right configurations in Cloudera Manager it connected to the Mysql server correctly
Thank you for the help!
Created ‎01-17-2018 10:01 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Awesome! Thank you for following up with the good news!
