Support Questions

Find answers, ask questions, and share your expertise

Error Configuring MySQL for Hive Metastore: column A0.SCHEMA_VERSION_V2 does not exist

avatar
New Contributor

I am trying to configure a pseudo cluster with both CM 5.13 and CHD 5.13 installed in the cluster nodes  (docker containers), but am having some problems getting the Hive service work.

Specifically the issue I have is that the Hive Metastore is not using my local MySQL database, but is instead defaulting to a non existent Postgresql database .

Here are the steps I took:

 

1) I configured Mysql server and a database for the metastore by following the official steps here:

https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_hive_metastore_configure.html...

And as described in the last step, I added the following properties to /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml:

 

(...) xml that was already there (...)

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost/metastore</value>
  <description>the URL of the MySQL database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>mypassword</value>
</property>

<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>

<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>

<property>
  <name>datanucleus.autoStartMechanism</name> 
  <value>SchemaTable</value>
</property> 

<property>
<name>hive.metastore.schema.verification</name>
<value>true</value>
</property>

And finally I restarted the CMS.

 

2) After this I tried to start the Hive Metastore from the CM  but it did not work, and after inspecting the stderr log I found the following error: 

+ exec /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/bin/hive --config /var/run/cloudera-scm-agent/process/188-hive-HIVEMETASTORE --service metastore -p 9083~

javax.jdo.JDOException: Exception thrown when executing query
	at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
....
org.postgresql.util.PSQLException: ERROR: column A0.SCHEMA_VERSION_V2 does not exist

And indeed, after checking 

/var/run/cloudera-scm-agent/process/188-hive-HIVEMETASTORE/hive-site.xml

 The first line of the file was: 

<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:postgresql://headnode1.cluster:7432/hive</value>
</property>

 The error  "column A0.SCHEMA_VERSION_V2 does not exist", from what I could tell, is related to the metastore database schema not being correct, but in my Mysql metastore database I used:

SOURCE /usr/lib/hive/scripts/metastore/upgrade/mysql/hive-schema-1.1.0.mysql.sql;

 

So I believe the problem here is the fact that the CSM is ignoring the config file in /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml 

 

Any ideas on this issue?

Thank you very much.

1 ACCEPTED SOLUTION

avatar
Master Guru

Hello @mm25,

 

/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml is merely a stock file that is not used to start a service.  When you manage a cluster with Cloudera Manager, Cloudera Manager will initiate the start of the service.  It will use what you have configured in Cloudera Manager to generate all files necessary to start the process.  This includes hive-site.xml.

 

So, in short, the answer to why Hive is trying to use postgres is that you have it configured to do so in the configuration stored in Cloudera Manager.  To affect change you will need to change the configuration within Cloudera Manager's configuration.

 

- go to Clusters --> Hive

- Click on Configuration (subtab)

 

This will display your configuration.

- On the left side, Click on Hive Metastore Database to view all database-related configuraiton options.

- You can update the configuration and Save/Restart Hive to have the changes take effect.

 

If you want to view the generated configuration files, you can view them in a Role's Processes subtab or view them at the commandline in /var/run/cloudera-scm-agent/process

 

Each time a process is started by Cloudera Manager, a new process directory is created under that dir.

I generally use ls -lrt |grep hive or something of the sort to see the most recent process directory created for a service's roles.

 

Hope that does the trick!

 

View solution in original post

3 REPLIES 3

avatar
Master Guru

Hello @mm25,

 

/opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/lib/hive/conf/hive-site.xml is merely a stock file that is not used to start a service.  When you manage a cluster with Cloudera Manager, Cloudera Manager will initiate the start of the service.  It will use what you have configured in Cloudera Manager to generate all files necessary to start the process.  This includes hive-site.xml.

 

So, in short, the answer to why Hive is trying to use postgres is that you have it configured to do so in the configuration stored in Cloudera Manager.  To affect change you will need to change the configuration within Cloudera Manager's configuration.

 

- go to Clusters --> Hive

- Click on Configuration (subtab)

 

This will display your configuration.

- On the left side, Click on Hive Metastore Database to view all database-related configuraiton options.

- You can update the configuration and Save/Restart Hive to have the changes take effect.

 

If you want to view the generated configuration files, you can view them in a Role's Processes subtab or view them at the commandline in /var/run/cloudera-scm-agent/process

 

Each time a process is started by Cloudera Manager, a new process directory is created under that dir.

I generally use ls -lrt |grep hive or something of the sort to see the most recent process directory created for a service's roles.

 

Hope that does the trick!

 

avatar
New Contributor

Hi @bgooley,

Sorry about the late reply. But yup, that was exactly the issue, once I inserted the right configurations in Cloudera Manager it connected to the Mysql server correctly

Thank you for the help! 

avatar
Master Guru

Awesome!  Thank you for following up with the good news!