I upgraded from CDH 5.4.5 to 5.7.0 on a test deployment to prepare for a production upgrade.
After the upgrade to 5.7.0, none of the CM services (Event Server, Host Monitor, etc.) could be started. They timed out after 150 seconds, and I could not find any error message beyond the timeout, nor could I run those services manually to get closer to the problem. I then decided to downgrade back to 5.4.5, but it complained that there's no such version as "el7". I thought it would solve the problem if I removed those rows from the parcel and parcel_components tables in the scm database, but that made things worse. Now I'm getting the error below, and I'm stuck. It seems that something from 5.7.0 is still left behind, or cached somewhere.
Path: http://bdt001:7180/cmf/parcel/topLevelCount
Version: Cloudera Express 5.4.5 (#5 built by jenkins on 20150728-0320 git: 0f47e1327ab87d49c6f504fe3e09e4022fa63ba1)
com.google.common.util.concurrent.UncheckedExecutionException:org.hibernate.PropertyAccessException: Exception occurred inside setter of com.cloudera.cmf.model.DbProcess.resourcesForDb
at LocalCache.java line 2263 in com.google.common.cache.LocalCache$Segment get()
Stack Trace:
LocalCache.java line 2263 in com.google.common.cache.LocalCache$Segment get()
LocalCache.java line 4000 in com.google.common.cache.LocalCache get()
LocalCache.java line 4004 in com.google.common.cache.LocalCache getOrLoad()
LocalCache.java line 4874 in com.google.common.cache.LocalCache$LocalLoadingCache get()
ParcelActiveStatusProviderImpl.java line 68 in com.cloudera.parcel.components.ParcelActiveStatusProviderImpl getParcelActiveStatus()
ClusterParcelStatus.java line 480 in com.cloudera.parcel.ClusterParcelStatus of()
ParcelManagerImpl.java line 606 in com.cloudera.parcel.components.ParcelManagerImpl getParcelStatus()
ParcelManagerImpl.java line 597 in com.cloudera.parcel.components.ParcelManagerImpl getActionCount()
ParcelManagerImpl.java line 70 in com.cloudera.parcel.components.ParcelManagerImpl access$100()
ParcelManagerImpl.java line 123 in com.cloudera.parcel.components.ParcelManagerImpl$1 get()
I would appreciate any ideas on how to solve this.
I'm using Oracle Java 1.7.0_75-b13, on CentOS 6 servers.
Created 05-04-2016 12:37 PM
I have managed to sort things out, at least as far as the downgrade goes.
First, I rebooted every node in the cluster to stop every running process and start from a clean state.
Second, I cleared the processes and process_active_releases tables in the scm database.
After that, CM was usable at last.
Host Monitor still had an issue with a different LevelDB schema, so I had to clear its database in /var/lib/cloudera-host-monitor.
After that I just had to restart every service.
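Rather than deleting the Host Monitor data outright, one option is to move it aside first so it can be restored if needed. The following is only a sketch: the `move_aside` helper and the `.bak.<timestamp>` naming are my own assumptions, not anything provided by Cloudera Manager, and it relies on `find` supporting `-mindepth`/`-maxdepth` (as GNU find on CentOS does). It should be run as root with the Host Monitor role stopped.

```shell
#!/bin/sh
# Hypothetical helper: move the contents of a data directory into a
# timestamped backup directory next to it, leaving the original
# directory in place but empty.
# Usage: move_aside DIR   (prints the backup directory path on success)
move_aside() {
    dir=${1%/}
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    backup="$dir.bak.$(date +%Y%m%d%H%M%S)"
    mkdir "$backup" || return 1
    # Move every top-level entry, including dotfiles, into the backup.
    find "$dir" -mindepth 1 -maxdepth 1 -exec mv {} "$backup"/ \;
    echo "$backup"
}

# Example (assumed path from the post above):
# move_aside /var/lib/cloudera-host-monitor
```

The original directory keeps its ownership and permissions, so the role should be able to recreate its database there on the next start.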
Created 05-04-2016 12:40 PM
If you're in doubt about how to perform the scm database operations:
cd /var/lib/cloudera-scm-server-db/data/
export PGPASSWORD=`head -1 generated_password.txt`
sudo -Eu cloudera-scm psql -p 7432 scm
scm=# delete from process_active_releases;
scm=# delete from processes;
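Before running the deletes above, it may be worth dumping the two tables so they can be restored if something goes wrong. This is a minimal sketch, assuming `pg_dump` is installed alongside `psql`, that `PGPASSWORD` is exported as shown above, and that the function is invoked from a shell running as the cloudera-scm user. The `backup_scm_tables` name and the default output path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: dump the processes and process_active_releases
# tables of the scm database (embedded PostgreSQL on port 7432) to a file.
# Usage: backup_scm_tables [OUTPUT_FILE]
backup_scm_tables() {
    out=${1:-/tmp/scm_processes_backup.sql}   # assumed default location
    # -t selects a table (may be repeated), -f names the output file.
    pg_dump -p 7432 -t processes -t process_active_releases -f "$out" scm
}

# Example:
# backup_scm_tables /tmp/scm_processes_backup.sql
```

The resulting file is plain SQL, so it could be replayed with `psql -p 7432 scm -f <file>` after the tables have been emptied.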