Unable to start services through the Ambari UI

Explorer

I have a 10-node cluster managed with Ambari 2.2.1. The configuration database is MySQL and the HDP version is 2.4.0. After the DB filesystem ran out of space, I am unable to restart or stop any service through the UI. Even after restarting the cluster, I still cannot start services. All requests get stuck and I cannot abort them. Can someone help me?

1 ACCEPTED SOLUTION

Master Mentor

Please restore a backup of the database to a new, larger partition and restart the Ambari server. The cleanup of the DB concerns me.

8 REPLIES 8

You mention a space issue on the DB. What exactly happened? Did you resolve it before trying to restart?

Explorer

Thank you Deepak. I mean the configuration database was on a small disk partition that filled up, so MySQL shut down. We did a cleanup, then restarted MySQL and the Ambari server. Since then nothing works normally.

Master Guru

@Samie WALA Can you post what the Ambari log is reporting when you try to restart a service?

Explorer

@Sunile Manjee Yes, the ambari-server.log file contains a lot of information. Among other things, it showed that some tables had crashed and needed to be repaired, which is what I did first. After that, I am seeing this Java error:

15 Jul 2016 20:11:34,776  WARN [ambari-action-scheduler] ActionScheduler:200 - Exception received
java.lang.RuntimeException: Invalid DB state, broken one-to-one relation for taskId=30710
	at org.apache.ambari.server.actionmanager.HostRoleCommand.getExecutionCommandWrapper(HostRoleCommand.java:371)
	at org.apache.ambari.server.actionmanager.Stage.loadExecutionCommandWrappers(Stage.java:216)
	at org.apache.ambari.server.actionmanager.Stage.checkWrappersLoaded(Stage.java:203)
	at org.apache.ambari.server.actionmanager.Stage.getExecutionCommands(Stage.java:595)
	at org.apache.ambari.server.actionmanager.ActionScheduler.isStageHasBackgroundCommandsOnly(ActionScheduler.java:529)
	at org.apache.ambari.server.actionmanager.ActionScheduler.filterParallelPerHostStages(ActionScheduler.java:485)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:251)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:195)
	at java.lang.Thread.run(Thread.java:745)


Master Guru

@Samie WALA I have not seen this before. I need to take a quick look at the code. Until then, @Artem Ervits, have you seen this?

Master Mentor

Please restore a backup of the database to a new, larger partition and restart the Ambari server. The cleanup of the DB concerns me.
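
For readers who want to script the restore, here is a minimal sketch of feeding a SQL dump back into MySQL from Python. It is only an illustration: the function names, the database name `ambari`, the user name, and the dump path are all assumptions (none of them are given in this thread), and it assumes the MySQL password is supplied through an option file such as `~/.my.cnf`. Stopping the Ambari server before the restore and starting it afterwards is assumed as well.

```python
import subprocess


def build_restore_command(db_name, user):
    """Build the argument list for the `mysql` client (hypothetical helper).

    The dump itself is fed on stdin, so it is not part of the command.
    """
    return ["mysql", f"--user={user}", db_name]


def restore_dump(db_name, user, dump_path):
    """Pipe a SQL dump file into the MySQL client.

    Assumes the `mysql` client is on PATH and credentials come from an
    option file; raises CalledProcessError if the client exits non-zero.
    """
    with open(dump_path, "rb") as dump_file:
        subprocess.run(
            build_restore_command(db_name, user),
            stdin=dump_file,
            check=True,
        )


if __name__ == "__main__":
    # Placeholder values, not taken from the thread.
    restore_dump("ambari", "ambari", "/path/to/ambari-backup.sql")
```

Building the command as a list rather than a shell string avoids quoting problems if the database or user name contains unusual characters.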

Explorer

I finally solved the issue by tracking taskId 30710 in the Ambari DB. I found an inconsistency caused by the DB crash: the tables "execution_command" and "host_role_command" contained references to this taskId, while the table "task" had no reference to it. These two tables were the ones I repaired after the DB crashed.

After cleaning up these tables I was able to restart all services.
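
The kind of check described above (looking for a task ID that exists in one table but not in another) can be sketched generically. The example below is not the exact query used here. It runs against a toy in-memory SQLite database with two simplified tables named after the ones in this post, and it assumes each has a `task_id` column. The helper names are hypothetical, and the real Ambari schema should be inspected (and backed up) before running anything similar against it.

```python
import sqlite3


def find_orphan_task_ids(conn, left_table, right_table, key="task_id"):
    """Return key values present in left_table but missing from right_table."""
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    for name in (left_table, right_table, key):
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe identifier: {name!r}")
    query = (
        f"SELECT l.{key} FROM {left_table} AS l "
        f"LEFT JOIN {right_table} AS r ON l.{key} = r.{key} "
        f"WHERE r.{key} IS NULL ORDER BY l.{key}"
    )
    cur = conn.cursor()
    cur.execute(query)
    return [row[0] for row in cur.fetchall()]


def build_demo_db():
    """Create a toy database in which task 30710 has lost its counterpart row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE host_role_command (task_id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE execution_command (task_id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO host_role_command VALUES (?)",
        [(30709,), (30710,), (30711,)],
    )
    conn.executemany(
        "INSERT INTO execution_command VALUES (?)",
        [(30709,), (30711,)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_orphan_task_ids(demo, "host_role_command", "execution_command"))
```

Running the check in both directions (swapping the two table arguments) shows which side of the relation is missing rows.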

But to avoid any related problems that could occur later, I restored an older Ambari DB as @Artem Ervits suggested.

Now all services are running fine.

Thank you all for your help.