App Timeline server duplicated (sort of) after failed move

Explorer

I'm running Ambari 2.1.0. I tried to move the App Timeline Server (ATS), and the process failed (for various reasons).

I was able to bring Ambari back up, but now it seems to think it has two ATS masters. To get things operational, I put one of them (the one I was trying to move the service to) in maintenance mode and started the original one. Now I have a permanent alert on the first host saying that it failed to connect to the ATS on the second host.

From the dashboard, YARN appears to be up and operational, but I'm not sure whether it really is.

Any suggestions on how I might be able to untangle this?

1 ACCEPTED SOLUTION

Contributor

Thanks for sharing the output. Yes, that's exactly what I meant (REST API call to get the ATS instances registered with Ambari).

To delete the bad ATS instance from Ambari, you can issue the following API call:

curl -u admin:admin -k -H "X-Requested-By: ambari" -X DELETE https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/<hostname-with-bad-ATS>/host_components/APP_T...

17 REPLIES

Contributor

Hi Wayne.

What's the output of /api/v1/clusters/<cluster_name>/host_components?HostRoles/component_name=APP_TIMELINE_SERVER

Do you see two ATS instances in the output?
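Note that the path above is a resource on the Ambari REST API rather than a shell command, so it has to be requested over HTTP from the Ambari server (with curl, a browser, or a script). A minimal Python sketch of how the full URL might be assembled; the function name and the base URL are illustrative assumptions, not part of Ambari:

```python
from urllib.parse import quote


def host_components_query(base_url: str, cluster: str, component: str) -> str:
    """Build the URL that lists every host on which `component` is registered."""
    # The part after "?" is a filter on the HostRoles/component_name field.
    return (
        f"{base_url.rstrip('/')}/api/v1/clusters/{quote(cluster, safe='')}"
        f"/host_components?HostRoles/component_name={quote(component, safe='')}"
    )


# Hypothetical server address; substitute your own Ambari host, port and cluster name.
url = host_components_query("https://localhost:8443", "ROGERGPFS", "APP_TIMELINE_SERVER")
```

A plain GET on that URL with admin credentials should return the list of matching host components as JSON.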

Explorer

I cannot seem to find that command.

Explorer

However, I might have gotten what you want in a different manner:

[root@cg-hm08 ~]# curl -i -uadmin:<> -k -H "X-Requested-By: ambari" -d '{"HostRoles": { "state": "STARTED"}}' -X GET 'https://localhost:8443/api/v1/clusters/ROGERGPFS/host_components?HostRoles/component_name=APP_TIMELINE_SERVER'

HTTP/1.1 200 OK
User: admin
Set-Cookie: AMBARISESSIONID=1jmupkh1wyuoo8n3d84vc8zsc;Path=/;Secure;HttpOnly
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Type: text/plain
Vary: Accept-Encoding, User-Agent
Content-Length: 1028
Server: Jetty(8.1.17.v20150415)

{
  "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/host_components?HostRoles/component_name=APP_TIMELINE_SERVER",
  "items" : [
    {
      "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm09.ncsa.illinois.edu/host_components/APP_TIMELINE_SERVER",
      "HostRoles" : {
        "cluster_name" : "ROGERGPFS",
        "component_name" : "APP_TIMELINE_SERVER",
        "host_name" : "cg-hm09.ncsa.illinois.edu"
      },
      "host" : {
        "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm09.ncsa.illinois.edu"
      }
    },
    {
      "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm11.ncsa.illinois.edu/host_components/APP_TIMELINE_SERVER",
      "HostRoles" : {
        "cluster_name" : "ROGERGPFS",
        "component_name" : "APP_TIMELINE_SERVER",
        "host_name" : "cg-hm11.ncsa.illinois.edu"
      },
      "host" : {
        "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm11.ncsa.illinois.edu"
      }
    }
  ]
}
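For anyone who wants to check this from a script instead of reading the JSON by eye, here is a rough Python sketch that pulls the host names out of a response shaped like the one above. The function name is made up for illustration, and the sample is a trimmed copy of the output (href fields left out):

```python
import json
from typing import List


def registered_hosts(response_body: str) -> List[str]:
    """Return the host_name of every item in a host_components listing."""
    payload = json.loads(response_body)
    return [item["HostRoles"]["host_name"] for item in payload.get("items", [])]


# Trimmed copy of the response shown above.
sample = """
{
  "items": [
    {"HostRoles": {"cluster_name": "ROGERGPFS",
                   "component_name": "APP_TIMELINE_SERVER",
                   "host_name": "cg-hm09.ncsa.illinois.edu"}},
    {"HostRoles": {"cluster_name": "ROGERGPFS",
                   "component_name": "APP_TIMELINE_SERVER",
                   "host_name": "cg-hm11.ncsa.illinois.edu"}}
  ]
}
"""

hosts = registered_hosts(sample)
# More than one entry means Ambari has the component registered on several hosts.
duplicated = len(hosts) > 1
```

Two entries in the list confirm that Ambari has ATS registered on both cg-hm09 and cg-hm11.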

Contributor

Thanks for sharing the output. Yes, that's exactly what I meant (REST API call to get the ATS instances registered with Ambari).

To delete the bad ATS instance from Ambari, you can issue the following API call:

curl -u admin:admin -k -H "X-Requested-By: ambari" -X DELETE https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/<hostname-with-bad-ATS>/host_components/APP_T...

Explorer

Yeah. When I screw something up, I don't seem to do it by half measures.

Explorer

#curl -u admin:<> -k -H "X-Requested-By: ambari" -X DELETE https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm09.ncsa.illinois.edu/host_components/APP...

{

"status" : 500,

"message" : "org.apache.ambari.server.controller.spi.SystemException: An internal system exception occurred: Host Component cannot be removed, clusterName=ROGERGPFS, serviceName=YARN, componentName=APP_TIMELINE_SERVER, hostname=cg-hm09.ncsa.illinois.edu, request={ clusterName=ROGERGPFS, serviceName=YARN, componentName=APP_TIMELINE_SERVER, hostname=cg-hm09.ncsa.illinois.edu, desiredState=null, state=null, desiredStackId=null, staleConfig=null, adminState=null}"

}

Contributor

🙂 Can you get the corresponding stack trace from the server log? It's at /var/log/ambari-server/ambari-server.log

Explorer

Sorry if you saw the rest of the activity on this response. First, I posted the resulting output, but I posted it as an answer. Since it wasn't one, I deleted it, noticing only too late that I could have converted it to a comment. The next time I tried to add the output, I grabbed the wrong copy buffer and wound up duplicating the original informational output from the GET command, so I deleted that too. Sigh.

Contributor

No worries. If you can provide the stack trace from the ambari-server.log file I should be able to help you further.

Explorer

Remember, you asked for it. 🙂

20 Jan 2017 15:27:44,017 ERROR [qtp-client-37739] AbstractResourceProvider:338 - Caught AmbariException when modifying a resource
org.apache.ambari.server.AmbariException: Host Component cannot be removed, clusterName=ROGERGPFS, serviceName=YARN, componentName=APP_TIMELINE_SERVER, hostname=cg-hm09.ncsa.illinois.edu, request={ clusterName=ROGERGPFS, serviceName=YARN, componentName=APP_TIMELINE_SERVER, hostname=cg-hm09.ncsa.illinois.edu, desiredState=null, state=null, desiredStackId=null, staleConfig=null, adminState=null}
    at org.apache.ambari.server.controller.AmbariManagementControllerImpl.deleteHostComponents(AmbariManagementControllerImpl.java:2731)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$3.invoke(HostComponentResourceProvider.java:321)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider$3.invoke(HostComponentResourceProvider.java:318)
    at org.apache.ambari.server.controller.internal.AbstractResourceProvider.modifyResources(AbstractResourceProvider.java:331)
    at org.apache.ambari.server.controller.internal.HostComponentResourceProvider.deleteResources(HostComponentResourceProvider.java:318)
    at org.apache.ambari.server.controller.internal.ClusterControllerImpl.deleteResources(ClusterControllerImpl.java:330)
    at org.apache.ambari.server.api.services.persistence.PersistenceManagerImpl.delete(PersistenceManagerImpl.java:111)
    at org.apache.ambari.server.api.handlers.DeleteHandler.persist(DeleteHandler.java:44)
    at org.apache.ambari.server.api.handlers.BaseManagementHandler.handleRequest(BaseManagementHandler.java:72)
    at org.apache.ambari.server.api.services.BaseRequest.process(BaseRequest.java:135)
    at org.apache.ambari.server.api.services.BaseService.handleRequest(BaseService.java:105)
    at org.apache.ambari.server.api.services.BaseService.handleRequest(BaseService.java:74)
    at org.apache.ambari.server.api.services.HostComponentService.deleteHostComponent(HostComponentService.java:203)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
    at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
    at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
    at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:302)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:137)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
    at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
    at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1542)
    at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1473)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
    at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
    at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:540)
    at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:715)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:770)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1496)

{continued}

Contributor

Ok, the ATS instance that you are trying to delete is in a state that makes it non-deletable.

Can you get me the output of:

curl -uadmin:PW -k https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm09.ncsa.illinois.edu/host_components/APP...

Contributor

@Wayne Hoyenga

Ok, you just need to issue Stop on that ATS instance on cg-hm09. Please go to Hosts -> cg-hm09 and choose Stop on App Timeline Server from the component list. Then try the delete API call again.
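The same stop-then-delete sequence can presumably be driven through the REST API instead of the UI. The Python sketch below only builds the two requests and does not send them. It assumes that setting HostRoles/state to INSTALLED is how the Ambari API expresses Stop, which is the usual convention but was not confirmed in this thread; the helper names and the RequestInfo context string are illustrative.

```python
import base64
import json
from urllib.request import Request

# Cluster URL taken from the earlier GET output in this thread.
BASE = "https://localhost:8443/api/v1/clusters/ROGERGPFS"


def _component_url(host: str, component: str) -> str:
    return f"{BASE}/hosts/{host}/host_components/{component}"


def _headers(user: str, password: str) -> dict:
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # X-Requested-By mirrors the header used in the curl calls above.
    return {"Authorization": f"Basic {token}", "X-Requested-By": "ambari"}


def stop_request(host: str, component: str, user: str, password: str) -> Request:
    """PUT that asks Ambari to stop the component (assumption: INSTALLED means stopped)."""
    body = {
        "RequestInfo": {"context": "Stop component via REST"},  # free-text label
        "Body": {"HostRoles": {"state": "INSTALLED"}},
    }
    return Request(
        _component_url(host, component),
        data=json.dumps(body).encode(),
        headers=_headers(user, password),
        method="PUT",
    )


def delete_request(host: str, component: str, user: str, password: str) -> Request:
    """DELETE that removes the (already stopped) component from the host."""
    return Request(
        _component_url(host, component),
        headers=_headers(user, password),
        method="DELETE",
    )


stop = stop_request("cg-hm09.ncsa.illinois.edu", "APP_TIMELINE_SERVER", "admin", "PW")
delete = delete_request("cg-hm09.ncsa.illinois.edu", "APP_TIMELINE_SERVER", "admin", "PW")
# Sending them (urllib.request.urlopen) is left out on purpose: the stop has to
# finish before the delete is issued, and a self-signed certificate would need
# an SSL context equivalent to curl's -k.
```

Stopping from the Hosts page as described above achieves the same thing and is the path that was actually verified here.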

Contributor

@Wayne Hoyenga

Did the unwanted alerts disappear too?

Explorer

Yea! That worked. Thank you.

Explorer

{ "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm09.ncsa.illinois.edu/host_components/APP_TIMELINE_SERVER", "HostRoles" : { "cluster_name" : "ROGERGPFS", "component_name" : "APP_TIMELINE_SERVER", "desired_stack_id" : "HDP-2.3", "desired_state" : "STARTED", "hdp_version" : "HDP-2.3.2.0-2602", "host_name" : "cg-hm09.ncsa.illinois.edu", "maintenance_state" : "ON", "service_name" : "YARN", "stack_id" : "HDP-2.3", "stale_configs" : false, "state" : "STARTED", "upgrade_state" : "NONE", "actual_configs" : { "accumulo-env" : { "default" : "version1460667914523" }, "accumulo-log4j" : { "default" : "version1460667914523" }, "accumulo-site" : { "default" : "version1460667914523" }, "ams-env" : { "default" : "version1" }, "ams-hbase-env" : { "default" : "version1" }, "ams-hbase-log4j" : { "default" : "version1" }, "ams-hbase-policy" : { "default" : "version1" }, "ams-hbase-security-site" : { "default" : "version1" }, "ams-hbase-site" : { "default" : "version1467740762357" }, "ams-log4j" : { "default" : "version1" }, "ams-site" : { "default" : "version1467740762357" }, "capacity-scheduler" : { "default" : "version1" }, "client" : { "default" : "version1460667914523" }, "cluster-env" : { "default" : "version1" }, "core-site" : { "default" : "version1484279982036" }, "falcon-env" : { "default" : "version1" }, "falcon-runtime.properties" : { "default" : "version1" }, "falcon-startup.properties" : { "default" : "version1" }, "gateway-log4j" : { "default" : "version1" }, "gateway-site" : { "default" : "version1" }, "hadoop-env" : { "default" : "version1" }, "hadoop-policy" : { "default" : "version1" }, "hbase-env" : { "default" : "version1" }, "hbase-log4j" : { "default" : "version1" }, "hbase-policy" : { "default" : "version1" }, "hbase-site" : { "default" : "version1" }, "hcat-env" : { "default" : "version1" }, "hdfs-log4j" : { "default" : "version1" }, "hdfs-site" : { "default" : "version1484262327604" }, "hive-env" : { "default" : "version1484280941838" }, 
"hive-exec-log4j" : { "default" : "version1" }, "hive-log4j" : { "default" : "version1" }, "hive-site" : { "default" : "version1484280941838" }, "hiveserver2-site" : { "default" : "version1" }, "knox-env" : { "default" : "version1" }, "ldap-log4j" : { "default" : "version1" }, "mapred-env" : { "default" : "version1" }, "mapred-site" : { "default" : "version1" }, "oozie-env" : { "default" : "version1442008913821" }, "oozie-log4j" : { "default" : "version1" }, "oozie-site" : { "default" : "version1484271458416" }, "pig-env" : { "default" : "version1" }, "pig-log4j" : { "default" : "version1" }, "pig-properties" : { "default" : "version1" }, "ranger-hbase-audit" : { "default" : "version1" }, "ranger-hbase-plugin-properties" : { "default" : "version1" }, "ranger-hbase-policymgr-ssl" : { "default" : "version1" }, "ranger-hbase-security" : { "default" : "version1" }, "ranger-hdfs-audit" : { "default" : "version1" }, "ranger-hdfs-plugin-properties" : { "default" : "version1" }, "ranger-hdfs-policymgr-ssl" : { "default" : "version1" }, "ranger-hdfs-security" : { "default" : "version1" }, "ranger-hive-audit" : { "default" : "version1" }, "ranger-hive-plugin-properties" : { "default" : "version1" }, "ranger-hive-policymgr-ssl" : { "default" : "version1" }, "ranger-hive-security" : { "default" : "version1" }, "ranger-knox-audit" : { "default" : "version1" }, "ranger-knox-plugin-properties" : { "default" : "version1" }, "ranger-knox-policymgr-ssl" : { "default" : "version1" }, "ranger-knox-security" : { "default" : "version1" }, "ranger-yarn-audit" : { "default" : "version1" }, "ranger-yarn-plugin-properties" : { "default" : "version1" }, "ranger-yarn-policymgr-ssl" : { "default" : "version1" }, "ranger-yarn-security" : { "default" : "version1" }, "spark-defaults" : { "default" : "version1" }, "spark-env" : { "default" : "version1" }, "spark-javaopts-properties" : { "default" : "version1" }, "spark-log4j-properties" : { "default" : "version1" }, "spark-metrics-properties" : { 
"default" : "version1" }, "sqoop-env" : { "default" : "version1" }, "ssl-client" : { "default" : "version1" }, "ssl-server" : { "default" : "version1" }, "tez-env" : { "default" : "version1" }, "tez-site" : { "default" : "version1" }, "topology" : { "default" : "version1" }, "users-ldif" : { "default" : "version1" }, "webhcat-env" : { "default" : "version1" }, "webhcat-log4j" : { "default" : "version1" }, "webhcat-site" : { "default" : "version1484280941838" }, "yarn-env" : { "default" : "version1484269362684" }, "yarn-log4j" : { "default" : "version1" }, "yarn-site" : { "default" : "version1484278518158" }, "zoo.cfg" : { "default" : "version1" }, "zookeeper-env" : { "default" : "version1" }, "zookeeper-log4j" : { "default" : "version1" } } }, "host" : { "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/hosts/cg-hm09.ncsa.illinois.edu" }, "component" : [ { "href" : "https://localhost:8443/api/v1/clusters/ROGERGPFS/services/YARN/components/APP_TIMELINE_SERVER", "ServiceComponentInfo" : { "cluster_name" : "ROGERGPFS", "component_name" : "APP_TIMELINE_SERVER", "service_name" : "YARN" } } ], "processes" : [ ] }
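In that long output, the fields that matter are HostRoles/state and HostRoles/desired_state, both STARTED, which fits the advice to stop the component before deleting it. A small Python sketch of that check follows. Treating STARTED as the only blocking state is an assumption drawn from this thread, not a statement of Ambari's full rule, and the function name is hypothetical.

```python
import json


def needs_stop_before_delete(host_component_json: str) -> bool:
    """True when the component reports STARTED, the state that blocked the DELETE here."""
    roles = json.loads(host_component_json)["HostRoles"]
    return roles.get("state") == "STARTED"


# Trimmed copy of the output above, keeping only the state-related fields.
sample = json.dumps({
    "HostRoles": {
        "component_name": "APP_TIMELINE_SERVER",
        "host_name": "cg-hm09.ncsa.illinois.edu",
        "desired_state": "STARTED",
        "maintenance_state": "ON",
        "state": "STARTED",
    }
})

must_stop_first = needs_stop_before_delete(sample)
```

Note that maintenance mode alone does not stop the component: the output shows maintenance_state ON while state is still STARTED.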

Explorer

@yusaku, yes, they did.

At some point, when it happens again, I may start another thread about why my Ambari Metrics server keeps dying. It's fairly annoying.

Contributor

Awesome. If you could "accept" my answer, that would be great. The AMS crashing issue might be due to the version of Ambari you are using (2.1.0 is quite old, and there have been numerous AMS stability improvements since then). If possible, I highly recommend upgrading to Ambari 2.4.2.
