Member since
03-01-2016
17
Posts
10
Kudos Received
0
Solutions
03-12-2017
07:25 PM
Apparently we're hitting the Application History server fairly often, and lately this has caused it to crash:
2017-03-12 14:04:38,198 ERROR mortbay.log (Slf4jLog.java:warn(87)) - Error for /applicationhistory
java.lang.OutOfMemoryError: GC overhead limit exceeded
2017-03-12 14:04:38,198 FATAL yarn.YarnUncaughtExceptionHandler (YarnUncaughtExceptionHandler.java:uncaughtException(51)) - Thread Thread[timeline,5,main] threw an Error. Shutting down now...
java.lang.OutOfMemoryError: GC overhead limit exceeded
2017-03-12 14:04:38,198 INFO applicationhistoryservice.FileSystemApplicationHistoryStore (FileSystemApplicationHistoryStore.java:getApplication(189)) - Completed reading history information of application application_1478290235897_0046
2017-03-12 14:04:38,201 INFO util.ExitUtil (ExitUtil.java:halt(147)) - Halt with status -1 Message: HaltException
12-21-2016
11:26 PM
1 Kudo
I found and fixed the issue. Ambari Metrics was originally installed with Ambari 2.2.1.0. I later installed 2.2.2.0 and upgraded Ambari everywhere, but it looks as though the Metrics Collector was never upgraded. I just upgraded it and restarted the service, and I'm seeing the data now.
12-21-2016
10:20 PM
{"timestamp":0,"starttime":0,"metrics":{}}
It's a large log. How many lines would you like? Do you want to see INFO, WARN, or something else?
Basically the INFO output looks like this:
2016-12-21 17:18:52,910 INFO TimelineMetricHostAggregatorMinute: Last check point time: 1482358431510, lagBy: 301 seconds.
2016-12-21 17:18:52,910 INFO TimelineMetricHostAggregatorMinute: Start aggregation cycle @ Wed Dec 21 17:18:52 EST 2016, startTime = Wed Dec 21 17:13:51 EST 2016, endTime = Wed Dec 21 17:18:51 EST 2016
2016-12-21 17:18:54,623 INFO TimelineClusterAggregatorMinute: Last check point time: 1482358431513, lagBy: 303 seconds.
2016-12-21 17:18:54,623 INFO TimelineClusterAggregatorMinute: Start aggregation cycle @ Wed Dec 21 17:18:54 EST 2016, startTime = Wed Dec 21 17:13:51 EST 2016, endTime = Wed Dec 21 17:18:51 EST 2016
2016-12-21 17:18:55,828 INFO TimelineClusterAggregatorMinute: 1577 row(s) updated.
2016-12-21 17:18:55,829 INFO TimelineClusterAggregatorMinute: Aggregated cluster metrics for METRIC_AGGREGATE_MINUTE, with startTime = Wed Dec 21 17:13:51 EST 2016, endTime = Wed Dec 21 17:18:51 EST 2016
2016-12-21 17:18:55,829 INFO TimelineClusterAggregatorMinute: End aggregation cycle @ Wed Dec 21 17:18:55 EST 2016
2016-12-21 17:18:55,829 INFO TimelineClusterAggregatorMinute: End aggregation cycle @ Wed Dec 21 17:18:55 EST 2016
2016-12-21 17:18:56,262 INFO TimelineClusterAggregatorSecond: Last check point time: 1482358514752, lagBy: 221 seconds.
2016-12-21 17:18:56,262 INFO TimelineClusterAggregatorSecond: Start aggregation cycle @ Wed Dec 21 17:18:56 EST 2016, startTime = Wed Dec 21 17:15:14 EST 2016, endTime = Wed Dec 21 17:17:14 EST 2016
2016-12-21 17:18:56,361 INFO TimelineMetricHostAggregatorMinute: 5317 row(s) updated.
2016-12-21 17:18:56,361 INFO TimelineMetricHostAggregatorMinute: Aggregated host metrics for METRIC_RECORD_MINUTE, with startTime = Wed Dec 21 17:13:51 EST 2016, endTime = Wed Dec 21 17:18:51 EST 2016
2016-12-21 17:18:56,361 INFO TimelineMetricHostAggregatorMinute: End aggregation cycle @ Wed Dec 21 17:18:56 EST 2016
2016-12-21 17:18:56,361 INFO TimelineMetricHostAggregatorMinute: End aggregation cycle @ Wed Dec 21 17:18:56 EST 2016
2016-12-21 17:18:56,910 INFO TimelineClusterAggregatorSecond: Saving 9485 metric aggregates.
2016-12-21 17:18:58,111 INFO TimelineClusterAggregatorSecond: End aggregation cycle @ Wed Dec 21 17:18:58 EST 2016
2016-12-21 17:18:58,112 INFO TimelineClusterAggregatorSecond: End aggregation cycle @ Wed Dec 21 17:18:58 EST 2016
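Since the log is large, one way to share something smaller is to pull out only the lagBy values per aggregator. This is a rough sketch, not an Ambari tool: the function name and the regex are my own assumptions, based only on the line format shown above.

```python
import re

# Matches aggregator check-point lines of the shape shown in the log excerpt.
# The pattern is an assumption derived from that excerpt, not from Ambari source.
LAG_RE = re.compile(
    r"INFO (\w+): Last check point time: (\d+), lagBy: (\d+) seconds"
)


def max_lag_by_aggregator(log_text):
    """Return {aggregator_name: largest lagBy value in seconds}."""
    lags = {}
    for name, _checkpoint, lag in LAG_RE.findall(log_text):
        lags[name] = max(lags.get(name, 0), int(lag))
    return lags


# Three check-point lines copied from the excerpt above.
sample = (
    "2016-12-21 17:18:52,910 INFO TimelineMetricHostAggregatorMinute: "
    "Last check point time: 1482358431510, lagBy: 301 seconds. "
    "2016-12-21 17:18:54,623 INFO TimelineClusterAggregatorMinute: "
    "Last check point time: 1482358431513, lagBy: 303 seconds. "
    "2016-12-21 17:18:56,262 INFO TimelineClusterAggregatorSecond: "
    "Last check point time: 1482358514752, lagBy: 221 seconds."
)

for name, lag in sorted(max_lag_by_aggregator(sample).items()):
    print(name, lag)
```

The regex scans the whole text rather than splitting on newlines, so it works whether the log is pasted as one run or one entry per line.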
12-21-2016
10:11 PM
@Aravindan Vijayan All of the metrics components are up and running. The dashboards I see the failure on are all for HDFS (HDFS - DataNodes, HDFS - Home, HDFS - NameNodes) and YARN (YARN - NodeManagers, YARN - Queues, YARN - ResourceManager).
12-21-2016
05:49 PM
I just installed Grafana, and while attempting to look at some of the canned dashboards I see the following error:
Dashboard init failed
Template variables could not be initialized: comp.forEach is not a function. (In 'comp.forEach', 'comp.forEach' is undefined)
I'm also unable to see data per host.
12-13-2016
10:57 PM
Yes, I'm well aware of this document. I'm suggesting that this be changed if it can't be done now. This is a fairly major security flaw. Users should NEVER have to be given ADMIN privileges to change their password. Is there a way to bypass this, or change the config, to give users the ability to change their own password?
12-13-2016
10:28 PM
2 Kudos
When users are added to Ambari they are not "admins". I have to give them temporary passwords so that they can log in. How do I allow them to change that temporary password without being an admin?
08-25-2016
09:08 PM
Is there any update to this? I'm seeing this issue on a new install.
03-03-2016
06:18 PM
1 Kudo
Tried that. The migration failed and Ambari wouldn't start. We were able to get past this by removing specific Ranger keys from Ambari's PostgreSQL database, after which we were able to update the schema and continue the upgrade.
03-01-2016
09:02 PM
1 Kudo
Not yet. We're working on it.
03-01-2016
08:48 PM
1 Kudo
I'm seeing this in the log:
[root@bodcdevvhdp104 tmp]# !cat
cat /var/log/ambari-server/ambari-server.out
[EL Warning]: metadata: 2016-03-01 15:47:35.985--ServerSession(1327476372)--The reference column name [resource_type_id] mapped on the element [field permissions] does not correspond to a valid id or basic field/column on the mapping reference. Will use referenced column name as provided.
[EL Info]: 2016-03-01 15:47:38.254--ServerSession(1327476372)--EclipseLink, version: Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd
[EL Info]: connection: 2016-03-01 15:47:38.597--ServerSession(1327476372)--file:/usr/lib/ambari-server/ambari-server-2.1.0.1470.jar_ambari-server_url=jdbc:postgresql://localhost/ambari_user=ambari login successful
[EL Warning]: 2016-03-01 15:47:41.688--ServerSession(1327476372)--Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: org.postgresql.util.PSQLException: ERROR: column "upgrade_package" does not exist
Position: 53
Error Code: 0
Call: SELECT repo_version_id, display_name, repositories, upgrade_package, version, stack_id FROM repo_version WHERE (repo_version_id = ?)
bind => [1 parameter bound]
Query: ReadObjectQuery(name="repositoryVersion" referenceClass=RepositoryVersionEntity )
[root@bodcdevvhdp104 tmp]#
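To check whether upgrade_package is the only column the schema is missing, the output can be scanned for every PostgreSQL "does not exist" column error. This is a minimal sketch under my own assumptions: the function name and regex are hypothetical and based only on the message format shown above.

```python
import re

# PostgreSQL's undefined-column message, as it appears in the output above.
# Pattern assumed from that single example.
MISSING_COLUMN_RE = re.compile(r'ERROR: column "([^"]+)" does not exist')


def missing_columns(log_text):
    """Return distinct column names reported as missing, in first-seen order."""
    seen = []
    for column in MISSING_COLUMN_RE.findall(log_text):
        if column not in seen:
            seen.append(column)
    return seen


# Two lines copied from the ambari-server.out excerpt above.
sample = (
    "Internal Exception: org.postgresql.util.PSQLException: "
    'ERROR: column "upgrade_package" does not exist\n'
    "Position: 53\n"
)

print(missing_columns(sample))
```

To run it against the real file, read /var/log/ambari-server/ambari-server.out into a string and pass that to missing_columns. More than one name in the result would suggest the schema upgrade stopped partway rather than skipping a single column.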
03-01-2016
06:20 PM
Why? Does 2.3.4 have a better upgrade path? From the upgrade instructions, I didn't see 2.3.4 as an option at the time.
03-01-2016
06:13 PM
At this point, we basically had to hack our Ambari database and format our HDFS to get the cluster back up. This has got to be the WORST upgrade experience ever. 14 hours in, and we are just getting the second NameNode online.
03-01-2016
12:22 AM
1 Kudo
We don't have support yet. 😕
03-01-2016
12:21 AM
1 Kudo
This is a QA system (semi-production: our builds are breaking). We are doing this as a stepping stone to production.
03-01-2016
12:02 AM
2 Kudos
I'm attempting to upgrade my QA cluster. I followed the upgrade procedure listed in your doc. While upgrading, I was getting core dumps in ambari-agent.log for failures to restart services. After multiple retries, I aborted the upgrade. Since then, I've not been able to upgrade to either 2.2.9.0 or 2.3.0.0. We appear to have a mix of hosts at both levels. At the moment, Ambari is stuck at Upgrade Aborted and I'm unable to either downgrade or upgrade to either new version.