Member since 04-12-2016 | 5 Posts | 5 Kudos Received | 1 Solution
09-16-2016
09:51 AM
1 Kudo
I found the issue and how to solve it. Here is my explanation: ambari-server starts using the "ambari" user and the password found in the configuration file /etc/ambari-server/conf/password.dat. By default, the password is "bigdata". Because I had changed the PostgreSQL "ambari" password via psql, the password in the configuration file was no longer correct. So I replaced the content of the configuration file with my new password, and the server restarted fine.
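For anyone hitting the same error, the fix described above could be scripted roughly like this. It is only a sketch: the helper name `write_db_password` is made up for illustration, and it assumes the file holds nothing but the password with no trailing newline, so check your existing file before overwriting it.

```python
import os
import shutil

def write_db_password(conf_path, new_password):
    """Hypothetical helper: back up the password file, then replace its content."""
    if os.path.exists(conf_path):
        # Keep a copy of the previous file next to it, just in case.
        shutil.copy2(conf_path, conf_path + ".bak")
    with open(conf_path, "w") as f:
        # Assumption: the file contains only the password, without a newline.
        f.write(new_password)

# Example call (needs write access to the file; restart ambari-server afterwards):
# write_db_password("/etc/ambari-server/conf/password.dat", "my-new-password")
```

The password written here has to be the same one that was set for the "ambari" role in PostgreSQL.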
09-16-2016
08:56 AM
I'm not using the sandbox. I'm working on a cluster with HDP 2.4. This procedure updates the password for the "admin" user of Ambari. But the error seems to concern the "ambari" user.
09-16-2016
08:20 AM
1 Kudo
Hello,
I work on a cluster with HDP 2.4 and I'm not able to start Ambari. I use this command:
[ambari@ current]$ service ambari-server restart
And I get this error:
ERROR: Exiting with exit code -1.
REASON: Ambari Server java process died with exitcode 1. Check /var/log/ambari-server/ambari-server.out for more information
In the ambari-server.out and ambari-server.log files I get the same error:
15 Sep 2016 10:42:28,893 ERROR [main] DBAccessorImpl:102 - Error while creating database accessor
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "ambari"
So I guess the ambari-server start command tries to connect to the PostgreSQL ambari database but can't access it because it doesn't use the correct password. How can I find out which password this command uses? What can I do to solve this issue and restart Ambari?
Thanks!
Edit: The error appeared after I changed the password of the user "ambari" in the PostgreSQL database.
Labels: Apache Ambari
06-10-2016
07:25 AM
I have 3 Journal Nodes in my cluster, but they don't seem to fail.
06-08-2016
02:50 PM
3 Kudos
Hello,
I have an HA cluster (HDP-2.4.0.0-169) in a dev environment, with 4 nodes virtualized with vSphere:
1 Master (Active Namenode, Active Resource Manager)
1 Master/Slave (Standby Namenode, Standby Resource Manager, Data Node, Node Manager, Journal Node)
2 Slaves (Data Node, Node Manager, Journal Node)
The Name Nodes of my cluster shut down regularly, even when there is no work on it.
Name Node log:
2016-06-02 02:23:18,682 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.1.49:8485, 192.168.1.47:8485, 192.168.1.46:8485], stream=QuorumOutputStream starting at txid 2549780))
java.io.IOException: Timed out waiting 60000ms for a quorum of nodes to respond.
It seems to come from the Journal Nodes, but I can't find a way to fix the issue. Any idea where to look?
Thanks!
Here are the detailed Name Node logs from when it shuts down:
2016-06-02 02:23:12,610 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2016-06-02 02:23:12,719 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 54038 ms (timeout=60000 ms) for a response for sendEdits. No responses yet.
2016-06-02 02:23:13,720 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 55039 ms (timeout=60000 ms) for a response for sendEdits. No responses yet.
2016-06-02 02:23:14,721 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 56040 ms (timeout=60000 ms) for a response for sendEdits. No responses yet.
2016-06-02 02:23:15,611 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) - chooseUnderReplicatedBlocks selected Total=0 Reset bookmarks? true
2016-06-02 02:23:15,611 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2016-06-02 02:23:15,722 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 57041 ms (timeout=60000 ms) for a response for sendEdits. No responses yet.
2016-06-02 02:23:16,723 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 58043 ms (timeout=60000 ms) for a response for sendEdits. No responses yet.
2016-06-02 02:23:17,725 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(134)) - Waited 59044 ms (timeout=60000 ms) for a response for sendEdits. No responses yet.
2016-06-02 02:23:18,611 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) - chooseUnderReplicatedBlocks selected Total=0 Reset bookmarks? true
2016-06-02 02:23:18,611 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2016-06-02 02:23:18,682 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.1.49:8485, 192.168.1.47:8485, 192.168.1.46:8485], stream=QuorumOutputStream starting at txid 2549780))
java.io.IOException: Timed out waiting 60000ms for a quorum of nodes to respond.
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:647)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2470)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2335)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:688)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:397)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
2016-06-02 02:23:18,682 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(72)) - Aborting QuorumOutputStream starting at txid 2549780
2016-06-02 02:23:21,611 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) - chooseUnderReplicatedBlocks selected Total=0 Reset bookmarks? true
2016-06-02 02:23:21,611 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2016-06-02 02:23:24,612 INFO BlockStateChange (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) - chooseUnderReplicatedBlocks selected Total=0 Reset bookmarks? true
2016-06-02 02:23:24,612 INFO BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK* neededReplications = 0, pendingReplications = 0.
2016-06-02 02:23:24,625 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(406)) - Took 65944ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.47:8485
2016-06-02 02:23:24,773 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(406)) - Took 66092ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.49:8485
2016-06-02 02:23:24,898 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(406)) - Took 66216ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.46:8485
2016-06-02 02:23:25,201 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-06-02 02:23:25,258 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at d1hdpmaster01/192.168.1.45
************************************************************/
2016-06-02 02:23:25,635 WARN timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:putMetrics(212)) - Unable to send metrics to collector by address:http://d1hdpmaster01:6188/ws/v1/timeline/metric
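To see at a glance how long each Journal Node took, the "Took ...ms to send a batch" warnings in a log like the one above can be pulled out with a short script. This is a rough sketch and not a Hadoop tool: the function name `slow_journal_writes` and the regular expression are my own, written against the message format shown in this log only.

```python
import re

# Matches warnings such as:
# "Took 65944ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.47:8485"
SLOW_WRITE = re.compile(
    r"Took (\d+)ms to send a batch of (\d+) edits \((\d+) bytes\) "
    r"to remote journal (\S+)"
)

def slow_journal_writes(lines):
    """Return a list of (journal_address, elapsed_ms) for each matching warning."""
    results = []
    for line in lines:
        match = SLOW_WRITE.search(line)
        if match:
            results.append((match.group(4), int(match.group(1))))
    return results

# The three warnings copied from the log above.
sample = [
    "2016-06-02 02:23:24,625 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(406)) - Took 65944ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.47:8485",
    "2016-06-02 02:23:24,773 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(406)) - Took 66092ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.49:8485",
    "2016-06-02 02:23:24,898 WARN client.QuorumJournalManager (IPCLoggerChannel.java:call(406)) - Took 66216ms to send a batch of 1 edits (205 bytes) to remote journal 192.168.1.46:8485",
]

for address, elapsed_ms in slow_journal_writes(sample):
    print(address, elapsed_ms)
```

On the log above this lists all three Journal Nodes, each at roughly 66 seconds for a single 205-byte edit.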
Labels: Apache Hadoop