Member since
03-19-2016
69
Posts
10
Kudos Received
3
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2923 | 01-04-2017 07:30 PM
 | 5239 | 12-20-2016 02:30 AM
 | 1313 | 12-17-2016 06:13 PM
05-23-2016
10:28 PM
It seems the user had been removed from the proxyuser property. I am not sure how it was working before the upgrade. After adding the user back to that list, it is working fine.
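For reference, impersonation by the hive service user is governed by the hadoop.proxyuser.* entries in core-site.xml. A minimal sketch of the properties involved (the wildcard values here are illustrative, not taken from this cluster; restrict hosts and groups as your security policy requires, and restart the NameNode after changing them):

```xml
<!-- core-site.xml: allow the hive user to impersonate other users. -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <!-- hosts HiveServer2 may impersonate from; example value -->
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <!-- groups whose members may be impersonated; example value -->
  <value>*</value>
</property>
```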
05-23-2016
10:26 PM
Thanks, it worked. This was on our dev cluster, and it got into trouble while upgrading to HDP 2.4 due to a manual error.
05-20-2016
11:22 PM
1 Kudo
Both NameNodes crashed (Active & Standby). I restarted the Active one and it is serving, but we are unable to restart the standby NN. I tried to restart it manually, but it still fails. How do I recover and restart the standby NameNode? Version: HDP 2.2
2016-05-20 18:53:57,954 INFO namenode.EditLogInputStream (RedundantEditLogInputStream.java:nextOp(176)) - Fast-forwarding stream 'http://usw2stdpma01.glassdoor.local:8480/getJournal?jid=dfs-nameservices&segmentTxId=14726901&storageInfo=-60%3A761966699%3A0%3ACID-d16e0895-7c12-404e-9223-952d1b19ace0' to transaction ID 13013207
2016-05-20 18:53:58,216 WARN namenode.FSNamesystem (FSNamesystem.java:loadFromDisk(750)) - Encountered exception loading fsimage
java.io.IOException: There appears to be a gap in the edit log. We expected txid 13013207, but got txid 14726901.
at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:212)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:140)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:829)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:684)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1032)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:748)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:538)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:597)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:764)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:748)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1441)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1507)
2016-05-20 18:53:58,322 FATAL namenode.NameNode (NameNode.java:main(1512)) - Failed to start namenode.
java.io.IOException: There appears to be a gap in the edit log. We expected txid 13013207, but got txid 14726901.
at org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:212)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:140)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:829)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:684)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:281)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1032)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:748)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:538)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:597)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:764)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:748)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1441)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1507)
2016-05-20 18:53:58,324 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2016-05-20 18:53:58,325 INFO namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG
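One commonly suggested recovery path for this "gap in the edit log" symptom (a sketch, assuming an HA setup with QJM and a healthy active NameNode; the metadata path and daemon script locations below are assumptions, so substitute your actual dfs.namenode.name.dir and your cluster's management tooling):

```shell
# Run on the standby NameNode host.

# 1. Stop the standby NameNode (via Ambari, or manually):
su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh stop namenode"

# 2. Move aside the standby's stale metadata directory (path is an example):
mv /hadoop/hdfs/namenode /hadoop/hdfs/namenode.bak.$(date +%F)

# 3. Re-copy a consistent fsimage from the active NameNode:
su - hdfs -c "hdfs namenode -bootstrapStandby"

# 4. Start the standby NameNode again:
su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode"
```

The bootstrap discards the standby's inconsistent local state and re-seeds it from the active, after which the standby tails the shared edit log from the JournalNodes as usual.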
Labels:
- Apache Hadoop
05-20-2016
03:10 PM
Hive impersonation was already in place; after the HDP upgrade it stopped working. The problem exists only in HiveServer2. If I run the query using the Hive CLI, it impersonates the user and runs as expected.
05-20-2016
06:14 AM
Hive version: hive-1.2.1000.2.4.0.0
I upgraded our test cluster to hive-1.2.1000.2.4.0.0. After the upgrade, I am unable to impersonate any user when running a Hive query through HS2.
The following property is set:
hive.server2.enable.doAs=true
[hive@usw2dydpmn01 hive]$ beeline
WARNING: Use "yarn jar" to launch YARN applications.
Beeline version 1.2.1000.2.4.0.0-169 by Apache Hive
beeline> !connect jdbc:hive2://usw2dydpmn01:10010
Connecting to jdbc:hive2://usw2dydpmn01:10010
Enter username for jdbc:hive2://usw2dydpmn01:10010: hive
Enter password for jdbc:hive2://usw2dydpmn01:10010:
Error: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive is not allowed to impersonate hive (state=,code=0)
Edit 1: updating with more info. Hive impersonation was already working; after the HDP upgrade it is not. The problem exists only in HiveServer2. If I run the query using the Hive CLI, it impersonates the user and runs as expected. If I set hive.server2.enable.doAs=false, then all queries run.
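A quick way to confirm what the cluster actually resolves for the proxyuser settings (assuming the hive service user is the impersonating principal; the property names follow the hadoop.proxyuser.<user>.* convention):

```shell
# Show the effective proxyuser settings as the Hadoop configuration resolves them.
hdfs getconf -confKey hadoop.proxyuser.hive.hosts
hdfs getconf -confKey hadoop.proxyuser.hive.groups
```

If either key is unset or does not cover the HiveServer2 host and the target user's groups, HS2 requests fail with exactly the AuthorizationException shown above, while the Hive CLI keeps working because it runs directly as the end user and never impersonates.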
05-11-2016
06:00 PM
I am upgrading HDP 2.2 to HDP 2.4 and am trying to find out how to roll back in case we hit a problem and are unable to proceed. In the rolling-upgrade wizard there is an option to downgrade. Do we have to take care of anything manually?
Labels:
- Hortonworks Data Platform (HDP)
05-07-2016
07:51 PM
1 Kudo
Thanks. It is working.
05-07-2016
06:46 AM
Ambari Metrics Collector was working fine for six months and suddenly stopped working. This is the error we are getting. Relevant settings:
hbase.rootdir = /mnt/data/ambari-metrics-collector/hbase
hbase.cluster.distributed = false
Metrics service operation mode = embedded
hbase.zookeeper.property.clientPort = 61181
06:19:30,678 WARN [main] RecoverableZooKeeper:253 - Possibly transient ZooKeeper, quorum=localhost:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
06:19:30,678 ERROR [main] RecoverableZooKeeper:255 - ZooKeeper exists failed after 4 attempts
06:19:30,679 WARN [main] ZKUtil:484 - hconnection-0xd78795, quorum=localhost:61181, baseZNode=/hbase Unable to set watcher on znode (/hbase)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:199)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(ConnectionManager.java:874)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.access$600(ConnectionManager.java:585)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1553)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1599)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1653)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1860)
at org.apache.hadoop.hbase.client.HBaseAdmin$MasterCallable.prepare(HBaseAdmin.java:3363)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:125)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3390)
at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:408)
at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:429)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:762)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1107)
at org.apache.phoenix.query.DelegateConnectionQueryServices.createTable(DelegateConnectionQueryServices.java:110)
at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1527)
at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:535)
at org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:184)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:260)
at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:252)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:250)
at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1026)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$9.call(ConnectionQueryServicesImpl.java:1532)
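A recovery path often suggested when the embedded-mode collector can no longer reach its own internal ZooKeeper (a sketch; the hbase-tmp path is an assumption based on default AMS settings, so verify both paths in ams-hbase-site before deleting anything, and note this wipes collected metrics history):

```shell
# Stop the collector.
ambari-metrics-collector stop

# Move aside the embedded HBase data (path taken from hbase.rootdir above)
# and its local ZooKeeper state (assumed default hbase.tmp.dir location).
mv /mnt/data/ambari-metrics-collector/hbase /mnt/data/ambari-metrics-collector/hbase.bak
mv /var/lib/ambari-metrics-collector/hbase-tmp /var/lib/ambari-metrics-collector/hbase-tmp.bak

# Restart; the collector recreates its schema on startup.
ambari-metrics-collector start
```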
Labels:
- Apache Ambari
05-05-2016
12:33 AM
Today we saw the RM crash with the following error message. There are a bunch of JIRA tickets related to that error. One of my jobs was killed, but the application is running in orphaned mode. The app_id is displayed in the RM UI, and I am unable to kill that app_id using yarn application -kill <app_id>. I restarted the RM and ZK but was unable to remove it from the RM UI. It is not consuming any resources. How do I remove it from the display?
t: maxCompletedAppsInMemory = 10000, removing app application_1452798563961_0971 from memory:
2016-05-04 19:00:30,449 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1193)) - Null container completed...
2016-05-04 19:00:30,568 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1193)) - Null container completed...
2016-05-04 19:00:31,251 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1193)) - Null container completed...
2016-05-04 19:00:32,252 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1193)) - Null container completed...
2016-05-04 19:00:45,325 FATAL resourcemanager.ResourceManager (ResourceManager.java:handle(753)) - Received a org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent of type STATE_STORE_OP_FAILED. Cause:
java.io.IOException: Wait for ZKClient creation timed out
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1073)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1097)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doMultiWithRetries(ZKRMStateStore.java:934)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.storeRMDelegationTokenAndSequenceNumberState(ZKRMStateStore.java:734)
at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeRMDelegationTokenAndSequenceNumber(RMStateStore.java:650)
at org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewToken(RMDelegationTokenSecretManager.java:112)
at org.apache.hadoop.yarn.server.resourcemanager.security.RMDelegationTokenSecretManager.storeNewToken(RMDelegationTokenSecretManager.java:49)
at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.storeToken(AbstractDelegationTokenSecretManager.java:272)
at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:391)
at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:47)
at org.apache.hadoop.security.token.Token.<init>(Token.java:59)
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:907)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:291)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
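For the lingering application entry, a commonly suggested cleanup (a sketch, assuming the RM uses the ZooKeeper state store at its default parent znode; stop both ResourceManagers before touching the store, and verify the znode path against yarn.resourcemanager.zk-state-store.parent-path rather than trusting the default used here):

```shell
# First try the supported route (note the syntax: "yarn application", not "yarn -application"):
yarn application -kill application_1452798563961_0971

# If the app is truly orphaned in the RM state store, remove its znode
# with both RMs stopped, then restart the RMs. <zk-host> is a placeholder.
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server <zk-host>:2181
# inside zkCli:
#   rmr /rmstore/ZKRMStateRoot/RMAppRoot/application_1452798563961_0971
```

On restart the RM rebuilds its view from the state store, so the deleted entry should no longer appear in the UI.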
Labels:
- Apache Hadoop
- Apache YARN
- Security
05-03-2016
10:09 PM
Is it possible to upgrade all components such as Hive, MR, Tez, and Spark, and just leave out Kafka? The reason is that Kafka is running version 0.8.1, and upgrading to 0.9 would impact the consumer jobs. From 0.8.2 onwards they moved away from the ZooKeeper dependency to the brokers, and we are not sure how long it will take to rewrite the Kafka consumer jobs.