Member since: 02-24-2015
Posts: 27
Kudos Received: 5
Solutions: 2
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 9718 | 05-13-2015 08:21 PM
 | 23812 | 03-25-2015 09:12 PM
10-08-2015
10:57 AM
We were able to get it up. Thank you very much.
10-07-2015
12:35 PM
We have a 3-node ZooKeeper quorum, and one of the nodes was accidentally terminated on AWS. Our cluster is on CDH 5.3.3 and runs these services: HDFS, YARN, HBase, Oozie, and ZooKeeper. We would like to add the node back to the quorum. Besides being part of the ZooKeeper quorum, the node also had these roles: HBase Master (failed over successfully), YARN ResourceManager (HA, failed over successfully), JournalNode (HA), and Oozie. Does anyone know how to do this? If so, please provide the steps. Thanks,
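For anyone in a similar situation, here is a minimal sketch (not the recovery procedure itself) of how the state of the ensemble could be checked before and after the node is re-added. It assumes the ZooKeeper four-letter commands are reachable on the default client port 2181 and that the `stat` reply contains a `Mode:` line. The function names and the host names in the usage comment are hypothetical.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command (e.g. 'stat') and return the reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Return the value of the 'Mode:' line (leader/follower/standalone), or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def quorum_summary(modes):
    """Summarize a dict of host -> mode (None means unreachable or not serving)."""
    serving = [h for h, m in modes.items() if m in ("leader", "follower")]
    leaders = [h for h, m in modes.items() if m == "leader"]
    majority = len(modes) // 2 + 1
    return {
        "serving": sorted(serving),
        "not_serving": sorted(h for h in modes if h not in serving),
        "has_quorum": len(leaders) == 1 and len(serving) >= majority,
    }


def check_ensemble(hosts, port=2181):
    """Query every host with 'stat' and summarize the result."""
    modes = {}
    for host in hosts:
        try:
            modes[host] = parse_mode(four_letter_word(host, port, "stat"))
        except OSError:  # refused, timed out, unresolvable, etc.
            modes[host] = None
    return quorum_summary(modes)


# Hypothetical usage (host names are placeholders):
# print(check_ensemble(["zk1.example", "zk2.example", "zk3.example"]))
```

With one of three nodes gone, a check like this would be expected to still report a quorum from the two surviving servers and list the terminated host under `not_serving` until it is back.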
Labels:
- Apache HBase
- Apache Zookeeper
- HDFS
07-06-2015
11:00 AM
Thank you very much for your response. I checked both ResourceManager nodes, and the yarn.resourcemanager.webapp.address property was set on each. I still could not get it working.
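A check like this can be scripted. In an HA setup the web address is usually configured per ResourceManager as `yarn.resourcemanager.webapp.address.<rm-id>`, with the ids listed in `yarn.resourcemanager.ha.rm-ids`. The sketch below, under that assumption, reads the text of a `yarn-site.xml` and reports which ids have no per-id web address. The function names are hypothetical, and a setup that relies on `yarn.resourcemanager.hostname.<rm-id>` instead may legitimately leave these unset.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration><property> XML into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def rm_ids_without_webapp_address(props):
    """Return the HA rm-ids that have no yarn.resourcemanager.webapp.address.<rm-id>."""
    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [i.strip() for i in raw_ids.split(",") if i.strip()]
    prefix = "yarn.resourcemanager.webapp.address."
    return [i for i in rm_ids if not props.get(prefix + i)]


# Hypothetical usage (the path is a placeholder):
# with open("/etc/hadoop/conf/yarn-site.xml") as fh:
#     print(rm_ids_without_webapp_address(load_properties(fh.read())))
```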
06-02-2015
02:32 PM
Hi Linou, Were you able to resolve your issue? I still have not been able to solve it. Thanks
06-01-2015
09:37 AM
Hi Wilfred, We are on 5.3.3 now, and we are still having that issue. Also, I did try the yarn command with the -appOwner option, but it still returned the same message. I ran the command as the yarn user. Thanks
05-28-2015
02:56 PM
Hi Wilfred, I really don't want to disable the ACL. I only tried disabling it to see whether it would help resolve the issue. I prefer to get the application log through the UI instead of logging in to the host to retrieve the logs, because not everyone has access to the host. By the way, I just tried to retrieve the log as you suggested by issuing the command below, but it still did not work.

yarn logs -applicationId application_1432831904896_0149

I got this message:

15/05/28 14:51:17 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm54
Application has not completed. Logs are only available after an application completes
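Since the message says logs are only available once the application completes, a small check of the application state before calling `yarn logs` may help. The sketch below assumes the ResourceManager REST endpoint `/ws/v1/cluster/apps/{appid}` is reachable on the RM web address and returns JSON with an `app` object holding `state` and `finalStatus`. The function names are hypothetical, and the RM address in the usage comment is a placeholder.

```python
import json
from urllib.request import urlopen

# YARN application states in which the application is no longer running.
COMPLETED_STATES = {"FINISHED", "FAILED", "KILLED"}


def app_info_url(rm_http_address, application_id):
    """Build the RM REST URL for a single application."""
    return "http://{}/ws/v1/cluster/apps/{}".format(rm_http_address, application_id)


def parse_app_state(body):
    """Extract (state, finalStatus) from the JSON body of the app info response."""
    app = json.loads(body)["app"]
    return app["state"], app.get("finalStatus")


def has_completed(state):
    """True if the application has completed, so aggregated logs can exist."""
    return state in COMPLETED_STATES


def fetch_app_state(rm_http_address, application_id, timeout=10):
    """Query the RM over HTTP and return (state, finalStatus)."""
    url = app_info_url(rm_http_address, application_id)
    with urlopen(url, timeout=timeout) as resp:
        return parse_app_state(resp.read().decode("utf-8"))


# Hypothetical usage (replace the address with the active RM's web address):
# state, final_status = fetch_app_state("rm-host.example:8088",
#                                       "application_1432831904896_0149")
# print(state, final_status, has_completed(state))
```

A state of RUNNING here would be consistent with the "Application has not completed" message above.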
05-21-2015
09:36 AM
Hi Wilfred, Have you had a chance to take a look at my update? Thanks
05-18-2015
09:04 AM
There were not many errors in the ResourceManager log. Whenever I click on the child URL, messages like the ones below appear in the RM log. I ran the job as a non-oozie user, but it also happens when I run it as the oozie user.

The link I tried to access: http://i-802bd856.prod-dis11.aws1:8088/proxy/application_1431873888015_3748

Here is the partial log:

2015-05-18 08:41:37,237 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://i-922ad944.prod-dis11.aws1:33802/ws/v1/mapreduce/jobs/job_1431873888015_3748 which is the app master GUI of application_1431873888015_3748 owned by inventory
2015-05-18 08:41:37,237 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://i-9f2ad949.prod-dis11.aws1:47581/ws/v1/mapreduce/jobs/job_1431873888015_3758 which is the app master GUI of application_1431873888015_3758 owned by oozie
2015-05-18 08:41:43,844 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1431873888015_3758_000001 with final state: FINISHING, and exit status: -1000
2015-05-18 08:41:43,846 INFO org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet: dr.who is accessing unchecked http://i-922ad944.prod-dis11.aws1:33802/ which is the app master GUI of application_1431873888015_3748 owned by inventory
2015-05-18 08:41:43,846 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1431873888015_3758_000001 State change from RUNNING to FINAL_SAVING
2015-05-18 08:41:43,846 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1431873888015_3758 with final state: FINISHING
2015-05-18 08:41:43,848 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore: Watcher event type: NodeDataChanged with state:SyncConnected for path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1431873888015_3758/appattempt_1431873888015_3758_000001 for Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED

I found other people who had a similar message in the log about dr.who, and they were able to resolve it by changing the hadoop.http.staticuser.user property or disabling the ACL. I tried disabling it by setting yarn.acl.enable = false, but it did not help.
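To see how often the proxy is hit by a user other than the application owner, the WebAppProxyServlet lines can be pulled out of the RM log with a short script. This is only a sketch that assumes the message format shown in the log above; the function names are hypothetical. As far as I know, dr.who is the default value of hadoop.http.staticuser.user, so these entries would correspond to unauthenticated web requests.

```python
import re
from collections import Counter

# Matches the WebAppProxyServlet access message format seen in the RM log.
PROXY_LINE = re.compile(
    r"WebAppProxyServlet: (?P<user>\S+) is accessing unchecked (?P<url>\S+) "
    r"which is the app master GUI of (?P<app>application_\d+_\d+) "
    r"owned by (?P<owner>\S+)"
)


def proxy_accesses(log_text):
    """Return one dict (user, url, app, owner) per proxy access found in the log text."""
    return [m.groupdict() for m in PROXY_LINE.finditer(log_text)]


def mismatched_users(accesses):
    """Count (accessing user, owner) pairs where the accessing user is not the owner."""
    return Counter(
        (a["user"], a["owner"]) for a in accesses if a["user"] != a["owner"]
    )


# Hypothetical usage (the log path is a placeholder):
# with open("/var/log/hadoop-yarn/resourcemanager.log") as fh:
#     for pair, count in mismatched_users(proxy_accesses(fh.read())).items():
#         print(pair, count)
```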
05-13-2015
08:21 PM
I was able to fix the issue. We use Chef to set up the HBase configuration on the worker nodes, but a problem with the Chef settings left the HBase connection settings missing on the worker nodes. After I fixed Chef, the HBase connection was set up correctly. Thanks
05-13-2015
08:16 PM
Hi Wilfred, Yes, it is logged in the RM, and I do have HA set up for the RM. I have HA set up for HDFS as well. The job did show up on the RM. Below is a screenshot of the RM UI page, which lists all running jobs. When I clicked on the ApplicationMaster link for each job under the Tracking UI column, the error page was shown instead of the actual status page. I also got the same error when I clicked the search icon for Child Job 1 under Child Job URLs on the Oozie UI. Here is a screenshot of the Oozie page. Here is the screenshot of the error page: I could get to the status page fine after the job completed, but not while it was still running. Thank you very much for your help.