Member since: 02-11-2016
Posts: 11
Kudos Received: 2
Solutions: 0
06-29-2017 06:19 PM
@Kshitij Badani The cluster is completely free and there are no other jobs running. We have about 1 TB of memory, 256 vcores, and 8 data nodes. The Livy server log has the contents below; the predominant message is "ERROR RSCClient: Failed to connect to context". Please let us know your thoughts.

17/06/29 14:02:11 INFO InteractiveSession$: Creating LivyClient for sessionId: 135
17/06/29 14:02:11 WARN RSCConf: Your hostname, ip-10-228-2-223.ec2.internal, resolves to a loopback address, but we couldn't find any external IP address!
17/06/29 14:02:11 WARN RSCConf: Set livy.rsc.rpc.server.address if you need to bind to another address.
17/06/29 14:02:11 INFO InteractiveSessionManager: Registering new session 135
17/06/29 14:02:12 INFO ContextLauncher: 17/06/29 14:02:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/29 14:02:12 INFO ContextLauncher: Exception in thread "main" java.lang.IllegalArgumentException: For input string: "yes"
17/06/29 14:02:12 INFO ContextLauncher: at scala.collection.immutable.StringLike$class.parseBoolean(StringLike.scala:238)
17/06/29 14:02:12 INFO ContextLauncher: at scala.collection.immutable.StringLike$class.toBoolean(StringLike.scala:226)
17/06/29 14:02:12 INFO ContextLauncher: at scala.collection.immutable.StringOps.toBoolean(StringOps.scala:31)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.SparkConf$$anonfun$getBoolean$2.apply(SparkConf.scala:337)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.SparkConf$$anonfun$getBoolean$2.apply(SparkConf.scala:337)
17/06/29 14:02:12 INFO ContextLauncher: at scala.Option.map(Option.scala:145)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.SparkConf.getBoolean(SparkConf.scala:337)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.util.Utils$.isDynamicAllocationEnabled(Utils.scala:2283)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.yarn.ClientArguments.<init>(ClientArguments.scala:56)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1185)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.yarn.Client.main(Client.scala)
17/06/29 14:02:12 INFO ContextLauncher: at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
17/06/29 14:02:12 INFO ContextLauncher: at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
17/06/29 14:02:12 INFO ContextLauncher: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
17/06/29 14:02:12 INFO ContextLauncher: at java.lang.reflect.Method.invoke(Method.java:498)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:163)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit$$anon$1.run(SparkSubmit.scala:161)
17/06/29 14:02:12 INFO ContextLauncher: at java.security.AccessController.doPrivileged(Native Method)
17/06/29 14:02:12 INFO ContextLauncher: at javax.security.auth.Subject.doAs(Subject.java:422)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:161)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
17/06/29 14:02:12 INFO ContextLauncher: at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/06/29 14:02:12 WARN ContextLauncher: Child process exited with code 1.
17/06/29 14:02:12 ERROR RSCClient: Failed to connect to context.
java.io.IOException: Child process exited with code 1.
at com.cloudera.livy.rsc.ContextLauncher$ChildProcess$1.run(ContextLauncher.java:416)
at com.cloudera.livy.rsc.ContextLauncher$ChildProcess$2.run(ContextLauncher.java:490)
at java.lang.Thread.run(Thread.java:745)
17/06/29 14:02:12 INFO RSCClient: Failing pending job 2889f619-dc65-4364-9203-c1caf581ea7e due to shutdown.
17/06/29 14:02:12 INFO InteractiveSession: Failed to ping RSC driver for session 135. Killing application.
17/06/29 14:02:12 INFO InteractiveSession: Stopping InteractiveSession 135...
17/06/29 14:03:11 ERROR SparkYarnApp: Error whiling refreshing YARN state: java.lang.Exception: No YARN application is found with tag livy-session-135-nl80pq2a in 60 seconds. Please check your cluster status, it is may be very busy.
17/06/29 14:03:11 INFO InteractiveSession: Stopped InteractiveSession 135.
17/06/29 14:03:11 WARN InteractiveSession: (Fail to get rsc uri,java.util.concurrent.ExecutionException: java.io.IOException: Child process exited with code 1.)
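For context on the IllegalArgumentException in the log above: Scala's String.toBoolean, which SparkConf.getBoolean calls, accepts only "true" and "false", and the trace runs through Utils.isDynamicAllocationEnabled, so the most likely offender is a dynamic-allocation boolean spelled "yes" in one of the configs that Livy's spark-submit picks up. A minimal check (the paths are typical HDP defaults and are an assumption; adjust for your layout):

    # Look for boolean confs spelled "yes"/"no" in the configs used at submit time
    # (hypothetical paths; adjust to where spark-defaults.conf and livy.conf live).
    grep -nE '(enabled|Enabled)[[:space:]=]+(yes|no)' \
        /etc/spark/conf/spark-defaults.conf \
        /etc/livy/conf/livy.conf 2>/dev/null

    # Scala's toBoolean accepts only "true"/"false", so rewrite, for example:
    #   spark.dynamicAllocation.enabled yes
    # as:
    #   spark.dynamicAllocation.enabled true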
06-29-2017 07:26 AM
@Kshitij Badani I work with Jayadeep; we have made sure of all the points that you mentioned. For point 5), we are able to use Livy commands from the command prompt. Output:

* upload completely sent off: 81 out of 81 bytes
< HTTP/1.1 201 Created
< Date: Thu, 29 Jun 2017 07:10:37 GMT
< WWW-Authenticate: Negotiate YGoGCSqGSIb3EgECAgIAb1swWaADAgEFoQMCAQ+iTTBLoAMCARKiRARCwjfJg+Z8lYE1nmmiIPQB0gb3flO96lTm/elABws1vT02CKl+KcHkCHUObklGVgZwebtCN73AhZSQy60+d2LnYdWG
< Set-Cookie: hadoop.auth="u=talend&p=talend@TRANSPORTATION-HDPDEV.GE.COM&t=kerberos&e=1498756237672&s=Kkj7P3Ig2g06wogRIzZQimhX1gQ="; HttpOnly
< Content-Type: application/json; charset=UTF-8
< Location: /batches/34
< Content-Length: 100
< Server: Jetty(9.2.16.v20160414)
<
* Closing connection 0
{"id":34,"state":"starting","appId":null,"appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":[]}

[talend@ip-10-235-3-142 ~]$ curl -u: --negotiate -H "X-Requested-By: user" http://10.228.3.142:9889/batches/34
{"id":34,"state":"dead","appId":"application_1498720050151_0001","appInfo":{"driverLogUrl":"http://ip-10-235-0-154.ec2.internal:8188/applicationhistory/logs/ip-10-228-1-148.ec2.internal:45454/container_e82_1498720050151_0001_02_000001/container_e82_1498720050151_0001_02_000001/talend","sparkUiUrl":"http://ip-10-235-2-223.ec2.internal:8088/proxy/application_1498720050151_0001/"},"log":["\t ApplicationMaster RPC port: -1","\t queue: default","\t start time: 1498720241709","\t final status: UNDEFINED","\t tracking URL: http://10.228.3.142:9889/batches/34 user: talend","17/06/29 03:10:41 INFO ShutdownHookManager: Shutdown hook called","17/06/29 03:10:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-c3c670df-280a-46e0-82fd-7ecc4efc5ef2","YARN Diagnostics:","User application exited with status 1"]}
The problem occurs only with Livy from Zeppelin. The command we are trying in Livy is:

%livy.pyspark
print ("Hello")

The log we are getting is:

org.apache.zeppelin.livy.LivyException: Session 24 is finished, appId: null, log: [java.lang.Exception: No YARN application is found with tag livy-session-24-mbc0jh8y in 60 seconds. Please check your cluster status, it is may be very busy., com.cloudera.livy.utils.SparkYarnApp.com$cloudera$livy$utils$SparkYarnApp$getAppIdFromTag(SparkYarnApp.scala:182) com.cloudera.livy.utils.SparkYarnApp$anonfun$1$anonfun$4.apply(SparkYarnApp.scala:248) com.cloudera.livy.utils.SparkYarnApp$anonfun$1$anonfun$4.apply(SparkYarnApp.scala:245) scala.Option.getOrElse(Option.scala:120) com.cloudera.livy.utils.SparkYarnApp$anonfun$1.apply$mcV$sp(SparkYarnApp.scala:245) com.cloudera.livy.Utils$anon$1.run(Utils.scala:95)]
	at org.apache.zeppelin.livy.BaseLivyInterprereter.createSession(BaseLivyInterprereter.java:221)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.initLivySession(BaseLivyInterprereter.java:110)
	at org.apache.zeppelin.livy.BaseLivyInterprereter.open(BaseLivyInterprereter.java:92)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:483)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Can you please help us get this working?
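For what it's worth, the session's own startup log usually shows the underlying spark-submit failure. A minimal check, reusing the Livy host, port, and Kerberos negotiation from the curl calls above (session id 24 is taken from the error):

    # Fetch the failed interactive session's state and its startup log.
    curl -u: --negotiate -H "X-Requested-By: user" http://10.228.3.142:9889/sessions/24
    curl -u: --negotiate -H "X-Requested-By: user" http://10.228.3.142:9889/sessions/24/log

If the YARN application is merely slow to register, the 60-second window in the error should correspond to livy.server.yarn.app-lookup-timeout in livy.conf, which can be raised (assuming this Livy build exposes that key).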
06-01-2017 02:10 PM
After making the changes, I had to restart the Ambari server and agents for the cache to be updated.
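The restarts in question are the standard Ambari service commands, run as root on the Ambari server host and on each agent host respectively:

    ambari-server restart
    ambari-agent restart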
05-18-2017 01:29 PM
The gateway.log looks like this:

2017-05-18 09:27:40,061 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
2017-05-18 09:27:40,065 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
2017-05-18 09:27:40,088 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
2017-05-18 09:27:40,096 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
2017-05-18 09:27:40,100 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
2017-05-18 09:27:40,102 INFO hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
05-18-2017 12:01 PM
@Sandeep More please let me know if you need further information.
05-18-2017 10:11 AM
The Knox logs look like this.

For JobHistory:

17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI||||access|uri|/gateway/default/jobhistory/joblogs/ip-10-228-1-83.ec2.internal:45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa|unavailable|Request method: GET
17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI|guest|||authentication|uri|/gateway/default/jobhistory/joblogs/ip-10-228-1-83.ec2.internal:45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa|success|
17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI|guest|||authentication|uri|/gateway/default/jobhistory/joblogs/ip-10-228-1-83.ec2.internal:45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa|success|Groups: []
17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI|guest|||authorization|uri|/gateway/default/jobhistory/joblogs/ip-10-228-1-83.ec2.internal:45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa|success|
17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI|guest|||dispatch|uri|http://ip-10-228-3-43.ec2.internal:19888/jobhistory/logs/ip-10-228-1-83.ec2.internal%3A45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa/?user.name=guest|unavailable|Request method: GET
17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI|guest|||dispatch|uri|http://ip-10-228-3-43.ec2.internal:19888/jobhistory/logs/ip-10-228-1-83.ec2.internal%3A45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa/?user.name=guest|success|Response status: 200
17/05/18 04:27:56 ||84a60c35-a082-4c94-82df-12e26cdea8bc|audit|JOBHISTORYUI|guest|||access|uri|/gateway/default/jobhistory/joblogs/ip-10-228-1-83.ec2.internal:45454/container_e19_1495094385014_0002_01_000002/attempt_1495094385014_0002_m_000000_0/ambari-qa|success|Response status: 200

For YARN:

17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI||||access|uri|/gateway/default/yarn/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr|unavailable|Request method: GET
17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI|guest|||authentication|uri|/gateway/default/yarn/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr|success|
17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI|guest|||authentication|uri|/gateway/default/yarn/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr|success|Groups: []
17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI|guest|||authorization|uri|/gateway/default/yarn/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr|success|
17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI|guest|||dispatch|uri|http://10.228.2.223:8088/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr?user.name=guest|unavailable|Request method: GET
17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI|guest|||dispatch|uri|http://10.228.2.223:8088/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr?user.name=guest|success|Response status: 404
17/05/18 06:05:53 ||488b9b3c-885d-4b79-a035-385ccded234d|audit|YARNUI|guest|||access|uri|/gateway/default/yarn/node/containerlogs/container_e19_1495094385014_0004_01_000690/hive/stderr|success|Response status: 404

(YARN browser and JobHistory UI screenshots were attached here.)
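As an aside, these audit lines are pipe-delimited (timestamp, correlation id, service, user, action, resource type, resource, outcome, message), so a throwaway filter like the sketch below (field positions inferred from the lines above, filename assumed to be Knox's gateway-audit.log) pulls out just the dispatch targets and their outcomes. Note that the YARNUI dispatch above went to port 8088 and came back 404.

    # Print the backend URL and outcome message for every dispatch event.
    awk -F'|' '$9 == "dispatch" {print $11, $13}' gateway-audit.log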
05-17-2017 06:06 PM
I am using Knox 0.12.0 and have configured a topology so that the YARN UI is served through the Knox URL. The initial web page loads and the YARN applications are visible, but when I click on a logs link I get an error like "Problem accessing /node/containerlogs/container_e17_1495012138865_0004_01_000001/hive/stderr. Reason: Not Found". How can I display the YARN logs in the browser through Knox? The JobHistory server behaves similarly; its error is "Cannot get container logs. Invalid nodeId: datahost.ec2.internal%3A45454".
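One hypothetical sanity check: request the same log from the JobHistory server directly, bypassing Knox, with the colon in the nodeId decoded (the placeholders stand in for the actual host, container, attempt, and user):

    curl "http://<jobhistory-host>:19888/jobhistory/logs/datahost.ec2.internal:45454/<container-id>/<attempt-id>/<user>"

If that direct request succeeds while the Knox URL fails with "Invalid nodeId: datahost.ec2.internal%3A45454", the %3A is reaching the backend still percent-encoded, which points at a gateway rewrite/encoding problem rather than missing logs.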
Labels:
- Apache Knox
- Apache YARN
03-13-2017 01:16 PM · 1 Kudo
Hello, I followed the steps given in this link: https://community.hortonworks.com/content/supportkb/49488/how-to-change-yarn-quick-links-url-in-ambari.html, but the quick links still point to the old (non-Knox) URLs. Can you please give me the steps to configure the quick links in Ambari to point to the Knox URLs (for JobHistory, Spark History, HDFS, Oozie, etc.)? Thanks, Sanjeev
Labels:
- Apache Knox