Member since: 01-29-2018
Posts: 20
Kudos Received: 3
Solutions: 0
04-12-2019
02:04 AM
Dear Romainr: I followed your advice to modify the source code, and the Hue 3.9 notebook now works on my Kerberized CDH 5.7.1 cluster with Livy 0.5.0. I can use spark-shell from the notebook and it runs well. However, on YARN the Spark job user is always livy (I set livy as a hadoop.proxyuser); it seems the job is bound to the user of the livy-server launcher keytab, so I cannot control notebook authorization through Hue. I can see that Hue sets the proxy user on the token, but the Spark job's user is not reset.

livy-conf
livy.impersonation.enabled=true
livy.repl.enable-hive-context=true
livy.spark.deploy-mode=client
livy.spark.master=yarn
livy.superusers=hue
livy.server.auth.type=kerberos
livy.server.auth.kerberos.keytab=/etc/security/keytabs/spnego.keytab
livy.server.auth.kerberos.principal=HTTP/xxx.com@xxx.COM
livy.server.launch.kerberos.keytab=/etc/security/keytabs/livy.keytab
livy.server.launch.kerberos.principal=livy/xxx.com@xxx.COM

livy-log
19/04/12 14:48:14 INFO InteractiveSession$: Creating Interactive session 3: [owner: hue, request: [kind: pyspark, proxyUser: Some(baoyong), heartbeatTimeoutInSecond: 0]]
19/04/12 14:48:25 INFO LineBufferedStream: stdout: client token: Token { kind: YARN_CLIENT_TOKEN, service: }
19/04/12 14:48:25 INFO LineBufferedStream: stdout: diagnostics: N/A
19/04/12 14:48:25 INFO LineBufferedStream: stdout: ApplicationMaster host: 192.168.103.166
19/04/12 14:48:25 INFO LineBufferedStream: stdout: ApplicationMaster RPC port: 0
19/04/12 14:48:25 INFO LineBufferedStream: stdout: queue: root.livy
19/04/12 14:48:25 INFO LineBufferedStream: stdout: start time: 1555051701595
19/04/12 14:48:25 INFO LineBufferedStream: stdout: final status: UNDEFINED
19/04/12 14:48:25 INFO LineBufferedStream: stdout: tracking URL: http://bigdata166.xxx.com:8088/proxy/application_1555044848792_0063/
19/04/12 14:48:25 INFO LineBufferedStream: stdout: user: livy
19/04/12 14:48:28 INFO InteractiveSession: Interactive session 3 created [appid: application_1555044848792_0063, owner: hue, proxyUser: None, state: idle, kind: pyspark, info: {driverLogUrl=null, sparkUiUrl=null}]
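For reference, setting livy as a hadoop proxyuser is normally done with entries like the sketch below in core-site.xml (in Cloudera Manager, typically via the core-site.xml safety valve); the wildcard values are examples only and should be restricted to the actual Hue/Livy hosts and user groups:

<property>
  <name>hadoop.proxyuser.livy.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.livy.groups</name>
  <value>*</value>
</property>

HDFS and YARN need to be restarted after changing these entries for the proxy-user settings to take effect.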
01-05-2019
09:44 AM
Hi @lwang, thank you very much for your kind feedback. I'm preparing the CDH and CM upgrade to 5.13.1 in order to avoid this issue. Thanks again! Regards, Alex
12-07-2018
08:35 PM
Hi Alex,

Look for the log-aggregation-related messages in the NodeManager log file on one of the nodes where a container for the application was running. In the normal case you should see messages like:

2018-12-07 20:27:59,994 INFO org.apache.spark.network.yarn.YarnShuffleService: Stopping application application_1544179594403_0020
...
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Application just finished : application_1544179594403_0020
...
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_e06_1544179594403_0020_01_000001. Current good log dirs are /yarn/container-logs

Do you see these messages for the failing application, or do you see an error/exception instead? If you can paste the relevant log for the failing application, I can take a look.

Regards,
Bimal
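(For quickly checking a NodeManager host for these messages, a rough sketch, assuming the default CDH log location under /var/log/hadoop-yarn and using the application ID above as an example:

APP_ID=application_1544179594403_0020
grep "$APP_ID" /var/log/hadoop-yarn/*NODEMANAGER*.log* | grep -E "AppLogAggregatorImpl|YarnShuffleService"

Adjust the log path and file glob to match your installation.)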
05-07-2018
10:46 AM
This indicates that jute.maxbuffer has been exceeded. You can increase it on the command-line (client) side by exporting the following:

export ZKCLI_JVM_FLAGS=-Djute.maxbuffer=4194304

You may also need to confirm in the ZooKeeper service configuration that the jute.maxbuffer size is set to 4 MB as well.

-pd
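(As an illustration only: if the client hitting the limit is, for example, Solr's zkcli.sh, which reads ZKCLI_JVM_FLAGS, a session would look roughly like the following; the host, port and command are placeholders:

export ZKCLI_JVM_FLAGS=-Djute.maxbuffer=4194304
zkcli.sh -zkhost zk-host.example.com:2181 -cmd list

Other ZooKeeper clients may use a different environment variable, such as CLIENT_JVMFLAGS for the stock zkCli.sh.)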
03-31-2018
08:02 AM
ok @pdvorak, just like I thought, thanks a lot for the confirmation! 😉 Kind regards! Alex
02-02-2018
04:45 AM
Hi all, in order to give you a full overview of how to change all the log files related to the Cloudera Manager Agent (v5.12), I want to share my particular situation and how I solved it.

I wanted to modify the default log path for the following log files linked to the Cloudera Manager Agent:

- supervisord.out
- supervisord.log (these are the supervisor's logs; this service starts together with the Cloudera Manager Agent service the first time the server boots, or when we start the cloudera-scm-agent service manually. If you stop the Cloudera Manager Agent service but the server stays up, this service keeps running.)
- cmf_listener.log (this is the cmf_listener service's log; this service also starts together with the Cloudera Manager Agent service the first time the server boots, or when we start the cloudera-scm-agent service manually. If you stop the Cloudera Manager Agent service but the server stays up, this service keeps running as well, managed by supervisord.)
- cloudera-scm-agent.log
- cloudera-scm-agent.out (these are the cloudera-scm-agent logs)

Below are the files that I modified in order to point these logs at a new path on a dedicated file system:

- /usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.2-py2.7.egg/lib/cmf/agent.py (set a new path for supervisord.out, cmf_listener.log and supervisord.log. Also check that the agent library parameter is present, default_lib_dir = '/var/lib/cloudera-scm-agent', to avoid unexpected errors when supervisord starts up.)
- /usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.12.2-py2.7.egg/cmf/agent.py (same changes as above: new path for supervisord.out, cmf_listener.log and supervisord.log, and check default_lib_dir = '/var/lib/cloudera-scm-agent'.)
- /etc/default/cloudera-scm-agent (pass the new agent log path as an argument and keep the lib dir, both in the same variable: CMF_AGENT_ARGS="--logdir=/your/custom/cloudera-scm/user/writable/directory/ --lib_dir=/var/lib/cloudera-scm-agent"; keeping --lib_dir avoids unexpected errors when cloudera-scm-agent starts up, as suggested by Harsh 😉.)
- /etc/cloudera-scm-agent/config.ini (set a new path for the agent logs: log_file=/your/custom/cloudera-scm/user/writable/directory/ and also lib_dir=/var/lib/cloudera-scm-agent, again to avoid unexpected errors when cloudera-scm-agent starts up.)
- /etc/init.d/cloudera-scm-agent (set a new path for cloudera-scm-agent.out by modifying the following parameter: AGENT_OUT=${CMF_VAR:-/var}/log/cloudera/$prog/$prog.out)

In my case I left the logs under /var/log, but I mounted a dedicated file system on /var/log/cloudera/ for all these logs.

If you want to stop all the processes related to the Cloudera Manager Agent, perform the following steps:

service cloudera-scm-agent stop (stop the Cloudera Manager Agent process)
ps -eaf | grep cmf (show the supervisord parent process)
root 77977 1 0 13:09 ? 00:00:00 /usr/lib64/cmf/agent/build/env/bin/python /usr/lib64/cmf/agent/build/env/bin/supervisord
root 77983 77977 0 13:09 ? 00:00:00 python2.7 /usr/lib64/cmf/agent/build/env/bin/cmf-listener -l /var/log/cloudera/cloudera-scm-agent/cmf_listener.log /run/cloudera-scm-agent/events
kill -15 77977 (stop the supervisord and cmf_listener processes)

I used kill -15 because I didn't find another way to stop these processes... I would really appreciate it if anyone from Cloudera Engineering could confirm or amend the procedure I described above... Cheers! Alex
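(For reference, the same stop sequence can be scripted without hard-coding the supervisord PID; this is only a rough sketch, and the pgrep pattern assumes the binary path shown in the ps output above:

# Stop the Cloudera Manager Agent itself
service cloudera-scm-agent stop

# Find the supervisord parent process started from the agent's bundled env
SUPERVISORD_PID=$(pgrep -f '/usr/lib64/cmf/agent/build/env/bin/supervisord')

# Send SIGTERM to supervisord; per the ps output above, cmf-listener runs
# under supervisord and goes down with it
[ -n "$SUPERVISORD_PID" ] && kill -15 "$SUPERVISORD_PID"
)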