Endless INFO Client: Application report for application_xx (state: ACCEPTED) messages

Explorer

After I execute a pyspark command on the master node of a Cloudera Manager 5.4 cluster, I get a set of INFO messages, which are then followed by an endless, constantly repeating list of INFO messages like the following:

15/09/02 14:45:17 INFO Client: Application report for application_1441188100451_0007 (state: ACCEPTED)
15/09/02 14:45:18 INFO Client: Application report for application_1441188100451_0007 (state: ACCEPTED)

 

And that is it: I never get a chance to start typing commands, because those messages never end.

Any ideas?

 

The full log trace is below:

 

[hdfs@master root]$ pyspark
Python 2.7.6 (default, Sep 2 2015, 13:59:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/jars/avro-tools-1.7.6-cdh5.4.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/09/02 14:29:19 INFO SparkContext: Running Spark version 1.3.0
15/09/02 14:29:21 INFO SecurityManager: Changing view acls to: hdfs
15/09/02 14:29:21 INFO SecurityManager: Changing modify acls to: hdfs
15/09/02 14:29:21 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs); users with modify permissions: Set(hdfs)
15/09/02 14:29:21 INFO Slf4jLogger: Slf4jLogger started
15/09/02 14:29:21 INFO Remoting: Starting remoting
15/09/02 14:29:21 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@master:50842]
15/09/02 14:29:21 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@master:50842]
15/09/02 14:29:21 INFO Utils: Successfully started service 'sparkDriver' on port 50842.
15/09/02 14:29:21 INFO SparkEnv: Registering MapOutputTracker
15/09/02 14:29:21 INFO SparkEnv: Registering BlockManagerMaster
15/09/02 14:29:21 INFO DiskBlockManager: Created local directory at /tmp/spark-24485b7c-7ee6-4804-bb61-af5da89bd246/blockmgr-74d2149a-9435-4c74-923e-4e700fb0d01e
15/09/02 14:29:21 INFO MemoryStore: MemoryStore started with capacity 267.3 MB
15/09/02 14:29:22 INFO HttpFileServer: HTTP File server directory is /tmp/spark-431f4eb5-4621-4fc0-a903-ddb27fbb1bde/httpd-1e91e8fc-3738-427d-ad2c-d76a1228ca91
15/09/02 14:29:22 INFO HttpServer: Starting HTTP Server
15/09/02 14:29:22 INFO Server: jetty-8.y.z-SNAPSHOT
15/09/02 14:29:22 INFO AbstractConnector: Started SocketConnector@0.0.0.0:56900
15/09/02 14:29:22 INFO Utils: Successfully started service 'HTTP file server' on port 56900.
15/09/02 14:29:22 INFO SparkEnv: Registering OutputCommitCoordinator
15/09/02 14:29:23 INFO Server: jetty-8.y.z-SNAPSHOT
15/09/02 14:29:23 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/09/02 14:29:23 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/09/02 14:29:23 INFO SparkUI: Started SparkUI at http://master:4040
15/09/02 14:29:24 INFO RMProxy: Connecting to ResourceManager at master/192.168.153.132:8032
15/09/02 14:29:25 INFO Client: Requesting a new application from cluster with 1 NodeManagers
15/09/02 14:29:25 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (1471 MB per container)
15/09/02 14:29:25 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/09/02 14:29:25 INFO Client: Setting up container launch context for our AM
15/09/02 14:29:25 INFO Client: Preparing resources for our AM container
15/09/02 14:29:26 INFO Client: Setting up the launch environment for our AM container
15/09/02 14:29:27 INFO SecurityManager: Changing view acls to: hdfs
15/09/02 14:29:27 INFO SecurityManager: Changing modify acls to: hdfs
15/09/02 14:29:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs); users with modify permissions: Set(hdfs)
15/09/02 14:29:27 INFO Client: Submitting application 6 to ResourceManager
15/09/02 14:29:27 INFO YarnClientImpl: Submitted application application_1441188100451_0006
15/09/02 14:29:28 INFO Client: Application report for application_1441188100451_0006 (state: ACCEPTED)

15/09/02 14:40:59 INFO Client:
client token: N/A
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.hdfs
start time: 1441194058739
final status: UNDEFINED
tracking URL: http://master:8088/proxy/application_1441188100451_0007/
user: hdfs
15/09/02 14:41:00 INFO Client: Application report for application_1441188100451_0007 (state: ACCEPTED)
15/09/02 14:41:01 INFO Client: Application report for application_1441188100451_0007 (state: ACCEPTED)

7 REPLIES

Master Collaborator

That generally means it's still waiting for YARN to allocate an executor, and that in turn usually means you don't have enough free resources in YARN to satisfy the request. Check the number and size of executors you're asking for against the resources available in YARN and the maximum size of any one container your YARN config allows.
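
For example, you could start with a deliberately small request and grow it once you know what the cluster can grant. The exact flags depend on how pyspark is launched on your cluster, and the values below are only illustrative (your log shows a 1471 MB per-container limit, so nothing larger than that will ever be scheduled):

$ pyspark --master yarn-client \
    --num-executors 1 --executor-cores 1 \
    --executor-memory 512m --driver-memory 512m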

Explorer

Sowen, thanks for the quick reply. Any chance you could spell out how to do that in practice, in more concrete terms? I'm a complete novice to the whole Cloudera Manager and YARN universe.

Master Collaborator

How many executors are you requesting, and with how much memory and how many cores? Those are command-line options to spark-submit or spark-shell.

 

In your YARN ResourceManager, have a look at how much memory and how many cores you have access to, and how much is already in use. This is in the web UI.

 

Also look at your YARN configuration, under the Resource Management settings, and in particular the 'container max' settings that control how many cores and how much memory YARN is willing to give to any single container.

 

Together these will tell you whether there's simply a mismatch between what you're asking for in Spark and what you've made available in YARN; the usual property names are listed below.
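
For reference, these are the standard YARN property names behind those settings (Cloudera Manager's labels in the Resource Management section may be worded slightly differently):

# Total resources each NodeManager can hand out
yarn.nodemanager.resource.memory-mb
yarn.nodemanager.resource.cpu-vcores

# Largest single container YARN will grant
yarn.scheduler.maximum-allocation-mb
yarn.scheduler.maximum-allocation-vcores

The ResourceManager web UI (the http://master:8088 address from your log) shows how much of that is currently in use.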

Explorer

OK, thanks! I'll try fiddling with all of those and let you know how it works out. Thanks!

New Contributor

For future googlers, I found I had some applications still running in the background. I could see them with:

yarn application -list

From there I could kill them off with:

yarn application -kill <id>

Run it once for each running app; I had two. For example:

yarn application -kill application_1468772861928_0014

Then I was able to get back to work.
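
If there are several stuck applications, a small shell loop can do the same thing in one pass. This is just a sketch; check the list yourself before killing anything you care about:

# Show only applications that are still holding (or waiting on) resources
yarn application -list -appStates RUNNING,ACCEPTED

# Kill each of them by ID
for app in $(yarn application -list -appStates RUNNING,ACCEPTED 2>/dev/null | awk '/^application_/ {print $1}'); do
    yarn application -kill "$app"
done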

 

New Contributor

Hi, were you able to resolve this issue? I'm facing the same problem. (I use a single machine with the Cloudera VirtualBox VM for learning purposes.)

New Contributor

For me it worked using --master local

 

$ pyspark --master local
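
Local mode runs the driver and executors in a single process on the machine you launch from, so the job never goes through YARN at all. It's a handy way to confirm that Spark itself works while you sort out the YARN resource issue, and you can give it more worker threads if you like, for example:

$ pyspark --master local[2]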