Created on 11-12-2018 03:39 AM - edited 09-16-2022 06:53 AM
Hi,
I ran into the error below while launching pyspark. I would appreciate any help with this.
-sh-4.1$ pyspark
Python 2.6.6 (r266:84292, Aug 18 2016, 08:36:59)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
18/11/12 11:20:25 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1537380011359_1185346 to YARN : Application rejected by queue placement policy
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:257)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:148)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:157)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:542)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:744)
18/11/12 11:20:26 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
18/11/12 11:20:26 ERROR util.Utils: Uncaught exception in thread Thread-2
java.lang.NullPointerException
at org.apache.spark.network.shuffle.ExternalShuffleClient.close(ExternalShuffleClient.java:152)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1264)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:96)
at org.apache.spark.SparkContext$$anonfun$stop$12.apply$mcV$sp(SparkContext.scala:1768)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1230)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1767)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:614)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:744)
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.9.1-1.cdh5.9.1.p2671.2910/lib/spark/python/pyspark/shell.py", line 43, in <module>
sc = SparkContext(pyFiles=add_files)
File "/opt/cloudera/parcels/CDH-5.9.1-1.cdh5.9.1.p2671.2910/lib/spark/python/pyspark/context.py", line 115, in __init__
conf, jsc, profiler_cls)
File "/opt/cloudera/parcels/CDH-5.9.1-1.cdh5.9.1.p2671.2910/lib/spark/python/pyspark/context.py", line 172, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "/opt/cloudera/parcels/CDH-5.9.1-1.cdh5.9.1.p2671.2910/lib/spark/python/pyspark/context.py", line 235, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "/opt/cloudera/parcels/CDH-5.9.1-1.cdh5.9.1.p2671.2910/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
File "/opt/cloudera/parcels/CDH-5.9.1-1.cdh5.9.1.p2671.2910/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1537380011359_1185346 to YARN : Application rejected by queue placement policy
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:257)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:148)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:157)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:542)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:214)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:744)
Created 11-27-2018 05:03 AM
These lines in the console output indicate that the job was rejected by YARN:
18/11/12 11:20:25 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1537380011359_1185346 to YARN : Application rejected by queue placement policy
YARN determines which resource pool to place jobs in via placement rules in the scheduler configuration (see https://blog.cloudera.com/blog/2016/06/untangling-apache-hadoop-yarn-part-4-fair-scheduler-queue-bas...).
Placement rules are evaluated in the order they are listed. It might be worth checking whether your scheduler configuration contains a <rule name="reject"/> rule and, if so, why the user's pyspark job reached that rule instead of being placed in the appropriate queue by one of the earlier rules.
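
As a rough aid for that check, here is a minimal sketch that lists the placement rules from a Fair Scheduler allocation file in order and reports whether a reject rule is present. The XML in SAMPLE_ALLOCATIONS, the queue name "analytics", and the helper names placement_rules and has_reject_rule are all made up for illustration; your cluster's actual allocation file will differ, so paste its contents in place of the sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical allocation file content, for illustration only. Replace it with
# the contents of the Fair Scheduler allocation file used by your cluster.
SAMPLE_ALLOCATIONS = """
<allocations>
  <queue name="analytics"/>
  <queuePlacementPolicy>
    <rule name="specified" create="false"/>
    <rule name="primaryGroup" create="false"/>
    <rule name="reject"/>
  </queuePlacementPolicy>
</allocations>
"""


def placement_rules(xml_text):
    """Return the placement rules as (name, other_attributes) pairs, in file order."""
    root = ET.fromstring(xml_text)
    policy = root.find("queuePlacementPolicy")
    if policy is None:
        # No explicit policy in the file, so there is nothing to list.
        return []
    rules = []
    for rule in policy.findall("rule"):
        attrs = dict(rule.attrib)
        name = attrs.pop("name", None)
        rules.append((name, attrs))
    return rules


def has_reject_rule(rules):
    """True if any rule in the list is a reject rule."""
    return any(name == "reject" for name, _ in rules)


rules = placement_rules(SAMPLE_ALLOCATIONS)
for position, (name, attrs) in enumerate(rules, start=1):
    print(position, name, attrs)
print("reject rule present:", has_reject_rule(rules))
```

In a policy like the sample, a job submitted without a queue falls through the "specified" rule, and if the user's primary group has no existing queue it falls through "primaryGroup" as well, ending at "reject". If your policy has a "specified" rule, launching with an explicit queue (for example pyspark --queue followed by a queue the user is allowed to use) may be enough for the job to be placed, but that depends on your configuration.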