Created 09-24-2021 01:06 AM
Hi All,
I am new to Cloudera Environment, and i've setting Cloudera with CDSW services but i found error when running spark in session cdsw project , here's the error :
21/09/24 07:56:47 436 ERROR SparkContext: Error initializing SparkContext. java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:892) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566) at org.apache.hadoop.ipc.Client.call(Client.java:1508) at org.apache.hadoop.ipc.Client.call(Client.java:1405) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy15.getClusterMetrics(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:271) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:691) at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:170) at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:170) at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:57) at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:62) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:60) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:191) at org.apache.spark.SparkContext.<init>(SparkContext.scala:511) at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:238) at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:700) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812) at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636) at org.apache.hadoop.ipc.Client.call(Client.java:1452) ... 35 more
Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:892) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566) at org.apache.hadoop.ipc.Client.call(Client.java:1508) at org.apache.hadoop.ipc.Client.call(Client.java:1405) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy15.getClusterMetrics(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:271) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:691) at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:170) at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:170) at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:57) at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:62) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:60) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:191) at org.apache.spark.SparkContext.<init>(SparkContext.scala:511) at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:238) at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:700) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812) at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636) at org.apache.hadoop.ipc.Client.call(Client.java:1452) ... 35 more
Py4JJavaError Traceback (most recent call last) <ipython-input-1-8d9a1f6249ef> in <module> 1 spark = SparkSession.builder \ 2 .master("yarn") \ ----> 3 .appName("cdsw-training") \ 4 .getOrCreate() /opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/session.py in getOrCreate(self) 171 for key, value in self._options.items(): 172 sparkConf.set(key, value) --> 173 sc = SparkContext.getOrCreate(sparkConf) 174 # This SparkContext may be an existing one. 175 for key, value in self._options.items(): /opt/cloudera/parcels/CDH/lib/spark/python/pyspark/context.py in getOrCreate(cls, conf) 367 with SparkContext._lock: 368 if SparkContext._active_spark_context is None: --> 369 SparkContext(conf=conf or SparkConf()) 370 return SparkContext._active_spark_context 371 /opt/cloudera/parcels/CDH/lib/spark/python/pyspark/context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls) 134 try: 135 self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer, --> 136 conf, jsc, profiler_cls) 137 except: 138 # If an error occurs, clean up in order to allow future SparkContext creation: /opt/cloudera/parcels/CDH/lib/spark/python/pyspark/context.py in _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, jsc, profiler_cls) 196 197 # Create the Java SparkContext through Py4J --> 198 self._jsc = jsc or self._initialize_context(self._conf._jconf) 199 # Reset the SparkConf to the one actually used by the SparkContext in JVM. 200 self._conf = SparkConf(_jconf=self._jsc.sc().conf()) /opt/cloudera/parcels/CDH/lib/spark/python/pyspark/context.py in _initialize_context(self, jconf) 306 Initialize SparkContext in function to allow subclass specific initialization 307 """ --> 308 return self._jvm.JavaSparkContext(jconf) 309 310 @classmethod ~/.local/lib/python3.6/site-packages/py4j/java_gateway.py in __call__(self, *args) 1572 answer = self._gateway_client.send_command(command) 1573 return_value = get_return_value( -> 1574 answer, self._gateway_client, None, self._fqn) 1575 1576 for temp_arg in temp_args: ~/.local/lib/python3.6/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name) 326 raise Py4JJavaError( 327 "An error occurred while calling {0}{1}{2}.\n". --> 328 format(target_id, ".", name), value) 329 else: 330 raise Py4JError( Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort at sun.reflect.GeneratedConstructorAccessor5.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:892) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566) at org.apache.hadoop.ipc.Client.call(Client.java:1508) at org.apache.hadoop.ipc.Client.call(Client.java:1405) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) at com.sun.proxy.$Proxy15.getClusterMetrics(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:271) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:431) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362) at com.sun.proxy.$Proxy16.getClusterMetrics(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:691) at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:170) at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:170) at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:57) at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:62) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:60) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:191) at org.apache.spark.SparkContext.<init>(SparkContext.scala:511) at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:238) at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80) at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69) at py4j.GatewayConnection.run(GatewayConnection.java:238) at java.lang.Thread.run(Thread.java:748) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:586) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:700) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:812) at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636) at org.apache.hadoop.ipc.Client.call(Client.java:1452) ... 35 more
Anyone can help me to solve this issue? Is there any miss configuration in my setting?
Thanks n Regards
M R M
Created 09-26-2021 09:24 PM
@RizkyMei I would suggest you to Stop CDSW and then deploy Client configuration from CM > Cluster > Action menu.
This seems Client Configuration not being pushed. After that start CDSW and try again.
Created 09-28-2021 02:52 AM
Hi @GangWar
Thanks for your suggestion.
I've try your suggestion before, still same error show up 'spark content' , any other suggestion?
Thanks n Regards
MRM