06-04-2015
10:05 AM
Hello, I have installed the Spark 1.3.1 binary (pre-built for Hadoop) on my Windows machine. I have set up the appropriate paths (see below), but when I run my script it fails to initialize the context with this error:

Traceback (most recent call last):
  File "C:/Users/N600173/PycharmProjects/Test/test.py", line 30, in <module>
    sc = SparkContext(master="local")
  File "C:\dev\spark-1.3.1-bin-hadoop2.3\python\pyspark\context.py", line 107, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "C:\dev\spark-1.3.1-bin-hadoop2.3\python\pyspark\context.py", line 221, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "C:\dev\spark-1.3.1-bin-hadoop2.3\python\pyspark\java_gateway.py", line 79, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

What does this mean? Note: I am able to run the Spark shell successfully.

Code:

import os
import sys

os.environ['SPARK_HOME'] = "C:/spark-1.2.0-bin-hadoop2.3"
os.environ['JAVA_HOME'] = "C:/jdk1.7.0_45"
sys.path.append("C:/spark-1.2.0-bin-hadoop2.3/python")
sys.path.append("C:/spark-1.2.0-bin-hadoop2.3/python/lib/py4j-0.8.2.1-src.zip")
sys.path.append("C:/spark-1.2.0-bin-hadoop2.3/lib/spark-assembly-1.2.0-hadoop2.3.0.jar")
sys.path.append("C:/jdk1.7.0_45/bin")
from pyspark import SparkContext
sc = SparkContext(master="local")  # <-- This fails!
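This exception means the Java child process that hosts the Py4J gateway exited before it could report its port back to the Python driver. One detail worth checking in the script above: SPARK_HOME and the sys.path entries point at a spark-1.2.0 directory, while the traceback runs out of C:\dev\spark-1.3.1-bin-hadoop2.3, so two different Spark distributions appear to be mixed. A minimal, Spark-free sketch to spot such a mismatch (the spark_versions helper is hypothetical, introduced here for illustration):

```python
import re

def spark_versions(paths):
    """Collect the distinct Spark versions embedded in a list of paths.

    Hypothetical helper: it only parses directory names such as
    'spark-1.2.0-bin-hadoop2.3'; it does not touch the filesystem
    or launch Spark.
    """
    versions = set()
    for p in paths:
        m = re.search(r"spark-(\d+\.\d+\.\d+)", p)
        if m:
            versions.add(m.group(1))
    return versions

# Paths taken from the question: SPARK_HOME vs. the directory in the traceback.
paths = [
    "C:/spark-1.2.0-bin-hadoop2.3",
    "C:/dev/spark-1.3.1-bin-hadoop2.3",
]
print(spark_versions(paths))  # more than one version means mixed installations
```

If the set contains more than one version, the driver script and the installed distribution disagree, and pointing SPARK_HOME and all sys.path entries at the same distribution is the first thing to try.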
Labels:
- Apache Hadoop
- Apache Spark
- Gateway
03-11-2015
11:45 AM
I am trying to use certain functionality from Spark SQL (namely "programmatically specifying a schema", as described in the Spark 1.1.0 documentation), and I am getting the following error:

15/03/10 17:00:16 INFO storage.BlockManagerMaster: Updated info of block broadcast_2_piece0
15/03/10 17:00:16 INFO spark.SparkContext: Created broadcast 2 from broadcast at MSP.scala:52
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.types.StructField.<init>(Ljava/lang/String;Lorg/apache/spark/sql/catalyst/types/DataType;Z)
    at MSP$$anonfun$3.apply(MSP.scala:57)

The code is:

val schemaString = "ATR1,ATTR2,...."
// line 57:
val schema = StructType(
  schemaString.split(",").map(fieldName => StructField(fieldName, StringType, true)))

Why does this fail? I am using Spark 1.2.
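A NoSuchMethodError on the StructField constructor in org.apache.spark.sql.catalyst.types typically indicates that the code was compiled against one Spark version but is running against another in which that class or constructor signature differs; aligning the compile-time dependency with the runtime Spark version is the usual fix. The split-and-map pattern itself is sound. As a Spark-free sketch of the same idea (the "string" literal stands in for StringType, and the True mirrors the nullable argument; the field names are examples, not the real schema):

```python
# Build one (name, type, nullable) descriptor per comma-separated field,
# mirroring the Scala schemaString.split(",").map(...) pattern.
schema_string = "ATTR1,ATTR2,ATTR3"
fields = [(name, "string", True) for name in schema_string.split(",")]
print(fields[0])  # ('ATTR1', 'string', True)
```

Each tuple corresponds to one StructField in the Scala version; the surrounding StructType would simply wrap the whole list.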
Labels:
- Apache Spark