Explorer
Posts: 12
Registered: ‎10-27-2015
Accepted Solution

Spark program in Eclipse

Hi All,

 

I'm trying to create a simple Spark program in Eclipse. Unfortunately, I'm getting an out-of-memory error (Exception in thread "main" java.lang.OutOfMemoryError: PermGen space).

 

Here's the relevant configuration from my eclipse.ini file:


--launcher.XXMaxPermSize
256m
--launcher.defaultAction
openFile
-vmargs
-Xms512m
-Xmx1024m
-XX:+UseParallelGC
-XX:PermSize=8g
-XX:MaxPermSize=10g

 

Run Configurations > Arguments:

-Xmx10g

 

Scala code:

...

val sqlContext = new HiveContext(spark)
sqlContext.sql("SELECT * from sample_csv limit 1")

...

 

Logs :

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/02/29 20:11:46 INFO SparkContext: Running Spark version 1.6.0
16/02/29 20:11:56 INFO SecurityManager: Changing view acls to: Orson
16/02/29 20:11:56 INFO SecurityManager: Changing modify acls to: Orson
16/02/29 20:11:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Orson); users with modify permissions: Set(Orson)
16/02/29 20:11:57 INFO Utils: Successfully started service 'sparkDriver' on port 57135.
16/02/29 20:11:57 INFO Slf4jLogger: Slf4jLogger started
16/02/29 20:11:58 INFO Remoting: Starting remoting
16/02/29 20:11:58 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.181.1:57148]
16/02/29 20:11:58 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 57148.
16/02/29 20:11:58 INFO SparkEnv: Registering MapOutputTracker
16/02/29 20:11:58 INFO SparkEnv: Registering BlockManagerMaster
16/02/29 20:11:58 INFO DiskBlockManager: Created local directory at C:\Users\Orson\AppData\Local\Temp\blockmgr-be56133f-c657-4146-9e19-cfae46545b70
16/02/29 20:11:58 INFO MemoryStore: MemoryStore started with capacity 6.4 GB
16/02/29 20:11:58 INFO SparkEnv: Registering OutputCommitCoordinator
16/02/29 20:11:58 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/02/29 20:11:58 INFO SparkUI: Started SparkUI at http://192.168.181.1:4040
16/02/29 20:11:58 INFO Executor: Starting executor ID driver on host localhost
16/02/29 20:11:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57155.
16/02/29 20:11:58 INFO NettyBlockTransferService: Server created on 57155
16/02/29 20:11:58 INFO BlockManagerMaster: Trying to register BlockManager
16/02/29 20:11:58 INFO BlockManagerMasterEndpoint: Registering block manager localhost:57155 with 6.4 GB RAM, BlockManagerId(driver, localhost, 57155)
16/02/29 20:11:58 INFO BlockManagerMaster: Registered BlockManager
16/02/29 20:12:00 INFO HiveContext: Initializing execution hive, version 1.2.1
16/02/29 20:12:00 INFO ClientWrapper: Inspected Hadoop version: 2.2.0
16/02/29 20:12:00 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.2.0
16/02/29 20:12:00 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/02/29 20:12:00 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/02/29 20:12:00 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/02/29 20:12:00 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/02/29 20:12:00 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/02/29 20:12:00 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/02/29 20:12:00 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/02/29 20:12:00 INFO deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
16/02/29 20:12:00 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:00 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/02/29 20:12:00 INFO ObjectStore: ObjectStore, initialize called
16/02/29 20:12:01 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/02/29 20:12:01 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/02/29 20:12:11 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:11 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/02/29 20:12:13 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:13 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:19 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:19 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:21 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/02/29 20:12:21 INFO ObjectStore: Initialized ObjectStore
16/02/29 20:12:21 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/02/29 20:12:22 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/02/29 20:12:24 WARN : Your hostname, solvento-orson resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:c0a8:4801%42, but we couldn't find any external IP address!
16/02/29 20:12:25 INFO HiveMetaStore: Added admin role in metastore
16/02/29 20:12:25 INFO HiveMetaStore: Added public role in metastore
16/02/29 20:12:26 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/02/29 20:12:26 INFO HiveMetaStore: 0: get_all_databases
16/02/29 20:12:26 INFO audit: ugi=Orson ip=unknown-ip-addr cmd=get_all_databases
16/02/29 20:12:26 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/02/29 20:12:26 INFO audit: ugi=Orson ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/02/29 20:12:26 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/29 20:12:28 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874_resources
16/02/29 20:12:28 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874
16/02/29 20:12:28 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/Orson/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874
16/02/29 20:12:28 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874/_tmp_space.db
16/02/29 20:12:28 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:28 INFO HiveContext: default warehouse location is /user/hive/warehouse
16/02/29 20:12:28 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/02/29 20:12:28 INFO ClientWrapper: Inspected Hadoop version: 2.2.0
16/02/29 20:12:28 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.2.0
16/02/29 20:12:29 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/02/29 20:12:29 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/02/29 20:12:29 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/02/29 20:12:29 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/02/29 20:12:29 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/02/29 20:12:29 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/02/29 20:12:29 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/02/29 20:12:29 INFO deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
16/02/29 20:12:29 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:29 INFO metastore: Trying to connect to metastore with URI thrift://quickstart.cloudera:9083
16/02/29 20:12:29 INFO metastore: Connected to metastore.
16/02/29 20:12:29 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/ccfc9462-2c5a-49ce-a811-503694353c1a_resources
16/02/29 20:12:30 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/ccfc9462-2c5a-49ce-a811-503694353c1a
16/02/29 20:12:30 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/Orson/ccfc9462-2c5a-49ce-a811-503694353c1a
16/02/29 20:12:30 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/ccfc9462-2c5a-49ce-a811-503694353c1a/_tmp_space.db
16/02/29 20:12:30 INFO ParseDriver: Parsing command: SELECT * from sample_csv limit 1
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.<init>(HiveParser_IdentifiersParser.java:12377)
at org.apache.hadoop.hive.ql.parse.HiveParser.<init>(HiveParser.java:706)
at org.apache.hadoop.hive.ql.parse.HiveParser.<init>(HiveParser.java:700)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:195)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.spark.sql.hive.HiveQl$.getAst(HiveQl.scala:276)
at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:303)
at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:137)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:237)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:237)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:217)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:197)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:217)
16/02/29 20:12:32 INFO SparkContext: Invoking stop() from shutdown hook
16/02/29 20:12:32 INFO SparkUI: Stopped Spark web UI at http://192.168.181.1:4040
16/02/29 20:12:32 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/02/29 20:12:32 INFO MemoryStore: MemoryStore cleared
16/02/29 20:12:32 INFO BlockManager: BlockManager stopped
16/02/29 20:12:32 INFO BlockManagerMaster: BlockManagerMaster stopped
16/02/29 20:12:32 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/02/29 20:12:32 INFO SparkContext: Successfully stopped SparkContext
16/02/29 20:12:32 INFO ShutdownHookManager: Shutdown hook called
16/02/29 20:12:32 INFO ShutdownHookManager: Deleting directory C:\Users\Orson\AppData\Local\Temp\spark-7efe7a3c-e47c-41a0-8e94-a1fd19ca7197
16/02/29 20:12:32 INFO ShutdownHookManager: Deleting directory C:\Users\Orson\AppData\Local\Temp\spark-b20f169a-0a3b-426e-985e-6641b3be3fd6
16/02/29 20:12:32 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/02/29 20:12:32 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/02/29 20:12:32 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/02/29 20:12:32 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\Orson\AppData\Local\Temp\spark-b20f169a-0a3b-426e-985e-6641b3be3fd6
java.io.IOException: Failed to delete: C:\Users\Orson\AppData\Local\Temp\spark-b20f169a-0a3b-426e-985e-6641b3be3fd6
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:928)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at scala.util.Try$.apply(Try.scala:191)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:218)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

 

 

Thanks!

Cloudera Employee
Posts: 366
Registered: ‎07-29-2013

Re: Spark program in Eclipse

The error points to the problem -- you may have plenty of heap, but not enough PermGen space in the JVM. Try something like
-XX:MaxPermSize=2g in your JVM options. Two things to note: -Xmx only sizes the heap, which is separate from PermGen, and the -vmargs section of eclipse.ini configures the JVM that runs Eclipse itself, not the applications you launch from it, so the flag needs to go in the VM arguments of your run configuration.
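
For example, in the VM arguments box (Run Configurations > Arguments > VM arguments) -- a sketch; the sizes here are illustrative, adjust them to your machine:

-Xms512m
-Xmx2g
-XX:MaxPermSize=2g
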
Explorer
Posts: 12
Registered: ‎10-27-2015

Re: Spark program in Eclipse

Thanks srowen!
uzi
Explorer
Posts: 13
Registered: ‎05-27-2016

Re: Spark program in Eclipse

Hello guys,

 

When we build a Spark application, we usually export it as a JAR and run it on the cluster. Is there a way to run the application on the cluster directly from Eclipse (with some setting)? That would be very efficient for testing/debugging, so I'm just wondering if there is anything out there.

 

Thanks

Cloudera Employee
Posts: 30
Registered: ‎04-05-2016

Re: Spark program in Eclipse

You don't need to export a JAR for unit testing. You can do:

 

new SparkConf().setMaster("local[2]") and run the program as a regular Java/Scala application in the IDE. Also make sure that you have all the dependent libraries on the classpath.
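
For a concrete sketch (the object name and job here are placeholders; this assumes spark-core 1.x, plus spark-hive if you need HiveContext, is on the classpath), a minimal program you can run directly from Eclipse:

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // "local[2]" runs the driver and executors inside this JVM with 2 worker threads
    val conf = new SparkConf().setAppName("SimpleApp").setMaster("local[2]")
    val sc = new SparkContext(conf)
    // A trivial job to confirm the local setup works end to end
    val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
    println(s"Even count: $evens")
    sc.stop()
  }
}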

uzi
Explorer
Posts: 13
Registered: ‎05-27-2016

Re: Spark program in Eclipse


When we use a Hive context (Hive tables) or Phoenix tables within our Spark application, it is very difficult (as a matter of fact, I think it is impossible without going through pointless installation on the local machine) to run the application locally through Eclipse.

Anyway, I was looking for something like this:

 

http://www.dbengineering.info/2016/09/debug-spark-application-running-on-cloud.html

 

where we are able to run it in debug mode. For the moment I am happy with this. Just sharing in case someone else has the same question I had a few months ago; the gist of the technique is sketched below.
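
For anyone who doesn't want to follow the link: the underlying technique (which, I believe, is what that post describes) is standard JVM remote debugging over JDWP. Launch the driver with the debug agent enabled -- a sketch, where the class and jar names are placeholders:

spark-submit \
  --master yarn-client \
  --driver-java-options "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005" \
  --class com.example.MyApp myapp.jar

Then create an Eclipse debug configuration of type "Remote Java Application" pointing at the driver host and port 5005. With suspend=y the driver waits for the debugger to attach, so breakpoints set in Eclipse are hit.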

 

Thanks

 

New Contributor
Posts: 1
Registered: ‎05-01-2017

Re: Spark program in Eclipse

Hi,

 

I am facing the issue below. Please help me to solve it.

 

17/05/02 11:07:13 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\arpitbh\AppData\Local\Temp\spark-07d9637a-2eb8-4a32-8490-01e106a80d6b
java.io.IOException: Failed to delete: C:\Users\arpitbh\AppData\Local\Temp\spark-07d9637a-2eb8-4a32-8490-01e106a80d6b
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1010)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
