Spark program in eclipse

Explorer

Hi All,

 

I am trying to create a simple Spark program in Eclipse. Unfortunately, I'm getting an out-of-memory error (Exception in thread "main" java.lang.OutOfMemoryError: PermGen space).

 

Here's the configuration of my ini file


--launcher.XXMaxPermSize
256m
--launcher.defaultAction
openFile
-vmargs
-Xms512m
-Xmx1024m
-XX:+UseParallelGC
-XX:PermSize=8g
-XX:MaxPermSize=10g

 

Run configurations > Arguments :

-Xmx10g

 

Scala code :

...

val sqlContext = new HiveContext(spark)
sqlContext.sql("SELECT * from sample_csv limit 1")

...

 

Logs :

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/02/29 20:11:46 INFO SparkContext: Running Spark version 1.6.0
16/02/29 20:11:56 INFO SecurityManager: Changing view acls to: Orson
16/02/29 20:11:56 INFO SecurityManager: Changing modify acls to: Orson
16/02/29 20:11:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Orson); users with modify permissions: Set(Orson)
16/02/29 20:11:57 INFO Utils: Successfully started service 'sparkDriver' on port 57135.
16/02/29 20:11:57 INFO Slf4jLogger: Slf4jLogger started
16/02/29 20:11:58 INFO Remoting: Starting remoting
16/02/29 20:11:58 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.181.1:57148]
16/02/29 20:11:58 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 57148.
16/02/29 20:11:58 INFO SparkEnv: Registering MapOutputTracker
16/02/29 20:11:58 INFO SparkEnv: Registering BlockManagerMaster
16/02/29 20:11:58 INFO DiskBlockManager: Created local directory at C:\Users\Orson\AppData\Local\Temp\blockmgr-be56133f-c657-4146-9e19-cfae46545b70
16/02/29 20:11:58 INFO MemoryStore: MemoryStore started with capacity 6.4 GB
16/02/29 20:11:58 INFO SparkEnv: Registering OutputCommitCoordinator
16/02/29 20:11:58 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/02/29 20:11:58 INFO SparkUI: Started SparkUI at http://192.168.181.1:4040
16/02/29 20:11:58 INFO Executor: Starting executor ID driver on host localhost
16/02/29 20:11:58 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57155.
16/02/29 20:11:58 INFO NettyBlockTransferService: Server created on 57155
16/02/29 20:11:58 INFO BlockManagerMaster: Trying to register BlockManager
16/02/29 20:11:58 INFO BlockManagerMasterEndpoint: Registering block manager localhost:57155 with 6.4 GB RAM, BlockManagerId(driver, localhost, 57155)
16/02/29 20:11:58 INFO BlockManagerMaster: Registered BlockManager
16/02/29 20:12:00 INFO HiveContext: Initializing execution hive, version 1.2.1
16/02/29 20:12:00 INFO ClientWrapper: Inspected Hadoop version: 2.2.0
16/02/29 20:12:00 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.2.0
16/02/29 20:12:00 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/02/29 20:12:00 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/02/29 20:12:00 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/02/29 20:12:00 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/02/29 20:12:00 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/02/29 20:12:00 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/02/29 20:12:00 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/02/29 20:12:00 INFO deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
16/02/29 20:12:00 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:00 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/02/29 20:12:00 INFO ObjectStore: ObjectStore, initialize called
16/02/29 20:12:01 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/02/29 20:12:01 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/02/29 20:12:11 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:11 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/02/29 20:12:13 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:13 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:19 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:19 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:21 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
16/02/29 20:12:21 INFO ObjectStore: Initialized ObjectStore
16/02/29 20:12:21 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/02/29 20:12:22 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/02/29 20:12:24 WARN : Your hostname, solvento-orson resolves to a loopback/non-reachable address: fe80:0:0:0:0:5efe:c0a8:4801%42, but we couldn't find any external IP address!
16/02/29 20:12:25 INFO HiveMetaStore: Added admin role in metastore
16/02/29 20:12:25 INFO HiveMetaStore: Added public role in metastore
16/02/29 20:12:26 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/02/29 20:12:26 INFO HiveMetaStore: 0: get_all_databases
16/02/29 20:12:26 INFO audit: ugi=Orson ip=unknown-ip-addr cmd=get_all_databases
16/02/29 20:12:26 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/02/29 20:12:26 INFO audit: ugi=Orson ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/02/29 20:12:26 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/02/29 20:12:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/02/29 20:12:28 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874_resources
16/02/29 20:12:28 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874
16/02/29 20:12:28 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/Orson/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874
16/02/29 20:12:28 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/0c1b1e0d-5e6c-47b8-a8d5-5398e262c874/_tmp_space.db
16/02/29 20:12:28 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:28 INFO HiveContext: default warehouse location is /user/hive/warehouse
16/02/29 20:12:28 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/02/29 20:12:28 INFO ClientWrapper: Inspected Hadoop version: 2.2.0
16/02/29 20:12:28 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.2.0
16/02/29 20:12:29 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
16/02/29 20:12:29 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
16/02/29 20:12:29 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/02/29 20:12:29 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
16/02/29 20:12:29 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
16/02/29 20:12:29 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
16/02/29 20:12:29 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
16/02/29 20:12:29 INFO deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
16/02/29 20:12:29 WARN HiveConf: HiveConf of name hive.enable.spark.execution.engine does not exist
16/02/29 20:12:29 INFO metastore: Trying to connect to metastore with URI thrift://quickstart.cloudera:9083
16/02/29 20:12:29 INFO metastore: Connected to metastore.
16/02/29 20:12:29 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/ccfc9462-2c5a-49ce-a811-503694353c1a_resources
16/02/29 20:12:30 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/ccfc9462-2c5a-49ce-a811-503694353c1a
16/02/29 20:12:30 INFO SessionState: Created local directory: C:/Users/Orson/AppData/Local/Temp/Orson/ccfc9462-2c5a-49ce-a811-503694353c1a
16/02/29 20:12:30 INFO SessionState: Created HDFS directory: /tmp/hive/Orson/ccfc9462-2c5a-49ce-a811-503694353c1a/_tmp_space.db
16/02/29 20:12:30 INFO ParseDriver: Parsing command: SELECT * from sample_csv limit 1
Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.<init>(HiveParser_IdentifiersParser.java:12377)
at org.apache.hadoop.hive.ql.parse.HiveParser.<init>(HiveParser.java:706)
at org.apache.hadoop.hive.ql.parse.HiveParser.<init>(HiveParser.java:700)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:195)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.spark.sql.hive.HiveQl$.getAst(HiveQl.scala:276)
at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:303)
at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:137)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:237)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:237)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:217)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1$$anonfun$apply$2.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Failure.append(Parsers.scala:197)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$Parser$$anonfun$append$1.apply(Parsers.scala:249)
at scala.util.parsing.combinator.Parsers$$anon$3.apply(Parsers.scala:217)
16/02/29 20:12:32 INFO SparkContext: Invoking stop() from shutdown hook
16/02/29 20:12:32 INFO SparkUI: Stopped Spark web UI at http://192.168.181.1:4040
16/02/29 20:12:32 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/02/29 20:12:32 INFO MemoryStore: MemoryStore cleared
16/02/29 20:12:32 INFO BlockManager: BlockManager stopped
16/02/29 20:12:32 INFO BlockManagerMaster: BlockManagerMaster stopped
16/02/29 20:12:32 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/02/29 20:12:32 INFO SparkContext: Successfully stopped SparkContext
16/02/29 20:12:32 INFO ShutdownHookManager: Shutdown hook called
16/02/29 20:12:32 INFO ShutdownHookManager: Deleting directory C:\Users\Orson\AppData\Local\Temp\spark-7efe7a3c-e47c-41a0-8e94-a1fd19ca7197
16/02/29 20:12:32 INFO ShutdownHookManager: Deleting directory C:\Users\Orson\AppData\Local\Temp\spark-b20f169a-0a3b-426e-985e-6641b3be3fd6
16/02/29 20:12:32 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/02/29 20:12:32 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/02/29 20:12:32 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/02/29 20:12:32 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\Orson\AppData\Local\Temp\spark-b20f169a-0a3b-426e-985e-6641b3be3fd6
java.io.IOException: Failed to delete: C:\Users\Orson\AppData\Local\Temp\spark-b20f169a-0a3b-426e-985e-6641b3be3fd6
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:928)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
at scala.util.Try$.apply(Try.scala:191)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:239)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:218)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

 

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Spark program in eclipse

Master Collaborator
The error points to the problem: you may have plenty of heap memory
but not enough PermGen space in the JVM. Try something like
-XX:MaxPermSize=2g in the JVM options for the executors.
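
One way to check whether a PermGen flag actually reached the JVM that runs the Spark program is to print that JVM's own arguments and memory pool limits. Settings in the Eclipse ini file configure the IDE's JVM, while a launched program takes its flags from the run configuration's VM arguments, and in the question only -Xmx10g is listed there. The sketch below is a hedged illustration using only the standard java.lang.management API; the object name JvmMemoryCheck and its helper methods are made up for this example. On Java 7 the pool of interest is typically reported as "Perm Gen" or "PS Perm Gen"; on Java 8 and later PermGen no longer exists and a "Metaspace" pool appears instead.

```scala
import java.lang.management.ManagementFactory
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Spark or Eclipse.
object JvmMemoryCheck {

  /** Flags passed to this JVM at startup, e.g. -Xmx10g or -XX:MaxPermSize=2g. */
  def jvmFlags: Seq[String] =
    ManagementFactory.getRuntimeMXBean.getInputArguments.asScala.toSeq

  /** Each memory pool's name and maximum size in MB; -1 means no defined maximum. */
  def poolLimitsMb: Seq[(String, Long)] =
    ManagementFactory.getMemoryPoolMXBeans.asScala.toSeq.map { pool =>
      // getUsage may return null for an invalid pool, so wrap it in Option.
      val maxBytes = Option(pool.getUsage).map(_.getMax).getOrElse(-1L)
      (pool.getName, if (maxBytes < 0) -1L else maxBytes / (1024L * 1024L))
    }

  def main(args: Array[String]): Unit = {
    println("JVM flags: " + jvmFlags.mkString(" "))
    poolLimitsMb.foreach { case (name, mb) =>
      println(s"$name: max = $mb MB")
    }
  }
}
```

Running this with the same run configuration as the Spark program should show whether -XX:MaxPermSize appears among the flags and what limit the permanent generation pool ended up with.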

Re: Spark program in eclipse

Explorer
Thanks srowen!

Re: Spark program in eclipse

Explorer

Hello guys,

 

When we build a Spark application, we usually export it as a JAR and run it on the cluster. Is there a way to run the application on the cluster directly from Eclipse (with some setting)? This would be a very efficient way to test and debug, so I'm wondering if there is anything out there.

 

Thanks

Re: Spark program in eclipse

Contributor

You don't need to export a JAR for unit testing. You can set the master with new SparkConf().setMaster("local[2]") and run the program as a regular Java application in the IDE. Also make sure that all the dependent libraries are on the classpath.
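
As a rough sketch of the local-mode setup described above, the program below creates a SparkContext with a local two-thread master and runs a trivial job. It assumes the Spark core library (for example the 1.6.0 version shown in the logs earlier in this thread) is on the project's classpath; the object name, the application name and the summing job are illustrative choices, not something from the original reply.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; run it from the IDE as a normal application.
object LocalModeExample {

  /** Sums the numbers 1..n on a local Spark master; no cluster or JAR export involved. */
  def sumOnLocalSpark(n: Int): Double = {
    val conf = new SparkConf()
      .setAppName("local-mode-example") // illustrative application name
      .setMaster("local[2]")            // master URL is a string: local mode, 2 threads
    val sc = new SparkContext(conf)
    try {
      // Distribute the range, convert to Double, and sum across partitions.
      sc.parallelize(1 to n).map(_.toDouble).sum()
    } finally {
      sc.stop() // always release the context, even if the job fails
    }
  }

  def main(args: Array[String]): Unit =
    println(sumOnLocalSpark(100))
}
```

Because the master is set in code, the same class can later be pointed at a cluster by changing or removing the setMaster call and supplying the master through the launcher instead.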

Re: Spark program in eclipse

Explorer

When we use a Hive context (Hive tables) or Phoenix tables within our Spark application, it is very difficult to run the application locally through Eclipse (as a matter of fact, I think it is impossible without going through a pointless installation on the local machine).

Anyway, I was looking for something like this:

 

http://www.dbengineering.info/2016/09/debug-spark-application-running-on-cloud.html

 

where we are able to run it in debug mode. For the moment I am happy with this. I'm just sharing in case someone else has the same question I had a few months ago.

 

Thanks

 

Re: Spark program in eclipse

New Contributor

Hi,

 

I am facing the issue below. Please help me solve it.

 

17/05/02 11:07:13 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\arpitbh\AppData\Local\Temp\spark-07d9637a-2eb8-4a32-8490-01e106a80d6b
java.io.IOException: Failed to delete: C:\Users\arpitbh\AppData\Local\Temp\spark-07d9637a-2eb8-4a32-8490-01e106a80d6b
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1010)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
