Created on 09-25-2015 08:42 AM - edited 09-16-2022 02:41 AM
Hi everyone,
I have a simple Spark application that uses a few Spring context and rule XML files. All of these files are part of the project and located under the resource folder (resource\db\rule\rule2.xml), and everything works fine in Spark local mode. When I run the same application in YARN cluster mode, it complains that the file rule2.xml cannot be found, even though the file is part of the Maven-built jar. Do I need to make any changes for the application to work in cluster mode? Any help would be appreciated.
Here is the code in which I am reading the XML file (note: the format string should contain a %d placeholder for the id argument):
JaxbUtils.unmarshalRule(
    ByteStreams.toByteArray(
        Resources.getResource(String.format("db/rule/rule%d.xml", id)).openStream()));
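For context, Guava's Resources.getResource resolves a name relative to the classpath root, and String.format only substitutes an argument where the format string has a placeholder. A quick sketch of building the resource name per rule id (assuming files are named rule&lt;id&gt;.xml, as in the later snippets in this thread):

```java
public class RulePath {
    // Build a classpath-relative resource name for a rule id.
    // No leading slash: ClassLoader-based lookups (which Guava's
    // Resources.getResource delegates to) are already rooted at the
    // classpath, so names are given relative to that root.
    static String resourceName(int id) {
        return String.format("db/rule/rule%d.xml", id);
    }

    public static void main(String[] args) {
        System.out.println(resourceName(2)); // prints db/rule/rule2.xml
    }
}
```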
Here is the error log
/24 15:57:07 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/09/24 15:57:07 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@bdaolc011node08.sabre.com:40589/user/HeartbeatReceiver
15/09/24 15:57:09 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/09/24 15:57:09 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/09/24 15:57:09 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/09/24 15:57:09 INFO storage.MemoryStore: ensureFreeSpace(3132) called with curMem=0, maxMem=555755765
15/09/24 15:57:09 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.1 KB, free 530.0 MB)
15/09/24 15:57:09 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/09/24 15:57:09 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 134 ms
15/09/24 15:57:09 INFO storage.MemoryStore: ensureFreeSpace(6144) called with curMem=3132, maxMem=555755765
15/09/24 15:57:09 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.0 KB, free 530.0 MB)
15/09/24 15:57:12 INFO support.ClassPathXmlApplicationContext: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3c6db742: startup date [Thu Sep 24 15:57:12 CDT 2015]; root of context hierarchy
15/09/24 15:57:12 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [spring/rules-engine-spring.xml]
15/09/24 15:57:13 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [spring/ere-spring.xml]
15/09/24 15:57:13 INFO support.DefaultListableBeanFactory: Overriding bean definition for bean 'nativeRuleBuilder': replacing [Generic bean: class [com.sabre.sp.ere.core.loader.DroolsNativeRuleBuilder]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/ere-spring.xml]] with [Generic bean: class [com.sabre.sp.ere.core.loader.DroolsNativeRuleBuilder]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/rules-engine-spring.xml]]
15/09/24 15:57:13 INFO support.DefaultListableBeanFactory: Overriding bean definition for bean 'rulesExecutor': replacing [Generic bean: class [com.sabre.sp.ere.core.executor.DroolsRulesExecutor]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/ere-spring.xml]] with [Generic bean: class [com.sabre.sp.ere.core.executor.DroolsRulesExecutor]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/rules-engine-spring.xml]]
15/09/24 15:57:13 INFO support.PropertySourcesPlaceholderConfigurer: Loading properties file from class path resource [spring/ere-test.properties]
15/09/24 15:57:13 WARN support.PropertySourcesPlaceholderConfigurer: Could not load properties from class path resource [spring/ere-test.properties]: class path resource [spring/ere-test.properties] cannot be opened because it does not exist
15/09/24 15:57:13 INFO support.PropertySourcesPlaceholderConfigurer: Loading properties file from class path resource [spring/ere-spring.properties]
15/09/24 15:57:13 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
15/09/24 15:57:13 INFO jdbc.JDBCRDD: closed connection
java.lang.IllegalArgumentException: resource spring/rule2.xml not found.
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at com.google.common.io.Resources.getResource(Resources.java:152)
at com.sabre.rules.AppRuleExecutor.rule(AppRuleExecutor.java:50)
at com.sabre.rules.AppRuleExecutor.executeRules(AppRuleExecutor.java:39)
at com.sabre.rules.RuleComponent.executeRules(RuleComponent.java:43)
at com.sabre.rules.SMAAlertImpl$1.call(SMAAlertImpl.java:60)
at com.sabre.rules.SMAAlertImpl$1.call(SMAAlertImpl.java:37)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Created 09-25-2015 09:47 AM
Created 09-25-2015 01:34 PM
Thank you for your response. I used --files and am still getting the same error. I copied rule2.xml to the folder from which I am running the Maven-built jar, but my application code still points to /db/rule/rule2.xml. I ran the Spark application as below:
spark-submit --jars vertica-jdbc-7.1.1-3.jar --files rule2.xml#rule2.xml --class "com.sabre.rules.SMAAlertImpl" --master yarn-cluster simple-project-1.0-shaded.jar
I am running the Maven-built jar from /home/../SMA/, and this folder has the following files:
simple-project-1.0-shaded.jar
rule2.xml
vertica-jdbc-7.1.1-3.jar
Here is my application code:
JaxbUtils.unmarshalRule(
    ByteStreams.toByteArray(
        Resources.getResource(String.format("/db/rule/rule%d.xml", id)).openStream()));
Do I need to change my application code? Why am I still getting the error? Let me know if I missed anything.
Created 09-28-2015 08:58 AM
Any update on my issue? I am kind of stuck. It might be a simple fix, but since I am very new to Spark, I don't know how to resolve it. Please let me know.
I even copied the file to the HDFS location /db/rule/rule2.xml and tried the variants below, but none of them worked. rule2.xml is part of the project and is available in the Maven-built jar. To make this work, do I need different application code for local mode and cluster mode?
JaxbUtils.unmarshalRule(
    ByteStreams.toByteArray(
        Resources.getResource(String.format("/db/rule/rule%d.xml", id)).openStream()));

JaxbUtils.unmarshalRule(
    ByteStreams.toByteArray(
        Resources.getResource(String.format("file:///db/rule/rule%d.xml", id)).openStream()));

JaxbUtils.unmarshalRule(
    ByteStreams.toByteArray(
        Resources.getResource(String.format("hdfs:///db/rule/rule%d.xml", id)).openStream()));
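None of these variants can work with Resources.getResource: it delegates to ClassLoader.getResource, which searches only the classpath by a relative name. It neither strips a leading slash nor understands file:// or hdfs:// URL schemes. A small sketch of a classpath lookup that fails loudly instead (the class and method names here are illustrative, not from the original project):

```java
import java.io.IOException;
import java.io.InputStream;

public class ClasspathLookup {
    // ClassLoader.getResourceAsStream treats the name as relative to the
    // classpath root: a leading "/" (or a URL scheme prefix) never
    // matches, so "/db/rule/rule2.xml" always misses, while
    // "db/rule/rule2.xml" would hit a file packaged in the shaded jar.
    static InputStream open(String name) throws IOException {
        InputStream in = ClasspathLookup.class.getClassLoader()
                .getResourceAsStream(name);
        if (in == null) {
            throw new IOException("classpath resource not found: " + name);
        }
        return in;
    }
}
```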
Created 09-28-2015 09:07 AM
Created 09-28-2015 09:35 AM
Would you mind telling me what needs to be changed in my code? As I mentioned, local mode works fine when I read the file like this:
file:///db/rule/rule2.xml
What am I supposed to change in the above code to make it work in cluster mode?
Thanks in advance.
Created 09-28-2015 08:48 PM
With the --files option you put the file in your working directory on the executor.
You are trying to point to the file using an absolute path, which is not what the --files option does for you. Can you use just the name "rule2.xml" and not a path?
When you read the documentation for --files, see the important note at the bottom of the "Running on YARN" page.
Also, do not use Resources.getResource(); just open the file with a plain Java construct like new FileInputStream("rule2.xml") or something like it.
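To illustrate this suggestion, a minimal sketch assuming rule2.xml has been shipped to the executors with --files (so it sits in each YARN container's working directory; the filename is from this thread, the class name is made up for the example):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WorkingDirRead {
    // Files distributed with --files land in the YARN container's
    // working directory on each executor, so a bare relative filename
    // (no leading path) resolves there.
    static byte[] readRuleFile(String name) throws IOException {
        return Files.readAllBytes(Paths.get(name));
    }

    public static void main(String[] args) throws IOException {
        // Simulate the shipped file for a local dry run.
        Files.write(Paths.get("rule2.xml"), "<rule/>".getBytes());
        System.out.println(readRuleFile("rule2.xml").length); // prints 7
    }
}
```

The bytes can then be passed to JaxbUtils.unmarshalRule in place of the Resources.getResource call.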
Wilfred