<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128545#M91242</link>
    <description>&lt;P&gt;This might help: &lt;A href="https://community.hortonworks.com/questions/30288/oozie-spark-action-on-hdp-24-nosuchmethoderror-org.html" target="_blank"&gt;https://community.hortonworks.com/questions/30288/oozie-spark-action-on-hdp-24-nosuchmethoderror-org.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 15 Jul 2016 18:46:36 GMT</pubDate>
    <dc:creator>bwalter1</dc:creator>
    <dc:date>2016-07-15T18:46:36Z</dc:date>
    <item>
      <title>Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128540#M91237</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We have installed HDP-2.4.0.0. As per the requirement, I need to configure an Oozie job with a Spark action.&lt;/P&gt;&lt;P&gt;I have written the code.&lt;/P&gt;&lt;P&gt;Workflow.xml:&lt;/P&gt;&lt;PRE&gt;&amp;lt;?xml version="1.0"?&amp;gt;
&amp;lt;workflow-app name="${OOZIE_WF_NAME}" xmlns="uri:oozie:workflow:0.5"&amp;gt;
&amp;lt;global&amp;gt;
        &amp;lt;configuration&amp;gt;
            &amp;lt;property&amp;gt;
                &amp;lt;name&amp;gt;oozie.launcher.yarn.app.mapreduce.am.env&amp;lt;/name&amp;gt;
                &amp;lt;value&amp;gt;SPARK_HOME=/usr/hdp/2.4.0.0-169/spark/&amp;lt;/value&amp;gt;
            &amp;lt;/property&amp;gt;
        &amp;lt;/configuration&amp;gt;
&amp;lt;/global&amp;gt;
    &amp;lt;start to="spark-mongo-ETL"/&amp;gt;
    &amp;lt;action name="spark-mongo-ETL"&amp;gt;
        &amp;lt;spark xmlns="uri:oozie:spark-action:0.1"&amp;gt;
            &amp;lt;job-tracker&amp;gt;${jobTracker}&amp;lt;/job-tracker&amp;gt;
            &amp;lt;name-node&amp;gt;${nameNode}&amp;lt;/name-node&amp;gt;
             &amp;lt;master&amp;gt;yarn-cluster&amp;lt;/master&amp;gt;
            &amp;lt;mode&amp;gt;cluster&amp;lt;/mode&amp;gt;
            &amp;lt;name&amp;gt;SparkMongoLoading&amp;lt;/name&amp;gt;
            &amp;lt;class&amp;gt;com.SparkSqlExample&amp;lt;/class&amp;gt;
            &amp;lt;jar&amp;gt;${nameNode}${WORKFLOW_HOME}/lib/SparkParquetExample-0.0.1-SNAPSHOT.jar&amp;lt;/jar&amp;gt;
        &amp;lt;/spark&amp;gt;
        &amp;lt;ok to="End"/&amp;gt;
        &amp;lt;error to="killAction"/&amp;gt;
    &amp;lt;/action&amp;gt;
        &amp;lt;kill name="killAction"&amp;gt;
        &amp;lt;message&amp;gt;Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]&amp;lt;/message&amp;gt;
    &amp;lt;/kill&amp;gt;
    &amp;lt;end name="End"/&amp;gt;
&amp;lt;/workflow-app&amp;gt;
&lt;/PRE&gt;&lt;P&gt;Job.properties:&lt;/P&gt;&lt;PRE&gt;nameNode=hdfs://nameNode1:8020
jobTracker=yarnNM:8050
queueName=default
user.name=hadoop
oozie.libpath=/user/oozie/share/lib/
oozie.use.system.libpath=true
WORKFLOW_HOME=/user/hadoop/SparkETL
OOZIE_WF_NAME=Spark-Mongo-ETL-wf
SPARK_MONGO_JAR=${nameNode}${WORKFLOW_HOME}/lib/SparkParquetExample-0.0.1-SNAPSHOT.jar
oozie.wf.application.path=${nameNode}/user/hadoop/SparkETL/
&lt;/PRE&gt;&lt;P&gt;Two jars are placed under the lib folder:&lt;/P&gt;&lt;PRE&gt;SparkParquetExample-0.0.1-SNAPSHOT.jar
spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar&lt;/PRE&gt;&lt;P&gt;When I submitted the Oozie job, the action was killed.&lt;/P&gt;&lt;P&gt;Error:&lt;/P&gt;&lt;PRE&gt;Error: java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation
  at org.apache.hadoop.fs.FileSystem.getScheme(FileSystem.java:217)
  at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2624)
  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2634)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
  at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:342)
  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.confChanged(LocalDirAllocator.java:270)
  at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:432)
  at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:164)
  at org.apache.hadoop.mapred.YarnChild.configureLocalDirs(YarnChild.java:256)
  at org.apache.hadoop.mapred.YarnChild.configureTask(YarnChild.java:314)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:146)
&lt;/PRE&gt;&lt;P&gt;Also, let me know how to pass the jars and files explicitly in the workflow.&lt;/P&gt;&lt;P&gt;Command:&lt;/P&gt;&lt;PRE&gt;spark-submit --class com.SparkSqlExample --master yarn-cluster --num-executors 2 --driver-memory 1g --executor-memory 2g --executor-cores 2 --files /usr/hdp/current/spark-client/conf/hive-site.xml --jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar,/usr/hdp/current/spark-client/lib/jackson-core-2.4.4.jar,/usr/hdp/current/spark-client/lib/mongo-hadoop-spark-1.5.2.jar,/usr/share/java/slf4j-simple-1.7.5.jar,/usr/hdp/current/spark-client/lib/spark-core_2.10-1.6.0.jar,/usr/hdp/current/spark-client/lib/spark-hive_2.10-1.6.0.jar,/usr/hdp/current/spark-client/lib/spark-sql_2.10-1.6.0.jar,/usr/hdp/current/spark-client/lib/mongo-hadoop-core-1.5.2.jar,/usr/hdp/current/spark-client/lib/spark-avro_2.10-2.0.1.jar,/usr/hdp/current/spark-client/lib/spark-csv_2.10-1.4.0.jar,/usr/hdp/current/spark-client/lib/spark-mongodb_2.10-0.11.2.jar,/usr/hdp/current/spark-client/lib/spark-streaming_2.10-1.6.0.jar,/usr/hdp/current/spark-client/lib/commons-csv-1.1.jar,/usr/hdp/current/spark-client/lib/mongodb-driver-3.2.2.jar,/usr/hdp/current/spark-client/lib/mongo-hadoop-master-1.5.2.jar,/usr/hdp/current/spark-client/lib/mongo-java-driver-3.2.2.jar,/usr/hdp/current/spark-client/lib/spark-1.6.0.2.4.0.0-169-yarn-shuffle.jar --conf spark.yarn.jar=hdfs:///user/spark/spark-assembly-1.6.0.2.4.0.0-169-hadoop2.7.1.2.4.0.0-169.jar --conf spark.yarn.executor.memoryOverhead=512 /home/hadoop/SparkParquetExample-0.0.1-SNAPSHOT.jar&lt;/PRE&gt;&lt;P&gt;The above command executes successfully.&lt;/P&gt;&lt;P&gt;Can anyone suggest a solution?&lt;/P&gt;</description>
      <pubDate>Thu, 14 Jul 2016 10:47:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128540#M91237</guid>
      <dc:creator>vijaykumar243</dc:creator>
      <dc:date>2016-07-14T10:47:07Z</dc:date>
    </item>
    <item>
      <title>Re: Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128541#M91238</link>
      <description>&lt;P&gt;I'm not sure where the TFS bit comes from; it may be a dependency problem.&lt;/P&gt;&lt;P&gt;To include all dependencies in the workflow, I would recommend building a fat jar (assembly). For Scala with sbt, the idea is shown here: &lt;A href="https://community.hortonworks.com/content/kbentry/43886/creating-fat-jars-for-spark-kafka-streaming-using.html"&gt;Creating fat jars with sbt&lt;/A&gt;. The same works with Maven's "maven-assembly-plugin". You should be able to call your code as follows:&lt;/P&gt;
&lt;PRE&gt;spark-submit --master yarn-cluster \
--num-executors 2 --driver-memory 1g --executor-memory 2g --executor-cores 2 \
--class com.SparkSqlExample \
/home/hadoop/SparkParquetExample-0.0.1-SNAPSHOT-with-dependencies.jar
&lt;/PRE&gt;&lt;P&gt;If this works, the jar with dependencies should be the one in the oozie spark action.&lt;/P&gt;</description>
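[Editor's sketch] The fat-jar build recommended above can be reproduced roughly as follows; the plugin coordinates, version, and output jar name are assumptions to adapt to the actual build, not part of the original answer.

```shell
# project/plugins.sbt should contain the sbt-assembly plugin, e.g.
#   addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")   # version is an assumption
# Build the jar with all dependencies baked in:
sbt assembly
# The assembled jar (name/path assumed) is the one to reference in the
# Oozie spark action's jar element and in spark-submit:
spark-submit --master yarn-cluster --class com.SparkSqlExample \
  target/scala-2.10/SparkParquetExample-assembly-0.0.1-SNAPSHOT.jar
```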
      <pubDate>Thu, 14 Jul 2016 14:21:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128541#M91238</guid>
      <dc:creator>bwalter1</dc:creator>
      <dc:date>2016-07-14T14:21:14Z</dc:date>
    </item>
    <item>
      <title>Re: Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128542#M91239</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/452/bwalter.html" nodeid="452"&gt;@Bernhard Walter&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Thanks for the reply!&lt;/P&gt;&lt;P&gt;I followed your idea, but it is still throwing a different error.&lt;/P&gt;&lt;P&gt;Please help me.&lt;/P&gt;&lt;PRE&gt;diagnostics: Application application_1468279065782_0300 failed 2 times due to AM Container for appattempt_1468279065782_0300_000002 exited with  exitCode: -1000
  For more detailed output, check application tracking page:http://yarnNM:8088/cluster/app/application_1468279065782_0300Then, click on links to logs of each attempt.
  Diagnostics: Permission denied: user=hadoop, access=EXECUTE, inode="/user/yarn/.sparkStaging/application_1468279065782_0300/__spark_conf__1316069581048982381.zip":yarn:yarn:drwx------
  at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
  at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:259)
  at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:205)
  at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
  at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1771)
  at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:3866)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1076)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:843)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
&lt;/PRE&gt;</description>
      <pubDate>Fri, 15 Jul 2016 01:15:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128542#M91239</guid>
      <dc:creator>vijaykumar243</dc:creator>
      <dc:date>2016-07-15T01:15:16Z</dc:date>
    </item>
    <item>
      <title>Re: Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128543#M91240</link>
      <description>&lt;P&gt;Hi @&lt;A rel="user" href="https://community.cloudera.com/users/452/bwalter.html" nodeid="452"&gt;Bernhard Walter&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;In spite of creating the fat jar, the error below also occurred:&lt;/P&gt;&lt;PRE&gt;Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, org.apache.spark.util.Utils$.DEFAULT_DRIVER_MEM_MB()I
java.lang.NoSuchMethodError: org.apache.spark.util.Utils$.DEFAULT_DRIVER_MEM_MB()I
	at org.apache.spark.deploy.yarn.ClientArguments.&amp;lt;init&amp;gt;(ClientArguments.scala:49)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1120)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)&lt;/PRE&gt;</description>
      <pubDate>Fri, 15 Jul 2016 04:09:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128543#M91240</guid>
      <dc:creator>vijaykumar243</dc:creator>
      <dc:date>2016-07-15T04:09:18Z</dc:date>
    </item>
    <item>
      <title>Re: Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128544#M91241</link>
      <description>&lt;P&gt;It looks like you are executing the job as user hadoop; however, Spark wants to access staging data from /user/yarn (which can only be accessed by yarn). How did you start the job, and as which user?&lt;/P&gt;&lt;P&gt;I am surprised that Spark uses /user/yarn as the staging directory for user hadoop. Is there any staging directory configuration in your system (SPARK_YARN_STAGING_DIR)?&lt;/P&gt;</description>
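[Editor's sketch] The permission check suggested above could be carried out along these lines; the paths come from the error message in the earlier post, and the commands assume an HDFS client on the cluster with appropriate privileges.

```shell
# Inspect ownership of the staging directory the container failed to read
# (drwx------ yarn:yarn in the error means only the yarn user can traverse it):
hdfs dfs -ls -d /user/yarn/.sparkStaging
# If staging should happen under the submitting user instead, make sure that
# user's HDFS home directory exists and is owned correctly (run as an HDFS
# superuser; exact remedy depends on why /user/yarn was chosen):
hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -chown hadoop:hadoop /user/hadoop
```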
      <pubDate>Fri, 15 Jul 2016 18:44:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128544#M91241</guid>
      <dc:creator>bwalter1</dc:creator>
      <dc:date>2016-07-15T18:44:45Z</dc:date>
    </item>
    <item>
      <title>Re: Executing Spark action in Oozie using yarn cluster mode but getting an error java.lang.UnsupportedOperationException: Not implemented by the TFS FileSystem implementation</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128545#M91242</link>
      <description>&lt;P&gt;This might help: &lt;A href="https://community.hortonworks.com/questions/30288/oozie-spark-action-on-hdp-24-nosuchmethoderror-org.html" target="_blank"&gt;https://community.hortonworks.com/questions/30288/oozie-spark-action-on-hdp-24-nosuchmethoderror-org.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Jul 2016 18:46:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Executing-Spark-action-in-Oozie-using-yarn-cluster-mode-but/m-p/128545#M91242</guid>
      <dc:creator>bwalter1</dc:creator>
      <dc:date>2016-07-15T18:46:36Z</dc:date>
    </item>
  </channel>
</rss>

