<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question org.apache.commons.vfs.FileNotFoundException spoon map reduce failure on cdh5 centos in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/19736#M3181</link>
    <description>&lt;P&gt;I'm trying to run a simple map/reduce job from Spoon 5.1 against a CentOS 6 CDH 5 cluster. My map tasks are failing with the following error.&lt;BR /&gt;I assume it is memory related; I just wondered whether anyone else had encountered this error?&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #008000;"&gt;org.apache.commons.vfs.FileNotFoundException: Could not read from&lt;BR /&gt;"file:///yarn/nm/usercache/mikejf12/appcache/application_1412471201309_0002/container_1412471201309_0002_01_000002/job.jar"&lt;BR /&gt;because it is a not a file.&lt;BR /&gt;at org.apache.commons.vfs.provider.AbstractFileObject.getInputStream(Unknown Source)&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 09:09:09 GMT</pubDate>
    <dc:creator>mikejf</dc:creator>
    <dc:date>2022-09-16T09:09:09Z</dc:date>
    <item>
      <title>org.apache.commons.vfs.FileNotFoundException spoon map reduce failure on cdh5 centos</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/19736#M3181</link>
      <description>&lt;P&gt;I'm trying to run a simple map/reduce job from Spoon 5.1 against a CentOS 6 CDH 5 cluster. My map tasks are failing with the following error.&lt;BR /&gt;I assume it is memory related; I just wondered whether anyone else had encountered this error?&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #008000;"&gt;org.apache.commons.vfs.FileNotFoundException: Could not read from&lt;BR /&gt;"file:///yarn/nm/usercache/mikejf12/appcache/application_1412471201309_0002/container_1412471201309_0002_01_000002/job.jar"&lt;BR /&gt;because it is a not a file.&lt;BR /&gt;at org.apache.commons.vfs.provider.AbstractFileObject.getInputStream(Unknown Source)&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:09:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/19736#M3181</guid>
      <dc:creator>mikejf</dc:creator>
      <dc:date>2022-09-16T09:09:09Z</dc:date>
    </item>
    <item>
      <title>Re: org.apache.commons.vfs.FileNotFoundException spoon map reduce failure on cdh5 centos</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/19790#M3182</link>
      <description>&lt;P&gt;In case any Kettle / Spoon users look this error up: it was caused by an incorrect data type being set on the map/reduce mapper output value.&lt;BR /&gt;&lt;BR /&gt;This was not clear from the Hadoop-based error message!&lt;/P&gt;</description>
      <pubDate>Tue, 07 Oct 2014 06:12:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/19790#M3182</guid>
      <dc:creator>mikejf</dc:creator>
      <dc:date>2014-10-07T06:12:35Z</dc:date>
    </item>
    <item>
      <title>Re: org.apache.commons.vfs.FileNotFoundException spoon map reduce failure on cdh5 centos</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/23572#M3183</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have CDH 5.2 on CentOS and am using PDI 5.2 (Kettle) on one of the nodes. I'm following the example job that Pentaho provided at &lt;A target="_blank" href="http://wiki.pentaho.com/display/BAD/Using+Pentaho+MapReduce+to+Parse+Weblog+Data."&gt;http://wiki.pentaho.com/display/BAD/Using+Pentaho+MapReduce+to+Parse+Weblog+Data.&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I used the command below to see the log detail, since the job failed without a clear error message:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;[daniel@n1 hadoop-yarn]$ yarn logs -applicationId application_1420841940959_0005&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I see that I'm getting the same error message as you did:&lt;/P&gt;&lt;P&gt;org.apache.commons.vfs.FileNotFoundException: Could not read from "file:///yarn/nm/usercache/daniel/appcache/application_1420841940959_0005/container_1420841940959_0005_01_000002/job.jar" because it is a not a file.&lt;BR /&gt;at org.apache.commons.vfs.provider.AbstractFileObject.getInputStream(Unknown Source)&lt;BR /&gt;at org.apache.commons.vfs.provider.DefaultFileContent.getInputStream(Unknown Source)&lt;BR /&gt;at org.apache.commons.vfs.provider.DefaultURLConnection.getInputStream(Unknown Source)&lt;BR /&gt;at java.net.URL.openStream(URL.java:1037)&lt;BR /&gt;at org.scannotation.archiveiterator.IteratorFactory.create(IteratorFactory.java:34)&lt;BR /&gt;at org.scannotation.AnnotationDB.scanArchives(AnnotationDB.java:291)&lt;BR /&gt;at org.pentaho.di.core.plugins.JarFileCache.getAnnotationDB(JarFileCache.java:58)&lt;BR /&gt;at org.pentaho.di.core.plugins.BasePluginType.findAnnotatedClassFiles(BasePluginType.java:258)&lt;BR /&gt;at org.pentaho.di.core.plugins.BasePluginType.registerPluginJars(BasePluginType.java:555)&lt;BR /&gt;at 
org.pentaho.di.core.plugins.BasePluginType.searchPlugins(BasePluginType.java:119)&lt;BR /&gt;at org.pentaho.di.core.plugins.PluginRegistry.registerType(PluginRegistry.java:570)&lt;BR /&gt;at org.pentaho.di.core.plugins.PluginRegistry.init(PluginRegistry.java:525)&lt;BR /&gt;at org.pentaho.di.core.KettleClientEnvironment.init(KettleClientEnvironment.java:96)&lt;BR /&gt;at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:91)&lt;BR /&gt;at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:69)&lt;BR /&gt;at org.pentaho.hadoop.mapreduce.MRUtil.initKettleEnvironment(MRUtil.java:107)&lt;BR /&gt;at org.pentaho.hadoop.mapreduce.MRUtil.getTrans(MRUtil.java:66)&lt;BR /&gt;at org.pentaho.hadoop.mapreduce.PentahoMapRunnable.createTrans(PentahoMapRunnable.java:221)&lt;BR /&gt;at org.pentaho.hadoop.mapreduce.PentahoMapRunnable.configure(PentahoMapRunnable.java:193)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt;at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)&lt;BR /&gt;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt;at java.lang.reflect.Method.invoke(Method.java:606)&lt;BR /&gt;at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)&lt;BR /&gt;at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)&lt;BR /&gt;at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)&lt;BR /&gt;at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)&lt;BR /&gt;at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)&lt;BR /&gt;at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)&lt;BR /&gt;at java.security.AccessController.doPrivileged(Native Method)&lt;BR /&gt;at javax.security.auth.Subject.doAs(Subject.java:415)&lt;BR /&gt;at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)&lt;BR /&gt;at 
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)&lt;BR /&gt;Caused by: java.io.FileNotFoundException: /yarn/nm/usercache/daniel/appcache/application_1420841940959_0005/container_1420841940959_0005_01_000002/job.jar (Is a directory)&lt;BR /&gt;at java.io.FileInputStream.open(Native Method)&lt;BR /&gt;at java.io.FileInputStream.&amp;lt;init&amp;gt;(FileInputStream.java:146)&lt;BR /&gt;at org.apache.commons.vfs.provider.local.LocalFile.doGetInputStream(Unknown Source)&lt;BR /&gt;... 33 more&lt;/P&gt;&lt;P&gt;Would you please share a little more detail on what you did in Kettle to make it run successfully? Were you referring to the "Mapper Input Step Name" of the "Pentaho Map Reduce" job? If so, what did you put in that field?&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Daniel&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jan 2015 00:57:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/23572#M3183</guid>
      <dc:creator>interlee</dc:creator>
      <dc:date>2015-01-12T00:57:25Z</dc:date>
    </item>
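The root cause at the bottom of the stack trace above is a plain `java.io.FileNotFoundException: ... job.jar (Is a directory)`: YARN localizes `job.jar` as an unpacked directory, and commons-vfs then tries to read it with a `FileInputStream`. That underlying behaviour can be reproduced with the JDK alone; this is a minimal sketch, and the class name and temp path below are illustrative, not from the thread.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class DirAsFile {

    // Try to open a path the way commons-vfs LocalFile does (via FileInputStream);
    // returns the exception message if opening fails, or null if it opens cleanly.
    static String tryOpen(File f) {
        try (FileInputStream in = new FileInputStream(f)) {
            return null;
        } catch (FileNotFoundException e) {
            return e.getMessage(); // on Linux, e.g. "...job.jar-demo (Is a directory)"
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Simulate the YARN container layout: "job.jar" exists on disk,
        // but as an unpacked directory rather than an actual jar file.
        File dir = new File(System.getProperty("java.io.tmpdir"), "job.jar-demo");
        dir.mkdirs();
        System.out.println(tryOpen(dir));
        dir.delete();
    }
}
```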
    <item>
      <title>Re: org.apache.commons.vfs.FileNotFoundException spoon map reduce failure on cdh5 centos</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/23576#M3184</link>
      <description>&lt;P&gt;As my comment above says, this error was caused for me by specifying the wrong data types for the key and value in the map/reduce job. In my case the key needed to be a string and the value needed to be an integer.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jan 2015 05:40:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-commons-vfs-FileNotFoundException-spoon-map/m-p/23576#M3184</guid>
      <dc:creator>mikejf</dc:creator>
      <dc:date>2015-01-12T05:40:40Z</dc:date>
    </item>
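For readers hitting the same symptom in plain Hadoop MapReduce rather than through the Spoon UI: the fix described in this thread corresponds to declaring the mapper's output key/value classes consistently in the job driver. This is a hedged configuration fragment against the standard `org.apache.hadoop.mapreduce` API, not Kettle code; it needs `hadoop-client` on the classpath, and the class name is illustrative.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class DriverTypes {
    public static Job configure() throws IOException {
        Job job = Job.getInstance();
        // The declared map output types must match what the mapper actually
        // emits; a mismatch can surface as an opaque task failure like the
        // one in this thread. Mirroring the fix above: key = string-like
        // Text, value = IntWritable.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        return job;
    }
}
```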
  </channel>
</rss>

