01-11-2015 04:57 PM
Hello,

I have CDH 5.2 on CentOS and am using PDI 5.2 (Kettle) on one of the nodes. I'm following the example job that Pentaho provides at http://wiki.pentaho.com/display/BAD/Using+Pentaho+MapReduce+to+Parse+Weblog+Data.

The job failed without a clear error message, so I used the command below to see the log detail:

[daniel@n1 hadoop-yarn]$ yarn logs -applicationId application_1420841940959_0005

I see the same error message that you reported:

org.apache.commons.vfs.FileNotFoundException: Could not read from "file:///yarn/nm/usercache/daniel/appcache/application_1420841940959_0005/container_1420841940959_0005_01_000002/job.jar" because it is a not a file.
    at org.apache.commons.vfs.provider.AbstractFileObject.getInputStream(Unknown Source)
    at org.apache.commons.vfs.provider.DefaultFileContent.getInputStream(Unknown Source)
    at org.apache.commons.vfs.provider.DefaultURLConnection.getInputStream(Unknown Source)
    at java.net.URL.openStream(URL.java:1037)
    at org.scannotation.archiveiterator.IteratorFactory.create(IteratorFactory.java:34)
    at org.scannotation.AnnotationDB.scanArchives(AnnotationDB.java:291)
    at org.pentaho.di.core.plugins.JarFileCache.getAnnotationDB(JarFileCache.java:58)
    at org.pentaho.di.core.plugins.BasePluginType.findAnnotatedClassFiles(BasePluginType.java:258)
    at org.pentaho.di.core.plugins.BasePluginType.registerPluginJars(BasePluginType.java:555)
    at org.pentaho.di.core.plugins.BasePluginType.searchPlugins(BasePluginType.java:119)
    at org.pentaho.di.core.plugins.PluginRegistry.registerType(PluginRegistry.java:570)
    at org.pentaho.di.core.plugins.PluginRegistry.init(PluginRegistry.java:525)
    at org.pentaho.di.core.KettleClientEnvironment.init(KettleClientEnvironment.java:96)
    at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:91)
    at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:69)
    at org.pentaho.hadoop.mapreduce.MRUtil.initKettleEnvironment(MRUtil.java:107)
    at org.pentaho.hadoop.mapreduce.MRUtil.getTrans(MRUtil.java:66)
    at org.pentaho.hadoop.mapreduce.PentahoMapRunnable.createTrans(PentahoMapRunnable.java:221)
    at org.pentaho.hadoop.mapreduce.PentahoMapRunnable.configure(PentahoMapRunnable.java:193)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.FileNotFoundException: /yarn/nm/usercache/daniel/appcache/application_1420841940959_0005/container_1420841940959_0005_01_000002/job.jar (Is a directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at org.apache.commons.vfs.provider.local.LocalFile.doGetInputStream(Unknown Source)
    ... 33 more

Would you please share a little more detail on what you did in Kettle to make the job run successfully? Were you referring to the "Mapper Input Step Name" field of the "Pentaho Map Reduce" job entry? If so, what did you enter in that field?

Thank you,
Daniel