Member since: 06-14-2017 · Posts: 4 · Kudos Received: 0 · Solutions: 0
08-10-2017
10:06 AM
I am able to read files from HDFS directly; the problem was with the Hive table alone.
08-10-2017
09:25 AM
I am able to read it now. I repaired the table, and it is working fine. Thanks.
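The repair step is not spelled out above; in Hive this is typically done with MSCK REPAIR TABLE, which re-syncs the metastore's partition metadata with the files actually present in HDFS. The table and database names below are taken from the original post, but the exact command the poster ran is an assumption:

```sql
-- Hedged sketch: re-register partitions from what actually exists in HDFS.
-- Database/table names (temp.test2) are from the thread; the specific
-- repair command used by the poster is assumed.
MSCK REPAIR TABLE temp.test2;

-- Confirm the partition list afterwards
SHOW PARTITIONS temp.test2;
```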
08-09-2017
12:25 PM
I am afraid that, since I can't read the data from Hive into a DataFrame, I don't see how I can save the data into a table.
08-09-2017
11:49 AM
I have a Hive table, and its data is in the following location:
/apps/hive/warehouse/temp.db/test2/c5=56/000000_0
When I query the Hive table in Spark, I get a java.io.FileNotFoundException. Here is the log:
Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://si1/apps/hive/warehouse/temp.db/test1/c5=56/part-00000 does not exist.
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
... 103 more
Caused by: java.io.FileNotFoundException: File hdfs://si1/apps/hive/warehouse/temp.db/test1/c5=56/part-00000 does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1062)
at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1040)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:985)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:981)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:981)
at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1713)
at org.apache.hadoop.hive.shims.Hadoop23Shims.listLocatedStatus(Hadoop23Shims.java:667)
at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:361)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:620)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
I found the problem: Spark is looking for part files named part-*, whereas the file actually in that location starts with 000000_0.
Spark version: 1.6.2
HDP version: 2.5.0.0-1245
Please advise.
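One way to confirm this kind of mismatch between the metastore and HDFS (a hedged sketch; paths and names are taken from the post above) is to compare what Hive's metadata lists against the files actually present under the partition directory. Both commands below are run from the Hive CLI or beeline:

```sql
-- What partitions does the metastore know about for this table?
SHOW PARTITIONS temp.test2;

-- List the files actually under the partition directory in HDFS
-- (the Hive CLI's built-in dfs command):
dfs -ls /apps/hive/warehouse/temp.db/test2/c5=56/;
```

If the metastore references part-* files while the directory holds 000000_0 files, the partition metadata is stale and repairing the table brings the two back in sync.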
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark