<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to read table into Spark using the Hive tablename, not HDFS filename? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121570#M34311</link>
    <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/44407/how-to-read-table-into-spark-using-the-hive-tablen.html#"&gt;@Greg Polanchyck&lt;/A&gt; if you have an existing ORC table in the Hive metastore, and you want to load the whole table into a Spark DataFrame, you can use the sql method on the hiveContext to run:&lt;/P&gt;&lt;PRE&gt;val test_enc_orc = hiveContext.sql("select * from test_enc_orc")&lt;/PRE&gt;</description>
    <pubDate>Mon, 11 Jul 2016 05:02:37 GMT</pubDate>
    <dc:creator>slachterman</dc:creator>
    <dc:date>2016-07-11T05:02:37Z</dc:date>
    <item>
      <title>How to read table into Spark using the Hive tablename, not HDFS filename?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121569#M34310</link>
      <description>&lt;P&gt;I successfully worked through Tutorial-400 (Using Hive with ORC from Apache Spark). But what I would really like to do is read established Hive ORC tables into Spark without having to know the HDFS path and filenames. I created an ORC table in Hive, then ran the following commands from the tutorial in Scala, but from the exception it appears that the read/load expects an HDFS filename. How do I read directly from the Hive table, not from HDFS? I searched, but could not find an existing answer.&lt;/P&gt;&lt;P&gt;Thanks much!&lt;/P&gt;&lt;P&gt;-Greg&lt;/P&gt;&lt;PRE&gt;hive&amp;gt; create table test_enc_orc stored as ORC as select * from test_enc;
hive&amp;gt; select count(*) from test_enc_orc; 
OK 
10

spark-shell --master yarn-client --driver-memory 512m --executor-memory 512m
import org.apache.spark.sql.hive.orc._
import org.apache.spark.sql._
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val test_enc_orc = hiveContext.read.format("orc").load("test_enc_orc")

java.io.FileNotFoundException: File does not exist: 
hdfs://sandbox.hortonworks.com:8020/user/xxxx/test_enc_orc
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1319)
&lt;/PRE&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:29:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121569#M34310</guid>
      <dc:creator>gpolanch</dc:creator>
      <dc:date>2022-09-16T10:29:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to read table into Spark using the Hive tablename, not HDFS filename?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121570#M34311</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/44407/how-to-read-table-into-spark-using-the-hive-tablen.html#"&gt;@Greg Polanchyck&lt;/A&gt; if you have an existing ORC table in the Hive metastore, and you want to load the whole table into a Spark DataFrame, you can use the sql method on the hiveContext to run:&lt;/P&gt;&lt;PRE&gt;val test_enc_orc = hiveContext.sql("select * from test_enc_orc")&lt;/PRE&gt;</description>
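      <!--
      A minimal end-to-end sketch of the approach in the answer above, assuming the Spark 1.x shell from the thread (where `sc` is the predefined SparkContext) and the `test_enc_orc` table from the question:

      ```scala
      // Spark 1.x: build a HiveContext on the existing SparkContext so Spark
      // can resolve table names through the Hive metastore.
      import org.apache.spark.sql.hive.HiveContext

      val hiveContext = new HiveContext(sc)

      // Load the whole table by its Hive name; no HDFS path is required.
      val test_enc_orc = hiveContext.sql("select * from test_enc_orc")
      test_enc_orc.count()
      ```

      This requires a live Hive metastore, so no expected output is shown.
      -->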
      <pubDate>Mon, 11 Jul 2016 05:02:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121570#M34311</guid>
      <dc:creator>slachterman</dc:creator>
      <dc:date>2016-07-11T05:02:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to read table into Spark using the Hive tablename, not HDFS filename?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121571#M34312</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/11295/slachterman.html"&gt;slachterman&lt;/A&gt; Thank you very much! That worked well! -Greg&lt;/P&gt;</description>
      <pubDate>Mon, 11 Jul 2016 10:03:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121571#M34312</guid>
      <dc:creator>gpolanch</dc:creator>
      <dc:date>2016-07-11T10:03:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to read table into Spark using the Hive tablename, not HDFS filename?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121572#M34313</link>
      <description>&lt;P&gt;I am also having the same problem; it gives this error:&lt;/P&gt;&lt;P&gt;INFO PerfLogger: &amp;lt;/PERFLOG method=OrcGetSplits start=1492763204120 end=1492763204592 duration=472 from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl&amp;gt; &lt;/P&gt;&lt;P&gt;Exception in thread "main" java.util.NoSuchElementException: next on empty iterator&lt;/P&gt;&lt;P&gt;at scala.collection.Iterator$anon$2.next(Iterator.scala:39)
at scala.collection.Iterator$anon$2.next(Iterator.scala:37)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:64)
at scala.collection.IterableLike$class.head(IterableLike.scala:91)
at scala.collection.mutable.ArrayOps$ofRef.scala$collection$IndexedSeqOptimized$super$head(ArrayOps.scala:108)
at scala.collection.IndexedSeqOptimized$class.head(IndexedSeqOptimized.scala:120)
at scala.collection.mutable.ArrayOps$ofRef.head(ArrayOps.scala:108)
at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1422)
at org.apache.spark.sql.DataFrame.first(DataFrame.scala:1429)
at com.apollobit.jobs.TestData$.main(TestData.scala:32)
at com.apollobit.jobs.TestData.main(TestData.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)&lt;/P&gt;&lt;P&gt;Can anybody please help?&lt;/P&gt;</description>
      <pubDate>Fri, 21 Apr 2017 20:03:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121572#M34313</guid>
      <dc:creator>tusharn184</dc:creator>
      <dc:date>2017-04-21T20:03:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to read table into Spark using the Hive tablename, not HDFS filename?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121573#M34314</link>
      <description>&lt;P&gt;I like this better:&lt;/P&gt;&lt;PRE&gt;val test_enc_orc = hiveContext.table("test_enc_orc")&lt;/PRE&gt;</description>
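      <!--
      A side note on the `.table` call shown here: in Spark 2.x and later, HiveContext is superseded by SparkSession, which spark-shell predefines as `spark`. A sketch of the equivalent calls, assuming Hive support is enabled:

      ```scala
      // Spark 2.x+: the SparkSession (bound to `spark` in spark-shell)
      // replaces both SQLContext and HiveContext.
      val byName = spark.table("test_enc_orc")              // same as hiveContext.table(...)
      val bySql  = spark.sql("select * from test_enc_orc")  // same as hiveContext.sql(...)
      ```

      Both resolve the table name through the Hive metastore, so they need a running cluster and are shown without expected output.
      -->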
      <pubDate>Thu, 27 Jul 2017 21:45:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-read-table-into-Spark-using-the-Hive-tablename-not/m-p/121573#M34314</guid>
      <dc:creator>eptakaktak</dc:creator>
      <dc:date>2017-07-27T21:45:16Z</dc:date>
    </item>
  </channel>
</rss>

