<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to implement &quot;connect By&quot; of ORACLE in Hive ? OR create Hierarchie in HIVE . in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158285#M49105</link>
    <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/14740/sampatbudankayala.html"&gt;Sampat Budankayala&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Of course, you can use a custom UDF, but UDFs like this are not part of Hive core and their performance is not guaranteed, especially for such an expensive operation on a big data set. There is a reason hierarchical queries are not built in: iterative and recursive problems are not well suited to MapReduce because tasks do not share state or coordinate with each other.&lt;/P&gt;&lt;P&gt;If you still want to go down this path, in a few words: build the jar, deploy it to the Hive auxiliary libraries folder or to HDFS, then create a permanent or temporary function that you can invoke in your SQL. Follow the steps described here: &lt;A href="https://dzone.com/articles/writing-custom-hive-udf-andudaf"&gt;https://dzone.com/articles/writing-custom-hive-udf-andudaf&lt;/A&gt;. Look also at this: &lt;A href="https://community.hortonworks.com/articles/39980/creating-a-hive-udf-in-java.html"&gt;https://community.hortonworks.com/articles/39980/creating-a-hive-udf-in-java.html&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;You would follow similar steps with the code you found, even though it is in Scala. I am not aware of similar code implemented in Java, but it must exist.&lt;/P&gt;</description>
    <pubDate>Tue, 20 Dec 2016 23:19:30 GMT</pubDate>
    <dc:creator>cstanca</dc:creator>
    <dc:date>2016-12-20T23:19:30Z</dc:date>
    <item>
      <title>How to implement "connect By" of ORACLE in Hive ? OR create Hierarchie in HIVE .</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158282#M49102</link>
      <description />
      <pubDate>Fri, 16 Dec 2016 18:34:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158282#M49102</guid>
      <dc:creator>sampyyy</dc:creator>
      <dc:date>2016-12-16T18:34:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to implement "connect By" of ORACLE in Hive ? OR create Hierarchie in HIVE .</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158283#M49103</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/14740/sampatbudankayala.html"&gt;Sampat Budankayala&lt;/A&gt;&lt;/P&gt;&lt;P&gt;As you already know, Hive does not support hierarchical queries such as &lt;STRONG&gt;connect by&lt;/STRONG&gt;. The bad news is that this is the general situation with similar tools in the Hadoop ecosystem.&lt;/P&gt;&lt;P&gt;Joins work if you know the number of levels, but the query gets quite ugly.&lt;/P&gt;&lt;P&gt;If you need hierarchical queries against databases that don't support Recursive Subquery Factoring, one very low-tech alternative is to encode the hierarchy into a separate key. Of course, this will only work if you can control the table update process and rewrite the key following parent updates.&lt;/P&gt;&lt;P&gt;Another option is to take the hierarchical data and import it into an RDBMS suited for &lt;STRONG&gt;connect by&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;***&lt;/P&gt;&lt;P&gt;If this response helped, please vote/accept best answer.&lt;/P&gt;</description>
      <pubDate>Sat, 17 Dec 2016 09:16:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158283#M49103</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2016-12-17T09:16:44Z</dc:date>
    </item>
    <item>
      <title>Re: How to implement "connect By" of ORACLE in Hive ? OR create Hierarchie in HIVE .</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158284#M49104</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3486/cstanca.html" nodeid="3486"&gt;@Constantin Stanca&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I came across a blog post that uses Scala code to generate hierarchical data in Hive with a UDTF.&lt;/P&gt;&lt;P&gt;The source code is below, but I am not sure how to execute or implement it.&lt;/P&gt;&lt;P&gt;&lt;PRE&gt;&lt;CODE&gt;import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF
import org.apache.hadoop.hive.serde2.objectinspector.{ObjectInspector, ObjectInspectorFactory, PrimitiveObjectInspector, StructObjectInspector}
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory
import scala.collection.mutable

class ExpandTree2UDTF extends GenericUDTF {
  var inputOIs: Array[PrimitiveObjectInspector] = null
  // Maps each id to its optional parent id, collected from every input row.
  val tree: mutable.Map[String, Option[String]] = mutable.Map()

  override def initialize(args: Array[ObjectInspector]): StructObjectInspector = {
    inputOIs = args.map(_.asInstanceOf[PrimitiveObjectInspector])
    val fieldNames = java.util.Arrays.asList("id", "ancestor", "level")
    val fieldOI = PrimitiveObjectInspectorFactory.javaStringObjectInspector.asInstanceOf[ObjectInspector]
    val fieldOIs = java.util.Arrays.asList(fieldOI, fieldOI, fieldOI)
    ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs)
  }

  // Buffer each (id, parent) pair; nothing is emitted until close().
  override def process(record: Array[Object]): Unit = {
    val id = inputOIs(0).getPrimitiveJavaObject(record(0)).asInstanceOf[String]
    val parent = Option(inputOIs(1).getPrimitiveJavaObject(record(1)).asInstanceOf[String])
    tree += (id -&amp;gt; parent)
  }

  // Emit one (id, ancestor, level) row per ancestor, memoizing ancestor lists.
  override def close(): Unit = {
    val expandTree = mutable.Map[String, List[String]]()
    def calculateAncestors(id: String): List[String] =
      tree(id) match { case Some(parent) =&amp;gt; id :: getAncestors(parent); case None =&amp;gt; List(id) }
    def getAncestors(id: String): List[String] = expandTree.getOrElseUpdate(id, calculateAncestors(id))
    tree.keys.foreach { id =&amp;gt;
      getAncestors(id).zipWithIndex.foreach { case (ancestor, level) =&amp;gt;
        // All three output columns are declared as strings, so level is sent as a string.
        forward(Array[Object](id, ancestor, level.toString))
      }
    }
  }
}&lt;/CODE&gt;&lt;/PRE&gt;
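&lt;/P&gt;&lt;P&gt;For intuition, here is a minimal sketch of what the class above appears to compute, written in plain Python rather than Scala so it can be run without Hive. The sample ids and the name expand_tree are made up for illustration and are not part of the blog code. As far as I can tell, process() only buffers every (id, parent) pair, and close() then walks up from each id and emits one (id, ancestor, level) row per step.&lt;/P&gt;&lt;PRE&gt;
```python
# Hypothetical sample data: child id mapped to parent id (None marks the root).
tree = {"ceo": None, "vp": "ceo", "dev": "vp"}


def expand_tree(tree):
    """Return (id, ancestor, level) rows, mirroring what close() forwards."""
    cache = {}  # memoized ancestor chains, like expandTree in the Scala code

    def ancestors(node):
        if node not in cache:
            parent = tree[node]
            # A node is its own level-0 ancestor; the parent's chain follows.
            cache[node] = [node] + (ancestors(parent) if parent is not None else [])
        return cache[node]

    rows = []
    for node in tree:
        for level, ancestor in enumerate(ancestors(node)):
            rows.append((node, ancestor, level))
    return rows


rows = expand_tree(tree)
for row in rows:
    print(row)
```
&lt;/PRE&gt;&lt;P&gt;Each id is paired with itself at level 0, then with its parent at level 1, and so on up to the root. Like the Scala version, this sketch assumes every parent id also appears as an id in the input.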
&lt;/P&gt;</description>
      <pubDate>Mon, 19 Dec 2016 21:22:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158284#M49104</guid>
      <dc:creator>sampyyy</dc:creator>
      <dc:date>2016-12-19T21:22:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to implement "connect By" of ORACLE in Hive ? OR create Hierarchie in HIVE .</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158285#M49105</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/14740/sampatbudankayala.html"&gt;Sampat Budankayala&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Of course, you can use a custom UDF, but UDFs like this are not part of Hive core and their performance is not guaranteed, especially for such an expensive operation on a big data set. There is a reason hierarchical queries are not built in: iterative and recursive problems are not well suited to MapReduce because tasks do not share state or coordinate with each other.&lt;/P&gt;&lt;P&gt;If you still want to go down this path, in a few words: build the jar, deploy it to the Hive auxiliary libraries folder or to HDFS, then create a permanent or temporary function that you can invoke in your SQL. Follow the steps described here: &lt;A href="https://dzone.com/articles/writing-custom-hive-udf-andudaf"&gt;https://dzone.com/articles/writing-custom-hive-udf-andudaf&lt;/A&gt;. Look also at this: &lt;A href="https://community.hortonworks.com/articles/39980/creating-a-hive-udf-in-java.html"&gt;https://community.hortonworks.com/articles/39980/creating-a-hive-udf-in-java.html&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;You would follow similar steps with the code you found, even though it is in Scala. I am not aware of similar code implemented in Java, but it must exist.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Dec 2016 23:19:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158285#M49105</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2016-12-20T23:19:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to implement "connect By" of ORACLE in Hive ? OR create Hierarchie in HIVE .</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158286#M49106</link>
      <description>&lt;P&gt;If Constantin's awesome answer helped you, please accept the answer to close this thread; otherwise, provide your solution or follow-up questions for more clarity.&lt;/P&gt;</description>
      <pubDate>Wed, 21 Dec 2016 20:51:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158286#M49106</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-12-21T20:51:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to implement "connect By" of ORACLE in Hive ? OR create Hierarchie in HIVE .</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158287#M49107</link>
      <description>&lt;P&gt;That is the way to go&lt;/P&gt;</description>
      <pubDate>Sat, 24 Dec 2016 12:47:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-implement-quot-connect-By-quot-of-ORACLE-in-Hive-OR/m-p/158287#M49107</guid>
      <dc:creator>TimothySpann</dc:creator>
      <dc:date>2016-12-24T12:47:29Z</dc:date>
    </item>
  </channel>
</rss>

