Support Questions


How to implement Oracle's "CONNECT BY" in Hive, or create a hierarchy in Hive?

Explorer
 

5 REPLIES

Super Guru

@Sampat Budankayala

As you already know, Hive does not support hierarchical query constructs such as CONNECT BY. Unfortunately, this is the general situation across similar tools in the Hadoop ecosystem.

A join-based approach works if you know the number of levels in advance, but the query gets quite ugly.
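
For illustration, here is a minimal sketch of that join approach, assuming a hypothetical table emp(emp_id, mgr_id, name) and a hierarchy at most three levels deep; each extra level costs another self-join:

-- Fixed-depth hierarchy walk with explicit self-joins (one LEFT JOIN per level).
-- emp(emp_id, mgr_id, name) is a placeholder table.
SELECT e1.emp_id,
       e1.name AS employee,
       e2.name AS manager_level_1,
       e3.name AS manager_level_2,
       e4.name AS manager_level_3
FROM emp e1
LEFT JOIN emp e2 ON e1.mgr_id = e2.emp_id
LEFT JOIN emp e3 ON e2.mgr_id = e3.emp_id
LEFT JOIN emp e4 ON e3.mgr_id = e4.emp_id;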

If you need hierarchical queries against databases that don't support Recursive Subquery Factoring, one very low-tech alternative is to encode the hierarchy into a separate key. Of course, this only works if you control the table update process and can rewrite the key whenever a parent changes.
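
As a rough sketch of that encoding, assume a hypothetical table categories where path_key stores the full ancestor chain (for example '/1/4/9/' for node 9 under 4 under 1); any subtree lookup then reduces to a prefix match:

-- All descendants of node 4, at any depth, via the encoded key.
SELECT c.*
FROM categories c
WHERE c.path_key LIKE '/1/4/%';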

Your other option is to take the hierarchical data and import it into an RDBMS that supports CONNECT BY.
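
For reference, once the data sits in such an RDBMS (Oracle here), the traversal over the same hypothetical emp table becomes a single hierarchical query:

-- Oracle CONNECT BY over emp(emp_id, mgr_id, name), starting from the root rows.
SELECT emp_id, name, LEVEL
FROM emp
START WITH mgr_id IS NULL
CONNECT BY PRIOR emp_id = mgr_id;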

***

If this response helped, please vote for it and/or accept it as the best answer.

Explorer

@Constantin Stanca

I have come across a blog that uses Scala code to generate hierarchical data in Hive with a UDTF.

Below is the source code I found, but I am not sure how to build or execute it.

import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF
import org.apache.hadoop.hive.serde2.objectinspector.{ObjectInspector, ObjectInspectorFactory, PrimitiveObjectInspector, StructObjectInspector}
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory

// UDTF that buffers (id, parent) pairs and, on close, emits one row per
// (id, ancestor, level) combination, i.e. the expanded hierarchy.
class ExpandTree2UDTF extends GenericUDTF {
  var inputOIs: Array[PrimitiveObjectInspector] = null
  val tree: collection.mutable.Map[String, Option[String]] = collection.mutable.Map()

  override def initialize(args: Array[ObjectInspector]): StructObjectInspector = {
    inputOIs = args.map { _.asInstanceOf[PrimitiveObjectInspector] }
    val fieldNames = java.util.Arrays.asList("id", "ancestor", "level")
    val fieldOI = PrimitiveObjectInspectorFactory.javaStringObjectInspector.asInstanceOf[ObjectInspector]
    val fieldOIs = java.util.Arrays.asList(fieldOI, fieldOI, fieldOI)
    ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs)
  }

  // Called once per input row: remember the child -> parent relationship.
  override def process(record: Array[Object]): Unit = {
    val id = inputOIs(0).getPrimitiveJavaObject(record(0)).asInstanceOf[String]
    val parent = Option(inputOIs(1).getPrimitiveJavaObject(record(1)).asInstanceOf[String])
    tree += (id -> parent)
  }

  // Called after the last row: walk each node's ancestor chain and forward the rows.
  override def close(): Unit = {
    val expandTree = collection.mutable.Map[String, List[String]]()
    def calculateAncestors(id: String): List[String] = tree(id) match {
      case Some(parent) => id :: getAncestors(parent)
      case None => List(id)
    }
    def getAncestors(id: String): List[String] = expandTree.getOrElseUpdate(id, calculateAncestors(id))
    tree.keys.foreach { id =>
      getAncestors(id).zipWithIndex.foreach { case (ancestor, level) =>
        // level is forwarded as a string to match the declared output schema
        forward(Array(id, ancestor, level.toString))
      }
    }
  }
}

Super Guru

@Sampat Budankayala

Of course, you can use a custom UDF, but custom UDFs are not part of Hive core and their performance is not guaranteed, especially for such an expensive operation on a big data set. There is a reason this is not built in: iterative and recursive problems are not well suited to MapReduce, because tasks do not share state or coordinate with each other.

If you still want to go down this path, in short: build the jar and deploy it to Hive's auxiliary libraries folder or to HDFS, then create a permanent or temporary function which you can invoke in your SQL. Follow the steps described here: https://dzone.com/articles/writing-custom-hive-udf-andudaf. Also look at this: https://community.hortonworks.com/articles/39980/creating-a-hive-udf-in-java.html.

You would follow similar steps with the code you found, even though it is in Scala. I am not aware of an equivalent implementation in Java, but one probably exists.
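
For instance, here is a minimal sketch of wiring up the class you posted, where the jar name, HDFS path, function name, and source table are placeholders:

-- Register the jar and the UDTF (paths and names are illustrative).
ADD JAR hdfs:///user/hive/udfs/expand-tree.jar;
CREATE TEMPORARY FUNCTION expand_tree AS 'ExpandTree2UDTF';

-- Expand an (id, parent_id) table into one row per (id, ancestor, level).
SELECT expand_tree(id, parent_id) AS (id, ancestor, level)
FROM my_hierarchy_table;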

Master Guru

That is the way to go

Master Mentor

If Constantin's awesome answer helped you, please accept it to close this thread; otherwise, provide your own solution or follow-up questions for more clarity.