About lalit_ayyagari

lalit_ayyagari · ‎12-15-2016

Are you talking about sequence number generation in Hive? And are you saying it can be done with large datasets? By the way, could you please send me your email or contact number? Thanks

lalit_ayyagari · ‎12-15-2016

The number of mappers cannot be restricted to 1, since the number of mappers depends on the data, i.e. on the number of input splits.

lalit_ayyagari · ‎12-15-2016

Hi Manoj, I saw your post. Are you sure that your code will work?

lalit_ayyagari · ‎12-15-2016

Dear All,

I need to generate surrogate keys (which are sequential) in Hive. I do not want to use java.util.UUID, since it generates a 36-character string and I do not want keys that long (even though they are unique). So, if I am not wrong, I can do this in 2 ways:

1st Method: Restrict the number of mappers to 1 and use the row sequence UDF in Hive. The code is in the repository mentioned in the above post: https://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFRowSequence.java

2nd Method:
1) Load the data into a temp Hive table.
2) Get the max value in the table.
3) Use row_number() over () to generate row_number+1 and load into the actual table.

**** Please Note: I think we cannot restrict the number of mappers to 1, and hence both methods fail to create unique keys. ****

Please let me know if there is a way to do this.

Thanks,
Lalit
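To make the arithmetic of the 2nd method concrete, here is a minimal sketch in plain Python (not Hive). It assumes the intent is "new key = current max key + row number", which is one reading of steps 2 and 3. The function name assign_surrogate_keys and the sample data are hypothetical, used only for illustration.

```python
from typing import Iterable, List, Tuple


def assign_surrogate_keys(existing_keys: Iterable[int],
                          new_rows: Iterable[str]) -> List[Tuple[int, str]]:
    """Number incoming rows so they continue after the current max key.

    Assumption: keys start at 1 when the target table is empty.
    """
    # Step 2: the max key already present in the target table (0 if empty).
    max_key = max(existing_keys, default=0)
    # Step 3: row number (1, 2, 3, ...) offset by the max key.
    return [(max_key + n, row) for n, row in enumerate(new_rows, start=1)]


if __name__ == "__main__":
    # Hypothetical data: the target table already holds keys 1..3.
    for key, row in assign_surrogate_keys([1, 2, 3], ["a", "b"]):
        print(key, row)
```

The keys stay unique only if the numbering is done in one place for the whole batch, which is the same concern raised in the note above.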

Last Visited	‎06-12-2017 02:59 PM

Member Since	‎11-16-2016 03:58 AM
Posts	10

Cloudera Community

Re: Sequence number generation in Hive

Re: Is there any default way of generating sequenc...