Member since 11-16-2016
10 Posts
0 Kudos Received
0 Solutions
12-15-2016
04:39 PM
Are you talking about sequence number generation in Hive? And are you saying it can be done with large datasets? By the way, could you please send me your email or contact number? Thanks
12-15-2016
04:02 PM
The number of mappers cannot be restricted to 1, because the number of mappers depends on the data, that is, on the number of input splits.
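To illustrate why the mapper count follows the data, here is a rough sketch of how a plain file-based input format typically derives splits from file size. It is modeled on the general split-size logic of Hadoop's FileInputFormat as I understand it; the function names and the default values are my own assumptions for illustration, and Hive may combine splits differently depending on the configured input format.

```python
def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size clamped between a min and a max (assumed defaults)."""
    return max(min_size, min(max_size, block_size))


def estimate_splits(file_size, split_size, slop=1.1):
    """Estimate the number of splits (and so mappers) for one file.

    A trailing remainder of up to `slop` times the split size is kept
    as a single split instead of being cut again.
    """
    splits = 0
    remaining = file_size
    while remaining / split_size > slop:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1
    return splits


MB = 1024 * 1024
split_size = compute_split_size(block_size=128 * MB)
for size_mb in (130, 300, 1024):
    print(size_mb, "MB ->", estimate_splits(size_mb * MB, split_size), "split(s)")
```

Under these assumptions a small file yields one mapper, while a larger one yields several regardless of what the query would prefer.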
12-15-2016
04:01 PM
Hi Manoj, I saw your post. Are you sure that your code will work?
12-15-2016
04:01 PM
Dear All,

I need to generate sequential surrogate keys in Hive. I do not want to use java.util.UUID: although the value is unique, it comes out as a 36-character string, and I do not want keys that long. As far as I can tell, there are two ways to do this.

1st Method: Restrict the number of mappers to 1 and use the row sequence UDF from the Hive contrib repository mentioned in the post above: https://svn.apache.org/repos/asf/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/udf/UDFRowSequence.java

2nd Method:
1) Load the data into a temp Hive table.
2) Get the max key value in the table.
3) Use row_number() over() to generate max value + row_number for each new row, and load the result into the actual table.

**** Please note: I think we cannot restrict the number of mappers to 1, and hence both methods fail to create unique keys. ****

Please let me know if there is a way to do this.

Thanks,
Lalit
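To make the second method concrete, here is a minimal sketch of its logic outside Hive, in plain Python. It only mimics the idea of "max existing key + row number"; the function name assign_surrogate_keys and the column name sk are hypothetical, and this is not Hive code.

```python
def assign_surrogate_keys(existing_keys, staged_rows):
    """Give each staged row a key that continues after the current maximum.

    Mirrors the idea of: max(existing key) + row_number() over the staged rows.
    """
    # An empty target table starts numbering at 1.
    max_key = max(existing_keys, default=0)
    # row_number() starts at 1, so the first new key is max_key + 1.
    return [
        {**row, "sk": max_key + n}
        for n, row in enumerate(staged_rows, start=1)
    ]


existing = [1, 2, 3]                       # keys already in the actual table
staged = [{"name": "a"}, {"name": "b"}]    # rows waiting in the temp table
print(assign_surrogate_keys(existing, staged))
```

Note that the uniqueness here comes from numbering all staged rows in one pass, not from the mapper count. As far as I know, a window function with an empty over() clause is evaluated over all rows together in Hive, which keeps the numbers unique but can be slow on large datasets.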