
Generating Unique ID using Zookeeper




Hi all.

We need to generate unique IDs in our Hadoop cluster during data ingestion.

We have parallel processes ingesting data from different sources into Hive tables, and we'd like a unique ID for each inserted row.

I understand ZooKeeper offers unique ID generation for distributed scenarios.

Please advise on how to do this; I can't find any samples or documentation.

Also, please let me know if there is a better distributed unique ID generator in the Cloudera environment.

Thanks


Re: Generating Unique ID using Zookeeper

Are you looking for a sequentially growing ID or just a universally unique ID?

For the former, you can use Curator over ZooKeeper with this recipe: https://curator.apache.org/curator-recipes/distributed-atomic-long.html

For the latter, a UUID generator may suffice.
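
As a rough illustration of the UUID option, here is a minimal Java sketch using the JDK's built-in `java.util.UUID`. The class and method names (`RowIds`, `newRowId`) are made up for this example, and how the value gets attached to each Hive row depends on your ingestion code, which is not shown here.

```java
import java.util.UUID;

// Hypothetical helper: the class and method names are illustrative only.
final class RowIds {
    private RowIds() {}

    /** Returns a random (version 4) UUID in its 36-character string form. */
    static String newRowId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Each ingest process can call this independently; no coordination is needed.
        System.out.println(newRowId());
    }
}
```

Since each process generates IDs locally, this needs no ZooKeeper round trip, but the IDs carry no ordering.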

For a more 'distributed' solution, check out Twitter's Snowflake: https://github.com/twitter-archive/snowflake/tree/snowflake-2010
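
To give a feel for the Snowflake idea, here is a simplified Java sketch, not Twitter's actual code. It packs a millisecond timestamp, a worker ID, and a per-millisecond sequence into one 64-bit long. The class name, the custom epoch, and the 10-bit worker / 12-bit sequence split are assumptions chosen for this example, and each ingest process would need its own distinct worker ID assigned by some means (config, ZooKeeper, etc.).

```java
// Simplified Snowflake-style generator; a sketch under the assumptions above.
final class SnowflakeStyleIdGenerator {
    // Arbitrary custom epoch for this sketch (2020-01-01T00:00:00Z).
    private static final long EPOCH_MS = 1577836800000L;
    private static final int WORKER_BITS = 10;    // assumed split
    private static final int SEQUENCE_BITS = 12;  // assumed split
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeStyleIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER_ID + "]");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Refuse to generate IDs rather than risk duplicates.
            throw new IllegalStateException("Clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        // Layout: [timestamp | workerId | sequence]
        return ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

IDs from a single generator are strictly increasing, and IDs from different workers cannot collide as long as worker IDs are unique, so the only coordination needed is handing out the worker IDs.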