Generating Unique ID using Zookeeper


Hi all.

We need to generate unique IDs in our Hadoop cluster during data ingestion.

We have parallel processes ingesting data from different sources into Hive tables, and we'd like a unique ID for each data row inserted.

I understand ZooKeeper offers unique ID generation for distributed scenarios.

Could you please explain how to do this? I can't find any samples or documentation.

Also, please let me know if there is a better distributed unique ID generator in the Cloudera environment.




Re: Generating Unique ID using Zookeeper

Are you looking for a sequentially growing ID or just a universally unique ID?

For the former, you can use Curator over ZooKeeper with this recipe:

For the latter, a UUID generator may suffice.
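
As a rough illustration of the UUID option, here is a minimal sketch using the JDK's `java.util.UUID`. The class and method names (`RowIds`, `newRowId`) are hypothetical and only for illustration; the assumption is that each ingesting process can call this locally, with no coordination between processes, and store the result as a string column.

```java
import java.util.UUID;

// Hypothetical helper: produces a random (version 4) UUID per row.
// No ZooKeeper or other coordination is involved; uniqueness rests on
// the 122 random bits in each UUID, so collisions are extremely unlikely.
final class RowIds {
    private RowIds() {}

    // Returns a 36-character string such as
    // "xxxxxxxx-xxxx-4xxx-xxxx-xxxxxxxxxxxx".
    static String newRowId() {
        return UUID.randomUUID().toString();
    }
}
```

The trade-off is that these IDs are not ordered and take more space than a 64-bit number, so this only fits if the ID does not need to grow sequentially.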

For a more 'distributed' solution, check out Twitter's Snowflake:
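
To give an idea of how a Snowflake-style generator works, here is a minimal sketch, not Twitter's actual code. It packs a millisecond timestamp, a worker ID, and a per-millisecond sequence into one 64-bit long. The class name, the custom epoch, and the 41/10/12 bit split are assumptions chosen for illustration, and it assumes each ingestion process has been given a distinct worker ID by some other means.

```java
// Snowflake-style ID sketch (illustrative, not the original implementation).
// Assumed layout: 41 bits timestamp | 10 bits worker ID | 12 bits sequence.
final class SnowflakeStyleIdGenerator {
    // Assumed custom epoch: 2020-01-01T00:00:00Z, in milliseconds.
    private static final long EPOCH_MS = 1577836800000L;
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeStyleIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER_ID + "]");
        }
        this.workerId = workerId;
    }

    // Thread-safe within one process; uniqueness across processes
    // depends on every process using a different workerId.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException("Clock moved backwards; refusing to generate IDs");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

IDs from a single generator increase over time, and IDs from different workers never collide as long as the worker IDs differ, so no network call is needed per row.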