I have downloaded the Waterline sandbox and have already gone through the Waterline tutorials on the Hortonworks website. I only have theoretical knowledge of Waterline and now want to explore the tool hands-on, but I haven't found anything that explains:
How do I use the sandbox machine?
What are its credentials?
Will Waterline give me Hive column- and table-level lineage?
Can we implement data classification in Waterline?
How do I use the Waterline web UI? Could anyone please send me a link that explains the end-to-end story of Waterline?
Thank you, Alex.
But do you have any idea how to use the sandbox machine on which Waterline Inventory is installed?
I have read the Waterline tutorials, but I still don't understand how to use the sandbox with Waterline.
If you have tried any hands-on work with Waterline, please send me the links you referred to.
I have not recently, but you raised my curiosity :). From some research, their sandbox appears to be built on top of HDP 2.2, so the functionality the HDP sandbox provides should also be available (Ambari, Hive, etc.). I'm going to download it and take a look; I'll keep you posted.
If you're looking for an entity discovery/analytics/lineage product in general, I would suggest looking into Novetta, another partner of ours. https://hortonworks.com/wp-content/uploads/2014/05/Novetta-Entity-Analytics-and-Hortonworks-Solution...
They have a sandbox available on AWS marketplace also, built on HDP. It comes with a comprehensive tutorial as well.
@Manoj Dhake have you gone through this tutorial already?
Update: I checked with the Waterline folks, and they provided the following guides on getting started with the sandbox
Thanks, @Alex Gauthier, you are right: the Waterline sandbox is built from the Hortonworks 2.2 sandbox, with as few changes as possible. The Waterline services should come up with the other Hadoop services when you start the VM image.
>Will Waterline give me Hive column- and table-level lineage?
Waterline Data 2.1 (your sandbox) will provide inferred lineage among all "resources" in the inventory. That is, it will look at all tables and HDFS files and look for data overlaps. The 2.5.1 product profiles fact-based lineage between Hive tables and their constituent HDFS files, plus inferred lineage among Hive tables and among HDFS files. Neither version handles column-level lineage. Technical details here: https://s3-us-west-1.amazonaws.com/wld-product-downloads/docs/docsv25/WaterlineDataInventory-Install...
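To make "look for data overlaps" a bit more concrete, here is a minimal Python sketch of the general idea. This is not Waterline's algorithm: the function names, the overlap measure, the 0.8 threshold, and the sample resource names are all assumptions made up for illustration. It only flags pairs of resources that share most of their values; it does not determine which one was derived from the other.

```python
from itertools import combinations


def overlap(a, b):
    """Fraction of the smaller value set that also appears in the other set."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def infer_lineage(resources, threshold=0.8):
    """Return (left, right, score) for resource pairs that overlap enough.

    `resources` maps a resource name to a sample of its values.
    The threshold is an arbitrary illustrative cut-off.
    """
    links = []
    for left, right in combinations(sorted(resources), 2):
        score = overlap(resources[left], resources[right])
        if score >= threshold:
            links.append((left, right, score))
    return links


# Hypothetical inventory: one HDFS file and two Hive tables.
resources = {
    "hdfs:/data/orders.csv": [101, 102, 103, 104],
    "hive:sales.orders": [101, 102, 103, 104],
    "hive:sales.customers": [7, 8, 9],
}

for left, right, score in infer_lineage(resources):
    print(f"{left} <-> {right} (overlap {score:.2f})")
```

In this toy inventory, only the HDFS file and the `sales.orders` table are linked, because they contain the same values; `sales.customers` shares nothing with either.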
>Can we implement data classification in waterline?
Waterline provides the ability to "tag" or label files, tables, and fields. When you use tags on fields, Waterline Data uses the field metadata and profiling information to match the field you tagged with other similar fields in the rest of your inventory ("tag discovery"). You can also use tags at the file and table level: those tags apply only to the resource you identify. As you tag data, you are also building a glossary of your semantic terms. If you design your glossary to reflect your data classification requirements, tagging data can be an effective way to build a semantic search engine.
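As a rough illustration of the "tag discovery" idea, the Python sketch below propagates a tag from one tagged field to other fields with a similar profile. It is an assumption-laden toy, not how Waterline Data actually matches fields: `FieldProfile`, `similarity`, `suggest_tags`, the Jaccard measure, the 0.5 threshold, and the sample tables are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldProfile:
    """Hypothetical profile of one field: where it lives, its type, sample values."""
    resource: str
    name: str
    dtype: str
    values: frozenset


def similarity(a, b):
    """Jaccard similarity of sampled values; 0.0 if the data types differ."""
    if a.dtype != b.dtype:
        return 0.0
    union = a.values | b.values
    if not union:
        return 0.0
    return len(a.values & b.values) / len(union)


def suggest_tags(tagged, candidates, threshold=0.5):
    """Map each tag to the candidate fields that resemble its tagged field."""
    suggestions = {}
    for tag, seed in tagged.items():
        for field in candidates:
            if field == seed:
                continue  # skip the field the user tagged by hand
            if similarity(seed, field) >= threshold:
                suggestions.setdefault(tag, []).append(f"{field.resource}.{field.name}")
    return suggestions


# The user tags one field by hand...
seed = FieldProfile("sales.orders", "state", "string", frozenset({"CA", "NY", "TX"}))
# ...and these are other fields found in the inventory.
candidates = [
    FieldProfile("crm.customers", "region_code", "string", frozenset({"CA", "NY", "WA"})),
    FieldProfile("crm.customers", "age", "int", frozenset({30, 41})),
]

print(suggest_tags({"US State": seed}, candidates))
```

Here `region_code` is suggested for the "US State" tag because half of the combined sample values match, while `age` is rejected on data type alone. A real profiler would presumably use richer metadata than this.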
>How do I use the Waterline web UI? Could anyone please send me a link that explains the end-to-end story of Waterline?
There's a User Guide available through the product UI (question mark icon on the upper right). The tutorial that Ali pointed out is a pretty good overview of the key functionality.
You are welcome to log into support.waterlinedata.com to read the knowledge base or to ask questions.
Please also stop by our booth at Hadoop Summit: we'd be happy to give you a demo or answer additional questions.