Member since: 07-25-2018
Posts: 174
Kudos Received: 29
Solutions: 5
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5414 | 03-19-2020 03:18 AM
 | 3457 | 01-31-2020 01:08 AM
 | 1337 | 01-30-2020 05:45 AM
 | 2589 | 06-01-2016 12:56 PM
 | 3073 | 05-23-2016 08:46 AM
01-03-2017
12:05 PM
Hi Ayub, here is the link to the same question asked above: https://community.hortonworks.com/questions/75818/issue-regarding-apache-atlas-rest-api-to-create-hi.html
05-23-2017
08:31 AM
Please first validate your JSON with a JSON formatter and validator.
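If you'd rather validate locally instead of pasting into an online tool, a minimal sketch using Python's standard `json` module (the sample payloads below are hypothetical):

```python
import json

def validate_json(text):
    """Return (True, parsed) if text is valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as e:
        return False, str(e)

ok, parsed = validate_json('{"typeName": "hive_table", "values": {}}')
bad, err = validate_json('{"typeName": "hive_table",}')  # trailing comma: invalid JSON
```

The error message from `json.JSONDecodeError` includes the line and column of the first problem, which is usually enough to spot the offending comma or quote.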
12-29-2016
05:54 AM
Hi Abdelkrim, I am now able to create Hive table entities and have successfully linked those entities using the Atlas REST API. Please follow the steps at the link below: https://community.hortonworks.com/questions/74875/how-to-create-hive-table-entity-in-apache-atlas-us.html#comment-74964
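For the general shape of such a call, here is a hedged sketch of building (not sending) a POST to the Atlas v1 `entities` endpoint with Python's standard library. The entity field names below are illustrative only; the real v1 payload is a richer Reference structure, so follow the linked thread for the exact format.

```python
import json
import urllib.request

# Endpoint from the Atlas v1 REST API; host is the usual sandbox address.
ATLAS_URL = "http://sandbox.hortonworks.com:21000/api/atlas/entities"

def build_create_request(entity):
    """Build (but do not send) a POST request carrying the entity JSON."""
    data = json.dumps(entity).encode("utf-8")
    req = urllib.request.Request(ATLAS_URL, data=data, method="POST")
    req.add_header("Content-Type", "application/json")
    return req

# Illustrative payload only -- not the full v1 Reference format.
entity = {"typeName": "hive_table", "values": {"name": "my_table", "db": "default"}}
req = build_create_request(entity)
# To actually create the entity you would urllib.request.urlopen(req)
# against a running Atlas server (with authentication as needed).
```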
11-25-2016
01:54 PM
8 Kudos
@Manoj Dhake Hi, Atlas and Falcon serve very different purposes, but there are some areas where they touch; maybe that is where your confusion comes from.

Atlas:
- Really like an 'atlas' to almost all of the metadata that is around in HDP: the Hive metastore, the Falcon repo, Kafka topics, HBase tables, etc. This single view on metadata makes for some powerful search capabilities on top of it, with full-text search (based on Solr).
- Since Atlas has this comprehensive view of metadata, it is also capable of providing insight into lineage, so it can tell by combining Hive DDLs which table was the source for another table.
- Another core feature is that you can assign tags to all metadata entities in Atlas. So you can say that column B in Hive table Y holds sensitive data by assigning a 'PII' tag to it. An HDFS folder or an HBase column family can also be assigned a 'PII' tag. From there you can create tag-based policies in Ranger to manage access to anything tagged 'PII' in Atlas.

Falcon:
- More like a scheduling and execution engine for HDP components like Hive, Spark, HDFS distcp, and Sqoop, to move data around and/or process data along the way. In a way, Falcon is a much-improved Oozie.
- Metadata of Falcon dataflows is actually synced to Atlas through Kafka topics, so Atlas knows about Falcon metadata too, and Atlas can include Falcon processes and their resulting meta objects (tables, HDFS folders, flows) in its lineage graphs.

I know that in the docs both tools claim the term 'data governance', but I feel Atlas is more about that than Falcon is. It is not that clear what data governance actually is. With Atlas you can really apply governance by collecting all metadata, querying it, and tagging it, and Falcon can execute the processes that evolve around that by moving data from one place to another (and yes, Falcon moving a dataset from an analysis cluster to an archiving cluster is also about data governance/management). Hope that helps.
09-21-2016
03:25 AM
You don't need any configuration params; you can submit a Spark job using curl commands.
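Submitting Spark jobs over HTTP like this typically goes through a Livy server's `/batches` endpoint. A hedged sketch of the payload, assuming Livy is running; the hostname, jar path, and class name below are all hypothetical:

```python
import json

# Hypothetical Livy endpoint (Livy's default port is 8998).
LIVY_BATCHES = "http://livy-host:8998/batches"

payload = {
    "file": "/jars/my-spark-app.jar",       # application jar (hypothetical path)
    "className": "com.example.MySparkApp",  # main class (hypothetical)
    "args": ["arg1"],
}
body = json.dumps(payload)
# To submit, POST `body` with Content-Type: application/json, e.g.:
#   curl -X POST -H 'Content-Type: application/json' \
#        -d '{"file": "/jars/my-spark-app.jar", "className": "com.example.MySparkApp"}' \
#        http://livy-host:8998/batches
```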
09-09-2016
01:14 PM
1 Kudo
For profiling data off Hadoop, see https://community.hortonworks.com/questions/35396/data-quality-analysis.html

For profiling data on Hadoop, the best solution for you should be:
- Zeppelin as your client/UI
- Spark in Zeppelin as your toolset to profile

Both Zeppelin and Spark are extremely powerful tools for interacting with data and are packaged in HDP. Zeppelin is a browser-based notebook UI (like IPython/Jupyter) that excels at interacting with and exploring data. Spark, of course, is in-memory data analysis and is lightning fast. Both are key pieces in the future of Big Data analysis. BTW, you can use Python or Scala in Spark, including integration of external libraries. See the following links to get started: http://hortonworks.com/apache/zeppelin/ http://www.social-3.com/solutions/personal_data_profiling.php
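To make "profiling" concrete, a minimal sketch of the kind of per-column statistics you'd compute in a Zeppelin paragraph. In Spark you'd typically reach for `DataFrame.describe()`; this is shown in plain Python purely for illustration, and the sample data is hypothetical:

```python
def profile_column(values):
    """Basic profile of one column: row count, nulls, distinct values, min/max of non-nulls."""
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }

stats = profile_column([3, 1, None, 3, 7])
```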
04-12-2017
04:11 PM
1 Kudo
@Vadim Vaks I'm trying to do the same thing with Atlas 0.8, but I can't delete entries within the inputs or outputs array with this method. With the V2 API, the elements didn't change. With the V1 API, new elements are added even if I removed some from the inputs array. The inputs had two entries before the POST request; I posted a single input entry and it got added:

```json
"inputs": [
    {
        "guid": "688ed1ee-222c-4416-8bf4-ba107b7fbc2c",
        "typeName": "kafka_topic"
    },
    {
        "guid": "bf3784db-fa59-4803-ad41-c5653f242f6f",
        "typeName": "kafka_topic"
    },
    {
        "guid": "688ed1ee-222c-4416-8bf4-ba107b7fbc2c",
        "typeName": "kafka_topic"
    }
],
```

Please let me know how to remove elements from inputs/outputs with Atlas 0.8. Thanks!
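One commonly suggested workaround (an assumption, not confirmed in this thread) is that partial updates append to array attributes, so you fetch the full entity, edit the array locally, and re-submit the complete definition. A sketch of the local edit step, using the GUIDs from the example above:

```python
def remove_inputs(entity, guids_to_remove):
    """Drop entries from the entity's inputs array and de-duplicate by guid."""
    seen = set()
    kept = []
    for ref in entity.get("inputs", []):
        if ref["guid"] in guids_to_remove or ref["guid"] in seen:
            continue  # skip removed and duplicate references
        seen.add(ref["guid"])
        kept.append(ref)
    entity["inputs"] = kept
    return entity

entity = {"inputs": [
    {"guid": "688ed1ee-222c-4416-8bf4-ba107b7fbc2c", "typeName": "kafka_topic"},
    {"guid": "bf3784db-fa59-4803-ad41-c5653f242f6f", "typeName": "kafka_topic"},
    {"guid": "688ed1ee-222c-4416-8bf4-ba107b7fbc2c", "typeName": "kafka_topic"},
]}
entity = remove_inputs(entity, {"bf3784db-fa59-4803-ad41-c5653f242f6f"})
# entity["inputs"] now holds a single, de-duplicated reference
```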
07-29-2016
05:46 PM
@Manoj Dhake The best official explanation is not very detailed: http://atlas.incubator.apache.org/TypeSystem.html It's basically just:

### Traits
- similar to Scala traits; more like decorators
- traits get applied to instances, not classes
- this satisfies the classification mechanism (roughly)
- a class instance can have any number of traits
- e.g. security clearance: any Person class could have it, so we add it as a mixin to the Person class
- the security clearance trait has a level attribute
- traits are labels, and each label can have its own attributes
- the reason for doing this: having modeled a security clearance trait, you can prescribe it to other things too, and can now search for anything that has security clearance level = 1, for instance

### On Instances
- class, trait, and struct all have bags of attributes
- you can get the name of the type associated with an attribute
- you can get or set the attribute in that bag for each instance

If this answers your question, would you mind accepting the answer? Also, I provided answers to many of the other questions you asked over the last couple of weeks. Could you check the answers and accept if they were helpful, or let me know what else I can clarify? I want to make sure I answered your questions. Thanks in advance.
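The trait model above can be sketched as a toy data structure: every instance carries a bag of traits, each trait carries its own attribute bag, and search matches on trait attributes. This is purely illustrative (class and attribute names are made up), not Atlas's actual implementation:

```python
class Entity:
    """Toy model: every instance carries a bag of traits, each with its own attributes."""
    def __init__(self, name):
        self.name = name
        self.traits = {}  # trait name -> attribute bag

    def add_trait(self, trait_name, **attrs):
        self.traits[trait_name] = attrs

def search_by_trait(entities, trait_name, **attrs):
    """Find entities carrying the trait, with matching attribute values."""
    return [e for e in entities
            if trait_name in e.traits
            and all(e.traits[trait_name].get(k) == v for k, v in attrs.items())]

person = Entity("alice")
person.add_trait("SecurityClearance", level=1)   # trait with its own attribute
table = Entity("hive_table_y")
table.add_trait("PII")                           # trait as a plain label
hits = search_by_trait([person, table], "SecurityClearance", level=1)
```

This mirrors the two points in the notes: traits attach to instances (not classes), and because each trait has its own attribute bag you can search across unrelated entities for, say, security clearance level = 1.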
07-22-2016
06:43 PM
@Manoj Dhake When you create a new Trait/Tag in the UI you are actually creating a new type. When you associate it with an entity you are creating an instance of the trait as a STRUCT type that only exists in the context of that entity. As you know, you can delete instances of a tag that is assigned to an entity. However, to delete the tag so that it does not show up in the Atlas UI you would need to delete the type. At the moment, types cannot be deleted since there may be entities or even other types that depend on them. You probably could delete the trait directly from HBase; however, you may also compromise the integrity of the entire Atlas system. You would also have to clear any dependents that are indexed in Solr or graphed in Titan. I am sure these are all issues that the community will have to solve in implementing the JIRA you referenced. To answer the follow-up question you asked in the comments section, you can get the instance of a trait only from the entity where the trait is associated, as follows:

curl -u admin:admin -X GET http://sandbox.hortonworks.com:21000/api/atlas/entities/fadaca14-7e58-4a2e-b04d-95a3010ce45b/traits
{"requestId":"qtp921483514-312 - bb0f1276-c8fc-499b-ab5d-0e4fde11c796","results":["PII"],"count":1}

As you can see, this entity has a tag called PII, and it does not have a GUID because it is a STRUCT based on the PII type that represents the PII trait. You can easily delete the STRUCT that is associated with an entity through the UI or REST API, but there is no way to remove the type from which the trait is spawned, for the reasons listed above.
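For the deletion of the trait *instance* mentioned above, the v1 API exposes a per-entity traits endpoint addressed by trait name. A sketch of building (not sending) such a request, reusing the GUID and trait name from the curl example; the exact endpoint shape is assumed from the Atlas v1 REST API:

```python
import urllib.request

# GUID and trait name taken from the curl example above.
url = ("http://sandbox.hortonworks.com:21000/api/atlas/entities/"
       "fadaca14-7e58-4a2e-b04d-95a3010ce45b/traits/PII")
req = urllib.request.Request(url, method="DELETE")  # built only, not sent here
# Sending it (with admin credentials, as in the curl example) would detach the
# PII trait instance from this one entity; the PII *type* itself remains.
```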
07-21-2016
01:41 PM
Thank you very much Svekat, the issue has been resolved. Actually, on the sandbox machine the ranger-tagsync service was disabled, so I started the service using the command: service ranger-tagsync start