Member since
01-31-2016
96
Posts
92
Kudos Received
20
Solutions
07-11-2017
10:39 AM
1 Kudo
@Saba Baig,
>> But when I tried to the same search (Basic using query) I typed sales_fa?t and I got invalid Expression
What is the query you fired? You mentioned that http://localhost:21000/api/atlas/v2/search/basic?limit=25&includeDeletedEntities=true&query=sales_fa?t&typeName=Table works for you, so I am not able to understand your question.
07-11-2017
08:30 AM
1 Kudo
@Saba Baig
>> Also, what features e.g wildcards (? . * etc) are available in full-text provided by Atlas? and how can we use them (examples)?
Atlas uses Lucene query syntax for its full-text search (basic search). Please refer to the Lucene query syntax documentation here: http://lucene.apache.org/core/4_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Wildcard_Searches
Example: suppose there are 3 hive_table entities in Atlas: table_a, table_b and table_ab.
Queries:
1)
type = hive_table , query = table_?
http://localhost:21000/api/atlas/v2/search/basic?limit=25&excludeDeletedEntities=true&query=table_?&typeName=hive_table
Returns 2 entities (table_a , table_b)
2)
type = hive_table , query = table_*
http://localhost:21000/api/atlas/v2/search/basic?limit=25&excludeDeletedEntities=true&query=table_*&typeName=hive_table
Returns 3 entities (table_a , table_b , table_ab)
I hope this example makes it clear.
Note: Wildcard search can also be done from the UI itself.
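The difference between ? and * in the two queries above can be checked locally. This is only a rough sketch that uses Python's fnmatch as a stand-in: it is not Lucene and merely mimics the single-character and multi-character wildcards for these simple names. The helper name match_names is made up for illustration.

```python
from fnmatch import fnmatchcase

# The three hive_table names from the example above.
names = ["table_a", "table_b", "table_ab"]

def match_names(pattern, candidates):
    """Return the candidates matching a ?/* wildcard pattern (fnmatch, not Lucene)."""
    return [name for name in candidates if fnmatchcase(name, pattern)]

print(match_names("table_?", names))  # ? stands for exactly one character
print(match_names("table_*", names))  # * stands for zero or more characters
```

With ?, table_ab is excluded because two characters follow the underscore; with *, all three names match.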
07-11-2017
08:20 AM
1 Kudo
@Saba Baig A few aggregate functions are not working in the V2 APIs. A few Apache JIRAs have been filed for this: https://issues.apache.org/jira/browse/ATLAS-1874 https://issues.apache.org/jira/browse/ATLAS-1639 However, the V1 APIs work perfectly. For example:
Query : hive_table groupby(owner) select owner,count()
URL encoded :
http://localhost:21000/api/atlas/discovery/search/dsl?limit=25&query=hive_table+groupby(owner)+select+owner%2Ccount()
Response :
{requestId: "pool-2-thread-9 - 8a45b7a6-0cd9-4e9d-8b71-66e313f5abf1",
query: "hive_table groupby(owner) select owner,count()",
queryType: "dsl",
count: 2,
results:
[
{$typeName$: "__tempQueryResultStruct74",
count(): 9,
owner: "hrt_qa"
},
{$typeName$: "__tempQueryResultStruct74",
count(): 2,
owner: "anonymous"
}
],
dataType:
{typeName: "__tempQueryResultStruct74",
typeDescription: null,
typeVersion: "1.0",
attributeDefinitions:
[
{name: "owner",
dataTypeName: "string",
multiplicity:
{lower: 0,
upper: 1,
isUnique: false
},
isComposite: false,
isUnique: false,
isIndexable: false,
reverseAttributeName: null
},
{name: "count()",
dataTypeName: "long",
multiplicity:
{lower: 0,
upper: 1,
isUnique: false
},
isComposite: false,
isUnique: false,
isIndexable: false,
reverseAttributeName: null
}
]
}
}
This query groups hive_table entities by owner and lists the owner and the count in each group. The concept is similar to GROUP BY in SQL. Since the UI currently uses the V2 query, you may not see the results on the UI, but you can always rely on the REST APIs. For this particular kind of query, please use the V1 APIs for now.
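The URL-encoded form shown earlier can be produced from the plain DSL query rather than typed by hand. A minimal sketch, assuming the same localhost endpoint as in the post; the helper name build_dsl_url is made up for illustration, and the parentheses are deliberately left unencoded so the output matches the URL above.

```python
from urllib.parse import urlencode

# Endpoint and query copied from the post; adjust host and port for your cluster.
BASE_URL = "http://localhost:21000/api/atlas/discovery/search/dsl"
dsl_query = "hive_table groupby(owner) select owner,count()"

def build_dsl_url(query, limit=25):
    """URL-encode a DSL query; parentheses are kept as-is to match the post."""
    return BASE_URL + "?" + urlencode({"limit": limit, "query": query}, safe="()")

url = build_dsl_url(dsl_query)
print(url)
```

Spaces become + and the comma becomes %2C, which is the only encoding the query in the post needed.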
07-10-2017
01:05 PM
1 Kudo
@Saba Baig The select clause lets you "select" particular attributes of an entity.
DB where name="Reporting" select name, owner
displays the name and owner of the DB whose name is "Reporting".
Currently the UI doesn't display the results, but the REST GET API call actually returns them.
Example:
DSL query : hive_table name = "table1" select qualifiedName,owner
response :
{queryType: "DSL",
queryText: "`hive_table` name="table1" select qualifiedName,owner",
attributes:
{name:
["qualifiedName",
"owner"
],
values:
[
["default.table1@cl1",
"atlas"
]
]
}
}
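Since the UI does not show these results, it can help to turn the attributes block of such a response into one record per entity. A small sketch, assuming the response has already been parsed into a Python dict with the shape shown above; rows_as_dicts is a hypothetical helper name.

```python
# Response fragment shaped like the one above (values copied from the post).
response = {
    "queryType": "DSL",
    "attributes": {
        "name": ["qualifiedName", "owner"],
        "values": [["default.table1@cl1", "atlas"]],
    },
}

def rows_as_dicts(resp):
    """Pair each row of values with the selected attribute names."""
    names = resp["attributes"]["name"]
    return [dict(zip(names, row)) for row in resp["attributes"]["values"]]

rows = rows_as_dicts(response)
print(rows)
```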
06-29-2017
10:40 AM
3 Kudos
@subash sharma Try this:
1. Changed "" to None
2. Changed the strings "false" and "true" to the booleans False and True
3. Added typeDescription and typeVersion
create_type = {
"enumTypes": [],
"structTypes": [],
"traitTypes": [
{
"superTypes": [],
"hierarchicalMetaTypeName":
"org.apache.atlas.typesystem.types.TraitType",
"typeName": "EXPIRES_ON",
"typeDescription": None,
"typeVersion": "1.0",
"attributeDefinitions": [
{
"name": "expiry_date",
"dataTypeName": "date",
"multiplicity": "required",
"isComposite": False,
"isUnique": False,
"isIndexable": True,
"reverseAttributeName": None}
]
}
],
"classTypes": []
}
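The reason None, True and False matter shows up once the dict is serialized. A minimal sketch, assuming the payload is turned into JSON with json.dumps before it is sent (the post does not show that step); the attribute dict below is an abbreviated copy of one attribute definition.

```python
import json

# Abbreviated version of one attribute definition from create_type above.
attribute = {
    "name": "expiry_date",
    "isComposite": False,
    "isIndexable": True,
    "reverseAttributeName": None,
}

# Python False/True/None are written out as JSON false/true/null.
payload = json.dumps(attribute)
print(payload)
```

A string such as "false" would stay a quoted string in the JSON, which is not the same as the boolean false.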
06-23-2017
04:00 PM
1 Kudo
@Smart Data Can you please try with "--broker-list localhost:6667"? The broker seems to be running on port 6667.
To verify the port number on which the Kafka broker is running, open the ZooKeeper client shell using $ZOOKEEPER_HOME/bin/zkCli.sh and get the broker port. The following image was taken after running the ZooKeeper client shell; note that get /brokers/ids/0 lists the port.
The best way to check which process is running on a port is lsof -i:6667. In your case, "kafka" in the 3rd column of the output of lsof -i -P -n | grep kafka is the kafka user and not the process itself.
Also, the best practice is to use the hostname itself instead of "localhost".
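The value returned by get /brokers/ids/0 is JSON, so the host and port for --broker-list can be read from it directly. A rough sketch: the sample string and host name below are made up, and the "host" and "port" field names are an assumption about the broker registration data, so check them against the output on your own cluster.

```python
import json

# Hypothetical output of `get /brokers/ids/0`; the host name is made up.
znode_data = (
    '{"endpoints": ["PLAINTEXT://broker1.example.com:6667"], '
    '"host": "broker1.example.com", "port": 6667}'
)

def broker_list_entry(raw):
    """Build a host:port string for --broker-list from the znode JSON."""
    info = json.loads(raw)
    return f"{info['host']}:{info['port']}"

entry = broker_list_entry(znode_data)
print(entry)
```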
06-23-2017
02:27 PM
3 Kudos
@Smart Data, can you please check whether port 9092 is correct in the broker list and whether the broker is up and running? Also, is your cluster kerberized?
Please note: the issue has nothing to do with Atlas or Ranger. When Atlas starts up, it creates 2 Kafka topics, ATLAS_HOOK and ATLAS_ENTITIES. The user doesn't have to create any topic.
06-15-2017
08:41 AM
3 Kudos
@Saba Baig
>> How Hive notifies Atlas about any DML/DDL operation in Atlas against which Atlas generates lineage?
Whenever there is a metadata change event in Hive, HiveHook captures it and puts the details of the created/updated Hive entity on a Kafka topic called ATLAS_HOOK. Atlas is the consumer of ATLAS_HOOK, so Atlas gets the message from ATLAS_HOOK.
>> what is the information that Hive sends to Atlas?
Example:
hive > create table emp(id int,name string);
1. HiveHook composes a JSON message that contains information about the table name, database, columns and other table properties, and sends it to ATLAS_HOOK.
2. ATLAS_HOOK queues up the messages from HiveHook and Atlas consumes from it. Atlas consumes the JSON message about table emp and ingests it.
hive > create table t_emp as select * from emp;
1. HiveHook composes a JSON message that contains the t_emp details and also the source table name (emp), and sends it to ATLAS_HOOK.
2. Atlas understands from the JSON message consumed from ATLAS_HOOK that it is a CTAS table with a source table, ingests the table t_emp and constructs lineage for the tables emp and t_emp.
>> is Hive DB going to notify Atlas Server on its own or HiveHook is going to check constantly in the Hive DB and pull the changes
HiveHook doesn't poll Hive constantly. Whenever there is a metadata change event (for example, when a user fires a Hive query that involves a create/update/drop), HiveHook notifies ATLAS_HOOK.
NOTE: If you want to know more about the exact JSON content sent by HiveHook, you can create a table in Hive and check the message that lands in ATLAS_HOOK for that table.
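The push-based flow described above can be pictured with a toy model. This is only an illustration: an in-memory deque stands in for the ATLAS_HOOK Kafka topic, the function names are made up, and the message fields are hypothetical rather than the JSON that HiveHook actually sends.

```python
from collections import deque

# Toy stand-in for the ATLAS_HOOK Kafka topic (a real deployment uses Kafka).
atlas_hook = deque()

def hive_hook_notify(table, source_table=None):
    """Called once per metadata event; nothing polls Hive in between."""
    # Hypothetical message shape, not the JSON HiveHook really emits.
    atlas_hook.append({"table": table, "source": source_table})

def atlas_consume(topic):
    """Drain the queue, ingest each table, and record lineage for CTAS tables."""
    entities, lineage = [], []
    while topic:
        message = topic.popleft()
        entities.append(message["table"])
        if message["source"] is not None:
            lineage.append((message["source"], message["table"]))
    return entities, lineage

hive_hook_notify("emp")                         # create table emp(id int,name string);
hive_hook_notify("t_emp", source_table="emp")   # create table t_emp as select * from emp;
entities, lineage = atlas_consume(atlas_hook)
print(entities)
print(lineage)
```

The CTAS message carries its source table, which is what lets the consumer side link emp to t_emp.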
06-14-2017
09:17 AM
1 Kudo
@subash sharma, a DSL query makes a Gremlin call internally to fetch results.
06-12-2017
10:01 AM
1 Kudo
@subash sharma, the limit parameter should work. How many tables are there in the database you are querying?