How to update Atlas trait attribute values

Contributor

I want to use Atlas traits and attributes to hold data quality metadata (counts and dates).

I have multiple Hive tables, and for each of them I run basic DQ scripts every day to count the number of anomalies for different DQ checks (at either table or column level). I only expect Atlas to hold the most recent date and count for each check.

Example of the sort of DQ metadata I generate:

hive_table | hive_column | Load date  | DQ check                     | DQ count
-----------|-------------|------------|------------------------------|---------
table_1    | -           | 2017-03-06 | Count number of records      | 999
table_1    | column_1    | 2017-03-06 | Number of not nulls          | 2
table_1    | column_2    | 2017-03-06 | Number of inconsistent dates | 0
table_2    | -           | 2017-03-06 | Count number of records      | 9999
table_2    | column_1    | 2017-03-06 | Number of not nulls          | 232
table_2    | column_2    | 2017-03-06 | Number of inconsistent dates | 2
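Since only the most recent date and count per check should reach Atlas, the daily results could be reduced before anything is pushed. The following is a minimal sketch of that reduction in Python, not part of the original scripts. The tuple layout mirrors the table above, and the 2017-03-05 row is made-up data added only to show an older value being discarded.

```python
from datetime import date

# Rows shaped like the example table:
# (hive_table, hive_column, load_date, dq_check, dq_count).
# A hive_column of None stands for a table-level check.
rows = [
    # Hypothetical earlier run, included only to show it being superseded.
    ("table_1", None, date(2017, 3, 5), "Count number of records", 990),
    ("table_1", None, date(2017, 3, 6), "Count number of records", 999),
    ("table_1", "column_1", date(2017, 3, 6), "Number of not nulls", 2),
]


def latest_per_check(rows):
    """Keep only the most recent (load_date, count) per (table, column, check)."""
    latest = {}
    for table, column, load_date, check, count in rows:
        key = (table, column, check)
        if key not in latest or load_date > latest[key][0]:
            latest[key] = (load_date, count)
    return latest


latest = latest_per_check(rows)
for key, (load_date, count) in latest.items():
    print(key, load_date.isoformat(), count)
```

Each remaining entry corresponds to one trait instance to set on one table or column entity.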

I have 2 questions.

1. What is the best way to structure the traits and attributes?

Traits:

  • dq_not_null; or
  • dq_not_null_table_column_nn

Attributes:

  • dq_count; or
  • table_column_dq_count

If I update the attribute values for a trait that is linked to two entities (hive_table entities), can each entity's value be updated separately, or will the attribute value be shared by every entity that carries the trait? If it is shared, then I think I will need unique trait names.

2. How should I update the attribute values (the values are generated from HQL scripts)?

Here's an example of my traits and attributes (but not attribute values) for a DQ check for not nulls.

{
  "enumTypes": [],
  "structTypes": [],
  "traitTypes": [
    {
      "superTypes": [],
      "hierarchicalMetaTypeName": "org.apache.atlas.typesystem.types.TraitType",
      "typeName": "dq_monitor_not_null",
      "typeDescription": null,
      "attributeDefinitions": [
        {
          "name": "dq_monitor_load_date",
          "dataTypeName": "date",
          "multiplicity": "optional",
          "isComposite": false,
          "isUnique": false,
          "isIndexable": true,
          "reverseAttributeName": null
        },
        {
          "name": "dq_monitor_count",
          "dataTypeName": "int",
          "multiplicity": "optional",
          "isComposite": false,
          "isUnique": false,
          "isIndexable": true,
          "reverseAttributeName": null
        }
      ]
    }
  ],
  "classTypes": []
}
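For question 2, one possible approach is to turn each DQ result into a trait instance (the trait name plus its attribute values) and send that to Atlas for the relevant entity. The sketch below only builds such a payload in Python; it makes no call to Atlas. The `typeName`/`attributes` envelope and the ISO-8601 date string are assumptions, and the exact JSON that Atlas expects depends on the Atlas version and REST API in use, so check it against your installation. `build_trait_instance` is a hypothetical helper name.

```python
import json
from datetime import date

# Trait and attribute names taken from the type definition above.
TRAIT_NAME = "dq_monitor_not_null"


def build_trait_instance(load_date, count):
    """Return the trait name plus attribute values for a single entity.

    Assumption: the envelope ("typeName" + "attributes") and the date
    format are illustrative and may differ for your Atlas version.
    """
    return {
        "typeName": TRAIT_NAME,
        "attributes": {
            "dq_monitor_load_date": load_date.isoformat(),
            "dq_monitor_count": int(count),
        },
    }


# Values as they might come out of the HQL script for table_1.column_1.
payload = build_trait_instance(date(2017, 3, 6), 2)
print(json.dumps(payload, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to swap in whichever endpoint and envelope the installed Atlas version requires.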

Re: How to update Atlas trait attribute values

Guru

Each Atlas tag can have multiple attributes, stored as name/value pairs. The values belong to each tagged entity, not to the tag definition: if you had a tag with an attribute called owner, you could apply that tag to two Hive tables and then give each table a different owner value.

For example:

Tag1 --> Hive Table 1

Owner = user1

Tag1 --> Hive Table 2

Owner = user2
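The same idea can be modeled in a few lines of plain Python. This is only an illustration of how tag values are scoped per entity; it does not use any Atlas API, and the GUID strings and the `set_tag_attribute` helper are hypothetical.

```python
# Hypothetical GUIDs standing in for the two Hive table entities.
# Each entity holds its own copy of the attribute values for Tag1.
tags_by_entity = {
    "guid-hive-table-1": {"Tag1": {"Owner": "user1"}},
    "guid-hive-table-2": {"Tag1": {"Owner": "user2"}},
}


def set_tag_attribute(store, entity_guid, tag, attribute, value):
    """Update one attribute on one entity's instance of the tag."""
    store[entity_guid][tag][attribute] = value


# Changing the owner on table 1 leaves table 2 untouched.
set_tag_attribute(tags_by_entity, "guid-hive-table-1", "Tag1", "Owner", "user3")
print(tags_by_entity)
```

Because the values are per entity, a single trait such as dq_monitor_not_null could be reused across tables and columns without needing unique trait names.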

Is this what you are asking?

Hope this is helpful.
