Member since: 11-22-2016
Posts: 83
Kudos Received: 23
Solutions: 13
My Accepted Solutions
Title | Views | Posted
---|---|---
| 1301 | 08-03-2018 08:13 PM
| 1065 | 06-02-2018 05:24 PM
| 496 | 05-31-2018 07:54 PM
| 1010 | 02-08-2018 12:38 AM
| 588 | 02-07-2018 11:38 PM
01-07-2018
07:11 PM
Thank you for the insight on the permissions. The atlas_titan table is the repository for all of Atlas' data. All operations performed via the Atlas web server end up interacting with this table and the ATLAS_ENTITY_AUDIT_EVENTS table. Is there any specific concern around granting permissions? Hope this helps.
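For reference, granting the Atlas service user access to these tables from the HBase shell looks roughly like this (a sketch; the user name 'atlas' is an example, adjust for your cluster):
grant 'atlas', 'RWXCA', 'atlas_titan'
grant 'atlas', 'RWXCA', 'ATLAS_ENTITY_AUDIT_EVENTS'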
01-07-2018
07:03 PM
1 Kudo
@pbarna I worked on this as part of the Atlas project. I realized that the solution is not as simple as it looks, given the numerous dependencies involved. Can you please tell me the versions of Titan and Hadoop you are using? I attempted a similar exercise using Titan 0.5.4 and Hadoop 2.6.3. My problem was to initiate the Titan index repair job. This facility is built into the Titan API, and it uses MapReduce to run the repair. With some help, I realized that adding properties to yarn-site.xml and hbase-site.xml actually helps. When you update the properties in these files, be sure to use <final>true</final> so that your settings override the defaults and take effect. Example:
<property>
  <name>mapreduce.local.map.tasks.maximum</name>
  <value>10</value>
  <final>true</final>
</property>
For various reasons I ended up writing a groovy script to achieve this. I can get into details if you are interested. My script is here. Please feel free to reach out if you think this was useful. Thanks @Nixon Rodrigues for letting me know about this question.
12-19-2017
05:09 PM
Yes, it is possible to run Atlas without Ranger. I use this setup quite often in my local development environment.
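Roughly, the only change needed is to switch the authorizer away from the Ranger plugin (a sketch; assumes ATLAS_HOME points at your install and that your Atlas version supports the built-in "simple" file-based authorizer):
echo atlas.authorizer.impl=simple >> ${ATLAS_HOME}/conf/atlas-application.properties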
12-04-2017
06:05 PM
@Arsalan Siddiqi Thanks for the excellent question. Your observations are valid. While Atlas does help with meeting compliance requirements, it is only part of the solution. To use a traffic analogy, Atlas is the map (hence the name) and does not deal with the cars on the road (the traffic). To complete the picture, there needs to be some monitoring of what data gets ingested into the system and whether all of it conforms to the norms that have been set up. Please take a look at this presentation from Data Summit 2017. It explains how a system can be set up that helps with governance (the realm of Atlas) and also helps with spotting errors within the data itself. To summarize: to spot errors in the flow of data itself, you would need some other mechanism; Atlas will not help you in that respect.

About your 2nd question: Atlas consumes notifications from Kafka by spawning a single thread and processing one notification at a time (see NotificationHookConsumer.java & AtlasKafkaConsumer.java). In systems with high throughput, the notifications will be queued in Kafka and you will see a lag in their consumption. Kafka guarantees durability of messages, and Atlas ensures that it consumes every message. If messages are dropped for some reason, you would see that in Atlas' logs. We also test Atlas in high-availability scenarios.

Also, to address the notification message question, I would urge you to use the Atlas V2 client APIs (both on master and branch-0.8). Kafka does not mandate any message format, since all it understands is bytes, so that should not be a determining criterion for choosing the client API version. I know this is a lot of text; I hope it helps. Please feel free to reach out if you need clarifications.
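If you want to see whether Atlas is falling behind, one way is to look at the consumer lag on the hook topic (a rough sketch; assumes Kafka's CLI tools are on the PATH and the default consumer group name "atlas"; adjust the broker address for your cluster):
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group atlas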
12-01-2017
06:47 PM
@Sankaranarayanan S We committed a fix yesterday that makes the steps above unnecessary. I would urge you to attempt a build again. With this fix, you can use the default graph (JanusGraph):
mvn clean install -DskipTests -Pdist,embedded-hbase-solr
Once the build is done, I uncompress the tar produced in the distro directory and then run bin/atlas_start.py. The initial startup takes about 5 minutes because Solr has to initialize its indexes. Subsequent starts are quicker. Hope this helps.
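For reference, roughly what I run after the build finishes (a sketch; the archive name depends on your version and the target directory is just an example):
mkdir -p /tmp/atlas-bin
tar xfz distro/target/apache-atlas-*bin.tar.gz --strip-components 1 -C /tmp/atlas-bin
/tmp/atlas-bin/bin/atlas_start.py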
11-28-2017
05:19 PM
@Sankaranarayanan S I spent the day yesterday trying to address this. Eventually we got to a point where this configuration works; it just needs a few manual steps. Below are the contents of the script that we discussed on the mailing list. It would be great if you could try it and let us know if this works for you. One build parameter we are currently using is titan0 instead of JanusGraph.
ATLAS_SOURCE_DIR=/tmp/atlas-source
ATLAS_HOME=/tmp/atlas-bin
# Clone Apache Atlas sources
mkdir -p ${ATLAS_SOURCE_DIR}
cd ${ATLAS_SOURCE_DIR}
git clone https://github.com/apache/atlas.git -b master
# build Apache Atlas
cd atlas
mvn clean -DskipTests -DGRAPH_PROVIDER=titan0 install -Pdist,embedded-hbase-solr,graph-provider-titan0
# Install Apache Atlas
mkdir -p ${ATLAS_HOME}
tar xfz distro/target/apache-atlas-*bin.tar.gz --strip-components 1 -C ${ATLAS_HOME}
# Setup environment and configuration
export MANAGE_LOCAL_HBASE=true
export MANAGE_LOCAL_SOLR=true
export PATH=${PATH}:/tmp/atlas-bin/bin
echo atlas.graphdb.backend=org.apache.atlas.repository.graphdb.titan0.Titan0GraphDatabase >> ${ATLAS_HOME}/conf/atlas-application.properties
# Start Apache Atlas
atlas_start.py
# Access Apache Atlas at http://localhost:21000
11-27-2017
05:30 PM
Can you please use branch-0.8 instead of master? What version of Maven are you using? What configuration are you attempting? (Since you are copying je*.jar, I assume it is the BerkeleyDB and Elasticsearch combination; can you please confirm?) Please build using this command line: mvn clean package -Pdist,berkeley-elasticsearch. This should create a properties file with the correct configuration. I have attached the properties file: atlas-application-berkeleyproperties.zip. I have a deployment directory structure that holds the properties file and libext, and I use these command line arguments to pass that information to Atlas: -Datlas.home=./deploy/ -Datlas.conf=./deploy/conf -Datlas.data=./deploy/data -Datlas.log.dir=./deploy/logs
11-25-2017
09:47 PM
That's strange. Would it be possible to attach the log? I am using data from the sandbox VM and I am able to run the DSL queries just fine. I will get back to you on your question about the basic query; I need to confirm a few things with someone from my team.
11-25-2017
09:43 PM
Thanks for reaching out. If you are building master, can I suggest that you try branch-0.8? Master has been updated to work with the next version of Hadoop and should become more stable in the coming weeks. Also, it would be great if you could attach the entire log. We use these settings for Solr: atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=localhost.localdomain:2181/infra-solr
If the entire startup sequence runs correctly, you should see this line in your log: 2017-10-30 17:14:04,173 INFO - [main:] ~ Started SelectChannelConnector@0.0.0.0:21000 (AbstractConnector:338) Hope this helps.
11-17-2017
07:57 PM
1 Kudo
@David Miller A DSL query should help you with this. Each hive_table has its hive_db name among its properties, so this DSL query should do it: hive_table where (db.name like '*_final' or db.name like '*_temp') About filtering out deleted entities in DSL, there isn't a way to do it yet; we are in the process of improving DSL. As for the documentation, I agree that it needs improvement, but there is no firm ETA on that yet. Given the current state, my suggestion would be to use the basic query as much as possible. Hope this helps.
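If you prefer to run the query over REST instead of the UI, something along these lines should work (a sketch; assumes Atlas 0.8+ with the V2 search API and default credentials):
curl -G -u admin:admin --data-urlencode "query=hive_table where (db.name like '*_final' or db.name like '*_temp')" "http://localhost:21000/api/atlas/v2/search/dsl"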
11-15-2017
05:11 PM
@Saurabh Failed messages accumulate when Atlas is not able to consume messages from Kafka. These messages are generated by hooks such as the Hive hook, which produce notifications about changes taking place within those systems; the notifications are placed on a Kafka topic and eventually consumed by Atlas. To get to the root of the problem, it would help if you could tell us more about: the scenario you are trying out, what functionality is adversely impacted because of this, and your environment details. Would it be possible to attach a portion of the logs?
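To see whether notifications are actually landing on the topic, something like this can help (a sketch; assumes Kafka's CLI tools are available and the default ATLAS_HOOK topic name):
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic ATLAS_HOOK --from-beginning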
11-15-2017
04:54 PM
@Amanda Hua Thanks for opening the ticket. Based on my research, the errors in the logs seem to be a known issue with embedded mode. Please take a look at this. As a resolution, I would suggest the following options: use an external HBase and Solr 5, or, if you are looking to use Atlas for experimentation, try the BerkeleyDB and Elasticsearch combination. Use the attached properties file (rename it to atlas-application.properties and drop it in the conf directory): atlas-application-berkeleyproperties.zip. Also try downloading the HDP sandbox if you wish to investigate the features of Atlas and test out the integrations with Ranger and Hive.
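For reference, the BerkeleyDB + Elasticsearch setup boils down to properties along these lines (illustrative values based on the defaults shipped with the berkeley-elasticsearch build profile; paths are examples):
atlas.graph.storage.backend=berkeleyje
atlas.graph.storage.directory=${sys:atlas.home}/data/berkeley
atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.directory=${sys:atlas.home}/data/es
atlas.graph.index.search.elasticsearch.client-only=false
atlas.graph.index.search.elasticsearch.local-mode=true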
11-13-2017
11:50 PM
@Arsalan Siddiqi Your observations are accurate. In fact, there is an initiative in progress to address this; I just don't have an ETA on when it will get done.
11-06-2017
06:16 PM
@Arsalan Siddiqi Can you please let us know roughly how much data you have?
11-02-2017
12:52 AM
1 Kudo
Can you tell me if you see this line in the logs? jetty-8.1.19.v20160209 (Server:272) From the log it appears that the classpaths of the old version and the new version may be getting mixed up, which is causing Jetty not to serve the pages correctly. Is it possible to try this: stop Atlas via Ambari, ensure that /usr/hdp/current/atlas-server/server/webapp/atlas.war exists, remove the directory /usr/hdp/current/atlas-server/server/webapp/atlas, and start Atlas via Ambari.
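Roughly, on the Atlas host, between stopping and starting Atlas in Ambari (a sketch; paths are from the HDP layout mentioned above):
ls /usr/hdp/current/atlas-server/server/webapp/atlas.war
rm -rf /usr/hdp/current/atlas-server/server/webapp/atlas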
10-31-2017
07:56 PM
Attached is a small sample of lineage in action. Use the following command to import the attached ZIP. You should see lineage for the only table present in the database. curl -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@./Stocks-2.zip "http://localhost:21000/api/atlas/admin/import" Please extract the ZIP to see the entity definition for hive_process. stocks-2.zip
10-31-2017
06:48 PM
You seem to be using v1 APIs. What version of Atlas are you working on? Please look at this link for V2 API usage. This has some attached JSONs and CURL calls that help with table creation. Hope this helps.
10-25-2017
04:17 AM
Unfortunately, I don't think we have one place with all the troubleshooting tips. Atlas & Ranger integrate via Kafka. I would look at the Kafka topic to see if messages are getting published by Atlas. Also, check the Ranger logs to see if there are any errors. Have you looked at this tutorial? It would help if you could describe your environment.
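One quick check on the Kafka side (a sketch; assumes Kafka's CLI tools are available and the default ATLAS_ENTITIES topic, which is the one Ranger tagsync reads):
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic ATLAS_ENTITIES --from-beginning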
10-23-2017
02:30 AM
@Arsalan Siddiqi I agree with @Vadim Vaks's suggestion on structuring the types. I will try out your JSONs today and see if I can get better insight into the lineage behavior with this setup.
10-13-2017
04:28 PM
@subash sharma Have you taken a look at the Hive hook code? I don't know if it will be sufficient for all your needs, but it should give you a good idea of how to approach the problem. Hope this helps.
10-09-2017
04:42 PM
If you could let us know these details, it would help in getting towards a solution: What are the deployment environment details? How was the build done? Information about Maven parameters would help. Is it an SSL or non-SSL environment? Is Kerberos used? When atlas_start.py is executed, do you see a new process getting created? If yes, what are the classpath details?
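For the last point, a quick way to check whether the process came up and what classpath it was given (a sketch; assumes a Linux host):
ps -ef | grep -i '[a]tlas'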
10-05-2017
05:49 PM
That is correct. The ZIP file has the output and the TXT is the input.
10-04-2017
08:38 PM
I copied the JSON contents from create-hive-table-entity.txt into a file entity-create.json. I then had to make one change: the hive_db guid present in it had to be replaced with the guid of an already existing database, so I replaced 90a7d3af-873a-4c10-a815-069f2d47d490 with 53ce4850-803e-457f-9f41-dfd01a761d9c. I used this curl command: curl -k -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" "https://localhost:21443/api/atlas/v2/entity" -d @../docs/entity-create.json With that, I was able to create the entities. Note that my instance of Atlas has SSL enabled. Can you please attempt this and let me know the results?
10-04-2017
08:18 PM
Personally, I have not tried this, but it is possible; AtlasByteType exists.
10-03-2017
10:05 PM
Can you please let me know the configuration of your cluster? How was the build deployed? Was it something you built and then copied over or was it pre-built?
10-03-2017
08:51 PM
Let me attempt to answer all your questions:
1. Think of composite as composition in the object-oriented design world. An entity is part of the dependent entity: the lifetime of the child (contained entity) is determined by its parent, and the child cannot exist without the parent.
2. I don't think there is a place where all of this is defined. We should definitely improve our documentation. Here's a link to the code where this is defined; it is not ideal, but better than nothing.
3. Relationships are a new concept. The implementation is only in master right now, and it is backward compatible. Here's what the concept is: so far (before relationships) we would define associations between two entities as containment, simply by referencing one entity from another (see the hive model in branch-0.8). Relationships allow you to capture this more comprehensively by modeling the association separately. This will be apparent if you simply compare the hive model in master and branch-0.8.
4. For defining entities with composition, you could use AtlasEntity.AtlasEntityWithExtInfo: pass the parent entity to the constructor and define the referenced entities as AtlasObjectIds. The creation API will take care of resolving the references. I am attaching a JSON for reference, along with the curl call for entity creation; a rough sketch also follows below. Also see these object diagrams.
Hope this helps. @Sarath Subramanian Thanks for your help in drafting this reply. hive-table-sample.zip create-hive-table-entity.txt
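To give a rough idea of how an AtlasObjectId reference looks on the wire, here is a minimal, hypothetical V2 create call where a table points at an existing hive_db by guid (the guid, names and attribute set are placeholders, not the attached sample; a real hive_table needs more attributes):
curl -u admin:admin -X POST -H "Content-Type: application/json" "http://localhost:21000/api/atlas/v2/entity" -d '{
  "entity": {
    "typeName": "hive_table",
    "attributes": {
      "name": "stocks",
      "qualifiedName": "default.stocks@cluster1",
      "db": { "typeName": "hive_db", "guid": "<guid-of-existing-hive_db>" }
    }
  }
}'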
09-29-2017
02:27 PM
@Arsalan Siddiqi Your observation about the JSON is accurate. The JSON you see in the sample is in the old format. We now use the new format, referred to as V2. The V2 format is easy to understand, as it is a JSON representation of the Java class, and it is much easier to code against than the earlier approach. I am attaching the atlas-application.properties file that I use for development in IntelliJ: atlas-applicationproperties.zip. Hope this helps.
09-27-2017
10:33 PM
@Arsalan Siddiqi Thanks for reaching out. If you could clarify a few items below, it would help: What is the purpose of the hook? Hooks are one of the ways to get data into Atlas; they are used in cases where the producer of data has a well-defined mechanism for sending notifications about its data, and Atlas leverages that. In your case, does your producer have this in place? If your producer does not have a good notification mechanism in place, you could consider writing a small application that enumerates the data and then uses Atlas' REST APIs to push it into Atlas. We use IntelliJ for development; there are a few setup steps needed if you want to use integrated debugging via IntelliJ, so let me know if that is the case. Attached are logs for a successful Atlas startup: applicationlog.zip
08-15-2017
05:40 PM
Is there a specific exception that is logged over and over again? If yes, can you please let me know what the exception is?
07-28-2017
02:26 PM
@Arsalan Siddiqi From the exceptions it appears that Atlas is having difficulty connecting to Kafka. From the web UI exception, it appears to be a build problem. Can you please confirm which profile you used for the build: is it BerkeleyDB & Elasticsearch, or HBase & Solr?