Created 11-24-2017 08:04 PM
I followed http://atlas.apache.org/InstallationSteps.html to set up Atlas (with external HBase and Solr).
After starting ./atlas_start.py, no further logs were generated after "AuditFilter initialization started".
Application log:
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingDslQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingFullTextQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingBasicQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,176 INFO - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityLineageService.getSchemaForHiveTableByGuid (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,227 INFO - [main:] ~ Starting service org.apache.atlas.web.service.ActiveInstanceElectorService (Services:53)
2017-11-24 18:09:41,227 INFO - [main:] ~ HA is not enabled, no need to start leader election service (ActiveInstanceElectorService:96)
2017-11-24 18:09:41,228 INFO - [main:] ~ Starting service org.apache.atlas.kafka.KafkaNotification (Services:53)
2017-11-24 18:09:41,228 INFO - [main:] ~ Starting service org.apache.atlas.notification.NotificationHookConsumer (Services:53)
2017-11-24 18:09:41,228 INFO - [main:] ~ HA is disabled, starting consumers inline. (NotificationHookConsumer:143)
2017-11-24 18:09:41,228 INFO - [main:] ~ Consumer property: atlas.kafka.enable.auto.commit: null (KafkaNotification:275)
2017-11-24 18:09:41,346 WARN - [main:] ~ The configuration hook.group.id = atlas was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,346 WARN - [main:] ~ The configuration data = /home/ec2-user/sankar/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/data/kafka was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.connection.timeout.ms = 200 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration key.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.session.timeout.ms = 400 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration value.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.connect = 10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN - [main:] ~ The configuration zookeeper.sync.time.ms = 20 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,348 WARN - [main:] ~ The configuration poll.timeout.ms = 1000 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,416 INFO - [main:] ~ Starting service org.apache.atlas.repository.audit.HBaseBasedAuditRepository (Services:53)
2017-11-24 18:09:41,417 INFO - [NotificationHookConsumer thread-0:] ~ [atlas-hook-consumer-thread], Starting (Logging$class:68)
2017-11-24 18:09:41,418 INFO - [NotificationHookConsumer thread-0:] ~ ==> HookConsumer doWork() (NotificationHookConsumer$HookConsumer:305)
2017-11-24 18:09:41,418 INFO - [NotificationHookConsumer thread-0:] ~ Atlas Server is ready, can start reading Kafka events. (NotificationHookConsumer$HookConsumer:508)
2017-11-24 18:09:41,437 INFO - [main:] ~ HA is disabled. Hence creating table on startup. (HBaseBasedAuditRepository:384)
2017-11-24 18:09:41,438 INFO - [main:] ~ Checking if table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:343)
2017-11-24 18:09:41,447 INFO - [main:] ~ Table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:355)
2017-11-24 18:09:41,835 INFO - [main:] ~ AuditFilter initialization started (AuditFilter:57)
Config file:
# Graph Database
#Configures the graph database to use. Defaults to JanusGraph 0.1.1
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase

# Graph Storage
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=apache_atlas_titan

#Hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=10.115.80.165,10.115.80.168,10.115.80.97
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true

# Delete handler
#
# This allows the default behavior of doing "soft" deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.graph.SoftDeleteHandler - all deletes are "soft" deletes
# org.apache.atlas.repository.graph.HardDeleteHandler - all deletes are "hard" deletes
#
#atlas.DeleteHandler.impl=org.apache.atlas.repository.graph.SoftDeleteHandler

# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
#atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.NoopEntityAuditRepository
#org.apache.atlas.repository.audit.HBaseBasedAuditRepository

# Graph Search Index
atlas.graph.index.search.backend=solr

#Solr
#Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000

#Solr http mode properties
atlas.graph.index.search.solr.mode=http
atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr

# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150

######### Notification Configs #########
atlas.notification.embedded=false
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
#localhost:9026
atlas.kafka.bootstrap.servers=10.115.80.165:9092
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.connection.timeout.ms=200
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.poll.timeout.ms=1000
atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000

# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab

######### Hive Lineage Configs #########
## Schema
atlas.lineage.schema.query.hive_table=hive_table where __guid='%s'\, columns
atlas.lineage.schema.query.Table=Table where __guid='%s'\, columns

## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443

######### Security Properties #########
# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks

######### Server Properties #########
atlas.rest.address=http://10.115.80.165:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false

######### Entity Audit Configs #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
I tried to hit the application URL (say, http://localhost:21000), and it throws an error.
wget http://10.115.80.165:21000 --no-proxy
--2017-11-24 17:54:05-- http://10.x.x.x:21000/
Connecting to 10.x.x.x:21000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.
I validated the port in use with:
netstat -tunlp | grep 21000
tcp 0 0 0.0.0.0:21000 0.0.0.0:* LISTEN 35415/java
I have no idea how to proceed.
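For reference, the symptom above (netstat shows the port in LISTEN, but wget gets no data) can be narrowed down with a small probe that checks whether the server returns a complete HTTP response, not just a TCP accept. This is only a sketch: the demo runs against a throwaway local server so it is self-contained; to probe a real instance, pass http://&lt;atlas-host&gt;:21000/ instead.

```python
import urllib.request
import urllib.error

def http_ready(url, timeout=5.0):
    """True only if the server returns a complete HTTP response --
    a LISTEN entry in netstat only proves the TCP accept, not this."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False

# Demo against a throwaway local server so the sketch runs anywhere;
# for a real check, point it at http://<atlas-host>:21000/ instead.
import http.server
import threading

srv = http.server.HTTPServer(("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler)
threading.Thread(target=srv.serve_forever, daemon=True).start()
print(http_ready("http://127.0.0.1:%d/" % srv.server_port))  # True
srv.shutdown()
```

If this returns False while netstat still shows LISTEN, the JVM owns the socket but the web application has not finished initializing, which matches "No data received" from wget.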
Created 11-25-2017 09:43 PM
Thanks for reaching out.
If you are building from master, may I suggest that you try branch-0.8. Master has been updated to work with the next version of Hadoop, and it should become more stable in the coming weeks.
Also, it would be great if you could attach the entire log.
We use these settings for Solr:
atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=localhost.localdomain:2181/infra-solr
If the entire startup sequence runs correctly, you should see this line in your log:
2017-10-30 17:14:04,173 INFO - [main:] ~ Started SelectChannelConnector@0.0.0.0:21000 (AbstractConnector:338)
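A quick programmatic way to check for that line is a sketch like the one below. The log path is an assumption: embedded installs usually write logs/application.log under the Atlas install directory, so adjust it for your setup.

```python
from pathlib import Path

def web_server_started(log_path):
    """True if the Jetty connector start line appears in the Atlas application log."""
    log = Path(log_path)
    if not log.exists():
        return False
    return "Started SelectChannelConnector" in log.read_text(errors="replace")

# Assumed log location; adjust for your installation.
print(web_server_started("logs/application.log"))
```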
Hope this helps.
Created 11-28-2017 05:19 PM
I spent the day yesterday trying to address this. Eventually we got to a point where this configuration works, though it needs a few manual steps.
Below are the contents of the script that we discussed on the mailing list. It would be great if you could try it and let us know whether it works for you.
One build parameter we are currently using is titan0 instead of JanusGraph.
ATLAS_SOURCE_DIR=/tmp/atlas-source
ATLAS_HOME=/tmp/atlas-bin

# Clone Apache Atlas sources
mkdir -p ${ATLAS_SOURCE_DIR}
cd ${ATLAS_SOURCE_DIR}
git clone https://github.com/apache/atlas.git -b master

# build Apache Atlas
cd atlas
mvn clean -DskipTests -DGRAPH_PROVIDER=titan0 install -Pdist,embedded-hbase-solr,graph-provider-titan0

# Install Apache Atlas
mkdir -p ${ATLAS_HOME}
tar xfz distro/target/apache-atlas-*bin.tar.gz --strip-components 1 -C ${ATLAS_HOME}

# Setup environment and configuration
export MANAGE_LOCAL_HBASE=true
export MANAGE_LOCAL_SOLR=true
export PATH=${PATH}:/tmp/atlas-bin/bin
echo atlas.graphdb.backend=org.apache.atlas.repository.graphdb.titan0.Titan0GraphDatabase >> ${ATLAS_HOME}/conf/atlas-application.properties

# Start Apache Atlas
atlas_start.py

# Access Apache Atlas at http://localhost:21000
Created 12-01-2017 06:47 PM
We committed a fix yesterday that makes the steps above unnecessary.
I would urge you to attempt a build again. With this fix, you can use the default graph (JanusGraph).
mvn clean install -DskipTests -Pdist,embedded-hbase-solr
Once the build is done, uncompress the tar produced in the distro directory and then run bin/atlas_start.py.
The initial startup takes about 5 minutes because Solr initializes its indexes. Subsequent starts are quicker.
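Since the first startup can take several minutes, a small polling sketch can tell you when the port finally comes up instead of hitting it manually. The demo opens its own local listener so it is self-contained; for a real run, call wait_for_port("&lt;atlas-host&gt;", 21000) with the default timeout.

```python
import socket
import time

def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll until (host, port) accepts TCP connections or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False

# Demo against a listener we open ourselves so the sketch runs anywhere;
# for a real run, call wait_for_port("<atlas-host>", 21000).
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # OS picks a free port
listener.listen(1)
print(wait_for_port("127.0.0.1", listener.getsockname()[1], timeout=5.0))  # True
listener.close()
```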
Hope this helps.