
Unable to Launch Apache Atlas (External HBase-Solr)

Contributor

I have followed http://atlas.apache.org/InstallationSteps.html to set up Atlas (with external HBase-Solr).

After starting ./atlas_start.py, no further log entries were generated after "AuditFilter initialization started".

Application log:

2017-11-24 18:09:41,153 INFO  - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingDslQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO  - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingFullTextQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,153 INFO  - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityDiscoveryService.searchUsingBasicQuery (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,176 INFO  - [main:] ~ GraphTransaction intercept for org.apache.atlas.discovery.EntityLineageService.getSchemaForHiveTableByGuid (GraphTransactionAdvisor$1:41)
2017-11-24 18:09:41,227 INFO  - [main:] ~ Starting service org.apache.atlas.web.service.ActiveInstanceElectorService (Services:53)
2017-11-24 18:09:41,227 INFO  - [main:] ~ HA is not enabled, no need to start leader election service (ActiveInstanceElectorService:96)
2017-11-24 18:09:41,228 INFO  - [main:] ~ Starting service org.apache.atlas.kafka.KafkaNotification (Services:53)
2017-11-24 18:09:41,228 INFO  - [main:] ~ Starting service org.apache.atlas.notification.NotificationHookConsumer (Services:53)
2017-11-24 18:09:41,228 INFO  - [main:] ~ HA is disabled, starting consumers inline. (NotificationHookConsumer:143)
2017-11-24 18:09:41,228 INFO  - [main:] ~ Consumer property: atlas.kafka.enable.auto.commit: null (KafkaNotification:275)
2017-11-24 18:09:41,346 WARN  - [main:] ~ The configuration hook.group.id = atlas was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,346 WARN  - [main:] ~ The configuration data = /home/ec2-user/sankar/atlas/distro/target/apache-atlas-1.0.0-SNAPSHOT-bin/apache-atlas-1.0.0-SNAPSHOT/data/kafka was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN  - [main:] ~ The configuration zookeeper.connection.timeout.ms = 200 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN  - [main:] ~ The configuration key.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN  - [main:] ~ The configuration zookeeper.session.timeout.ms = 400 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN  - [main:] ~ The configuration value.serializer = org.apache.kafka.common.serialization.StringSerializer was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN  - [main:] ~ The configuration zookeeper.connect = 10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,347 WARN  - [main:] ~ The configuration zookeeper.sync.time.ms = 20 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,348 WARN  - [main:] ~ The configuration poll.timeout.ms = 1000 was supplied but isn't a known config. (AbstractConfig:186)
2017-11-24 18:09:41,416 INFO  - [main:] ~ Starting service org.apache.atlas.repository.audit.HBaseBasedAuditRepository (Services:53)
2017-11-24 18:09:41,417 INFO  - [NotificationHookConsumer thread-0:] ~ [atlas-hook-consumer-thread], Starting  (Logging$class:68)
2017-11-24 18:09:41,418 INFO  - [NotificationHookConsumer thread-0:] ~ ==> HookConsumer doWork() (NotificationHookConsumer$HookConsumer:305)
2017-11-24 18:09:41,418 INFO  - [NotificationHookConsumer thread-0:] ~ Atlas Server is ready, can start reading Kafka events. (NotificationHookConsumer$HookConsumer:508)
2017-11-24 18:09:41,437 INFO  - [main:] ~ HA is disabled. Hence creating table on startup. (HBaseBasedAuditRepository:384)
2017-11-24 18:09:41,438 INFO  - [main:] ~ Checking if table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:343)
2017-11-24 18:09:41,447 INFO  - [main:] ~ Table apache_atlas_entity_audit exists (HBaseBasedAuditRepository:355)
2017-11-24 18:09:41,835 INFO  - [main:] ~ AuditFilter initialization started (AuditFilter:57)

Config file:

# Graph Database
#Configures the graph database to use.  Defaults to JanusGraph 0.1.1
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase
# Graph Storage
atlas.graph.storage.backend=hbase
atlas.graph.storage.hbase.table=apache_atlas_titan
#Hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=10.115.80.165,10.115.80.168,10.115.80.97
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000
# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true
# Delete handler
#
# This allows the default behavior of doing "soft" deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.graph.SoftDeleteHandler - all deletes are "soft" deletes
# org.apache.atlas.repository.graph.HardDeleteHandler - all deletes are "hard" deletes
#
#atlas.DeleteHandler.impl=org.apache.atlas.repository.graph.SoftDeleteHandler
# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
#atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.NoopEntityAuditRepository
#org.apache.atlas.repository.audit.HBaseBasedAuditRepository
# Graph Search Index
atlas.graph.index.search.backend=solr
#Solr
#Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
#Solr http mode properties
atlas.graph.index.search.solr.mode=http
atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr
# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150
#########  Notification Configs  #########
atlas.notification.embedded=false
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181
#localhost:9026
atlas.kafka.bootstrap.servers=10.115.80.165:9092
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.connection.timeout.ms=200
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.poll.timeout.ms=1000
atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000
# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
#########  Hive Lineage Configs  #########
## Schema
atlas.lineage.schema.query.hive_table=hive_table where __guid='%s'\, columns
atlas.lineage.schema.query.Table=Table where __guid='%s'\, columns
## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443
#########  Security Properties  #########
# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks
#########  Server Properties  #########
atlas.rest.address=http://10.115.80.165:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false
#########  Entity Audit Configs  #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=10.115.80.165:2181,10.115.80.168:2181,10.115.80.97:2181

I tried to hit the application URL (say, http://localhost:21000). It throws an error.

wget http://10.115.80.165:21000 --no-proxy
--2017-11-24 17:54:05--  http://10.x.x.x:21000/
Connecting to 10.x.x.x:21000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.

I validated that the port is in use:

netstat -tunlp | grep 21000 
tcp        0      0 0.0.0.0:21000           0.0.0.0:*               LISTEN      35415/java
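
Since the process is listening but not responding, a thread dump might show where startup is blocked. A minimal sketch, assuming the JDK's jstack tool is available (35415 is the Java PID from the netstat output above):

# Dump all thread stacks of the hung JVM and inspect the "main" thread,
# which should show the call it is blocked in (e.g., HBase or Solr I/O)
jstack 35415 > atlas-threads.txt
grep -A 20 '"main"' atlas-threads.txt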

I have no idea how to proceed.

3 REPLIES

Expert Contributor

Thanks for reaching out.

If you are building from master, may I suggest that you try branch-0.8 instead? Master has been updated to work with the next version of Hadoop; it should become more stable in the coming weeks.
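
For example, a fresh clone of that branch (using the repository URL from the script later in this thread) would look like:

# Clone the branch-0.8 line instead of master
git clone https://github.com/apache/atlas.git -b branch-0.8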

Also, it would be great if you could attach the entire log.

We use these settings for Solr:

atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=localhost.localdomain:2181/infra-solr
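
As a quick sanity check that the Atlas index collections exist in SolrCloud, the standard Solr Collections API can be queried (a sketch; adjust the host and port to your Solr instance):

# List collections; Atlas expects vertex_index, edge_index and fulltext_index
curl "http://localhost:8983/solr/admin/collections?action=LIST"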

If the entire startup sequence runs correctly, you should see this line in your log:

2017-10-30 17:14:04,173 INFO  - [main:] ~ Started SelectChannelConnector@0.0.0.0:21000 (AbstractConnector:338)

Hope this helps.

Expert Contributor
@Sankaranarayanan S

I spent the day yesterday trying to address this. Eventually we were able to get to a point where this configuration works, only it needs a few manual steps.

Below are the contents of the script that we discussed on the mailing list. It would be great if you could try it and let us know whether it works for you.

One build parameter we are currently using is titan0 instead of JanusGraph.

ATLAS_SOURCE_DIR=/tmp/atlas-source
ATLAS_HOME=/tmp/atlas-bin

# Clone Apache Atlas sources
mkdir -p ${ATLAS_SOURCE_DIR}
cd ${ATLAS_SOURCE_DIR}
git clone https://github.com/apache/atlas.git -b master

# Build Apache Atlas
cd atlas
mvn clean -DskipTests -DGRAPH_PROVIDER=titan0 install -Pdist,embedded-hbase-solr,graph-provider-titan0
# Install Apache Atlas
mkdir -p ${ATLAS_HOME}
tar xfz distro/target/apache-atlas-*bin.tar.gz --strip-components 1 -C ${ATLAS_HOME}

# Setup environment and configuration
export MANAGE_LOCAL_HBASE=true
export MANAGE_LOCAL_SOLR=true
export PATH=${PATH}:/tmp/atlas-bin/bin

echo atlas.graphdb.backend=org.apache.atlas.repository.graphdb.titan0.Titan0GraphDatabase >> ${ATLAS_HOME}/conf/atlas-application.properties

# Start Apache Atlas
atlas_start.py

# Access Apache Atlas at http://localhost:21000
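
As a quick way to confirm the server actually finished starting, the Atlas version endpoint can be polled (a sketch, assuming the default admin/admin credentials):

# Returns version information as JSON once the web server is fully up
curl -u admin:admin http://localhost:21000/api/atlas/admin/version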

Expert Contributor
@Sankaranarayanan S

We committed a fix yesterday that makes the steps above unnecessary.

I would urge you to attempt the build again. With this fix, you can use the default graph database (JanusGraph).

mvn clean install -DskipTests -Pdist,embedded-hbase-solr

Once the build is done, uncompress the tar produced in the distro directory and then run bin/atlas_start.py.

The initial startup takes about 5 minutes because Solr initializes its indexes. Subsequent starts are quicker.
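
To watch for the connector line mentioned in the earlier reply, something like this works (assuming the default logs/ directory of the distribution):

# Follow the application log until Jetty reports the 21000 listener
tail -f logs/application.log | grep --line-buffered "Started SelectChannelConnector"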

Hope this helps.