Created 06-15-2016 06:13 AM
After building Atlas, I only set ATLAS_HOME_DIR in atlas-env.sh; all other settings in atlas-env.sh and atlas-application.properties are the defaults.
I tried to import metadata according to http://atlas.apache.org/Bridge-Hive.html
After setting $HIVE_CONF_DIR, I found that I can't add the following configuration to atlas-application.properties:
<property>
  <name>atlas.cluster.name</name>
  <value>primary</value>
</property>
This is an XML-style block, but atlas-application.properties is not an XML file, so I can't add it there.
I am wondering if the official Atlas guide is inaccurate?
So I skipped this setting and ran import-hive.sh. It showed the following:
Exception in thread "main" org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.reflect.InvocationTargetException
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
NestedThrowables: java.lang.reflect.InvocationTargetException
Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "BONECP" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
Caused by: org.datanucleus.store.rdbms.connectionpool.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.
What should I do next in order to import metadata into Atlas?
Created 06-15-2016 06:56 AM
Looks like there is a typo in the documentation. The config block below should be added to hive-site.xml instead:
<property>
  <name>atlas.cluster.name</name>
  <value>primary</value>
</property>
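To make the placement concrete, the property goes inside the <configuration> element of hive-site.xml, roughly like this (the file will already contain your other Hive properties; this fragment only shows where the new block fits):

```xml
<configuration>
  <!-- ... your existing Hive properties ... -->
  <property>
    <name>atlas.cluster.name</name>
    <value>primary</value>
  </property>
</configuration>
```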
Also, the issue here is that the metastore client requires the "com.mysql.jdbc.Driver" class on the classpath. Can you please download the appropriate jar for that class (for example: http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.18/mysql-connector-java-5.1.18.jar) and place it under ATLAS_HOME/bridge/hive? This should fix the issue.
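A rough sketch of those steps as shell commands; ATLAS_HOME here is an assumed install location, so adjust it to your environment (the download line is commented out so you can review the URL first):

```shell
# Assumed install path -- change to match your Atlas deployment.
ATLAS_HOME=${ATLAS_HOME:-$HOME/apache-atlas}

# Make sure the bridge classpath directory exists.
mkdir -p "$ATLAS_HOME/bridge/hive"

# Fetch the MySQL JDBC driver and drop it on the bridge classpath
# (uncomment to actually download and copy):
# curl -LO http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.18/mysql-connector-java-5.1.18.jar
# cp mysql-connector-java-5.1.18.jar "$ATLAS_HOME/bridge/hive/"

echo "driver jar goes in: $ATLAS_HOME/bridge/hive"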
Let me know if you face any further issues after following the above steps. Happy to help!
-Ayub Khan
Created 06-25-2016 01:04 AM
@Ayub Pathan
These issues still exist.
First, I typed "hive" to enter the Hive CLI. When I typed "show tables;", it reported this error:
hive.exec.post.hooks Class not found:org.apache.atlas.hive.hook.HiveHook
Then I exported HIVE_AUX_JARS_PATH and started the hiveserver2 and metastore services with the commands "hiveserver2" and "hive --service metastore".
They reported this error:
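For reference, what I exported looks roughly like this; the directory names are abbreviations of my install paths, not the exact values:

```shell
# Assumed install paths -- substitute your own Hive/Atlas locations.
export HIVE_HOME=${HIVE_HOME:-$HOME/apache-hive}
export HIVE_CONF_DIR="$HIVE_HOME/conf"

# Point Hive at the Atlas hook jars so org.apache.atlas.hive.hook.HiveHook
# can be loaded by the CLI, hiveserver2, and the metastore.
export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH:-$HOME/apache-atlas/hook/hive}

echo "HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH"
```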
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/gson/GsonBuilder
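The NoClassDefFoundError suggests the Gson jar is not on the aux path. A sketch of how one might locate it under the Atlas install and append it (ATLAS_HOME is an assumed path, and the find result may be empty if the jar lives elsewhere):

```shell
# Assumed install path -- adjust to your Atlas deployment.
ATLAS_HOME=${ATLAS_HOME:-$HOME/apache-atlas}

# Search the install tree for a bundled Gson jar.
GSON_JAR=$(find "$ATLAS_HOME" -name 'gson-*.jar' 2>/dev/null | head -n 1)

# Append whatever was found to the Hive aux path (no-op if nothing matched).
export HIVE_AUX_JARS_PATH="${HIVE_AUX_JARS_PATH:+$HIVE_AUX_JARS_PATH:}$GSON_JAR"

echo "HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH"
```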
Also, the imported metadata has no lineage data. What can I do next?
My atlas-application.properties is shown below.
(Most of the settings are defaults that I never changed. Should I uncomment the atlas.lineage.*.*.* properties?)
######### Graph Database Configs #########

# Graph Storage
#atlas.graph.storage.backend=berkeleyje
#atlas.graph.storage.directory=${sys:atlas.home}/data/berkley

#Hbase as storage backend
atlas.graph.storage.backend=hbase
#For standalone mode, specify localhost
#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=localhost
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

#Solr
#atlas.graph.index.search.backend=solr

# Solr cloud mode properties
#atlas.graph.index.search.solr.mode=cloud
#atlas.graph.index.search.solr.zookeeper-url=localhost:2181

#Solr http mode properties
#atlas.graph.index.search.solr.mode=http
#atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr

# Graph Search Index
#ElasticSearch
atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.directory=${sys:atlas.home}/data/es
atlas.graph.index.search.elasticsearch.client-only=false
atlas.graph.index.search.elasticsearch.local-mode=true
atlas.graph.index.search.elasticsearch.create.sleep=2000

######### Notification Configs #########
atlas.notification.embedded=true
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=localhost:9026
atlas.kafka.bootstrap.servers=localhost:9027
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.auto.offset.reset=smallest
atlas.kafka.hook.group.id=atlas

######### Hive Lineage Configs #########
# This models reflects the base super types for Data and Process
#atlas.lineage.hive.table.type.name=DataSet
#atlas.lineage.hive.process.type.name=Process
#atlas.lineage.hive.process.inputs.name=inputs
#atlas.lineage.hive.process.outputs.name=outputs

## Schema
atlas.lineage.hive.table.schema.query.hive_table=hive_table where name='%s'\, columns
atlas.lineage.hive.table.schema.query.Table=Table where name='%s'\, columns

## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443

######### Security Properties #########

# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks

#following only required for 2-way SSL
#keystore.file=/path/to/keystore.jks

# Authentication config
# enabled: true or false
atlas.http.authentication.enabled=false
# type: simple or kerberos
atlas.http.authentication.type=simple

######### Server Properties #########
atlas.rest.address=http://localhost:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false

######### Entity Audit Configs #########
atlas.audit.hbase.tablename=ATLAS_ENTITY_AUDIT_EVENTS
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=localhost:2181

######### High Availability Configuration ########
atlas.server.ha.enabled=false
#### Enabled the configs below as per need if HA is enabled #####
#atlas.server.ids=id1
#atlas.server.address.id1=localhost:21000
#atlas.server.ha.zookeeper.connect=localhost:2181
#atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
#atlas.server.ha.zookeeper.num.retries=3
#atlas.server.ha.zookeeper.session.timeout.ms=20000

## if ACLs need to be set on the created nodes, uncomment these lines and set the values ##
#atlas.server.ha.zookeeper.acl=<scheme>:<id>
#atlas.server.ha.zookeeper.auth=<scheme>:<authinfo>

#### atlas.login.method {FILE,LDAP,AD} ####
atlas.login.method=FILE

### File path of users-credentials
atlas.login.credentials.file=${sys:atlas.home}/conf/users-credentials.properties