Created 06-27-2016 04:37 AM
I want to use the Hive Hook to import metadata automatically. So I set up hive-site.xml, exported HIVE_AUX_JARS_PATH, and copied atlas-application.properties into the Hive conf directory, following the official Atlas guide: http://atlas.apache.org/Bridge-Hive.html.
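For reference, my setup roughly amounts to the following. The paths are examples from my own source-built layout, not from the guide, so adjust them to your installation:

```shell
# Hypothetical paths for a source-built Atlas; adjust to your layout.
export ATLAS_HOME=/usr/local/apache-atlas-0.7-incubating-SNAPSHOT
export HIVE_CONF_DIR=/usr/local/hive/conf

# Hive picks up the Atlas hook jars from the aux path.
export HIVE_AUX_JARS_PATH=$ATLAS_HOME/hook/hive

# Hive also needs the Atlas client settings on its conf path
# (guarded so the sketch is harmless if the file does not exist yet).
if [ -f "$ATLAS_HOME/conf/atlas-application.properties" ]; then
  cp "$ATLAS_HOME/conf/atlas-application.properties" "$HIVE_CONF_DIR/"
fi

echo "aux path: $HIVE_AUX_JARS_PATH"
```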
But when I entered the Hive CLI and typed "show tables;" or other commands, it showed:
NoClassDefFoundError: com/google/gson/GsonBuilder
I want to know how to solve this.
In my <atlas-conf>/atlas-application.properties, most of the settings are defaults; I have never changed them. The file is as follows:
######### Graph Database Configs #########

# Graph Storage
#atlas.graph.storage.backend=berkeleyje
#atlas.graph.storage.directory=${sys:atlas.home}/data/berkley

# HBase as storage backend
atlas.graph.storage.backend=hbase
# For standalone mode, specify localhost
# For distributed mode, specify the zookeeper quorum here - for more information refer
# http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=localhost
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

# Solr
#atlas.graph.index.search.backend=solr

# Solr cloud mode properties
#atlas.graph.index.search.solr.mode=cloud
#atlas.graph.index.search.solr.zookeeper-url=localhost:2181

# Solr http mode properties
#atlas.graph.index.search.solr.mode=http
#atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr

# Graph Search Index - ElasticSearch
atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.directory=${sys:atlas.home}/data/es
atlas.graph.index.search.elasticsearch.client-only=false
atlas.graph.index.search.elasticsearch.local-mode=true
atlas.graph.index.search.elasticsearch.create.sleep=2000

######### Notification Configs #########
atlas.notification.embedded=true
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=localhost:9026
atlas.kafka.bootstrap.servers=localhost:9027
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.auto.offset.reset=smallest
atlas.kafka.hook.group.id=atlas

######### Hive Lineage Configs #########
# This model reflects the base super types for Data and Process
#atlas.lineage.hive.table.type.name=DataSet
#atlas.lineage.hive.process.type.name=Process
#atlas.lineage.hive.process.inputs.name=inputs
#atlas.lineage.hive.process.outputs.name=outputs

## Schema
atlas.lineage.hive.table.schema.query.hive_table=hive_table where name='%s'\, columns
atlas.lineage.hive.table.schema.query.Table=Table where name='%s'\, columns

## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443

######### Security Properties #########

# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks

# following only required for 2-way SSL
#keystore.file=/path/to/keystore.jks

# Authentication config
# enabled: true or false
atlas.http.authentication.enabled=false
# type: simple or kerberos
atlas.http.authentication.type=simple

######### Server Properties #########
atlas.rest.address=http://localhost:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false

######### Entity Audit Configs #########
atlas.audit.hbase.tablename=ATLAS_ENTITY_AUDIT_EVENTS
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=localhost:2181

######### High Availability Configuration ########
atlas.server.ha.enabled=false
#### Enable the configs below as per need if HA is enabled ####
#atlas.server.ids=id1
#atlas.server.address.id1=localhost:21000
#atlas.server.ha.zookeeper.connect=localhost:2181
#atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
#atlas.server.ha.zookeeper.num.retries=3
#atlas.server.ha.zookeeper.session.timeout.ms=20000

## if ACLs need to be set on the created nodes, uncomment these lines and set the values ##
#atlas.server.ha.zookeeper.acl=<scheme>:<id>
#atlas.server.ha.zookeeper.auth=<scheme>:<authinfo>

#### atlas.login.method {FILE,LDAP,AD} ####
atlas.login.method=FILE

### File path of users-credentials
atlas.login.credentials.file=${sys:atlas.home}/conf/users-credentials.properties
Finally, I noticed that there are some settings shown in the official guide:
atlas.hook.hive.synchronous - boolean, true to run the hook synchronously; default false
atlas.hook.hive.numRetries - number of retries for notification failure; default 3
atlas.hook.hive.minThreads - core number of threads; default 5
atlas.hook.hive.maxThreads - maximum number of threads; default 5
atlas.hook.hive.keepAliveTime - keep alive time in msecs; default 10
atlas.hook.hive.queueSize - queue size for the threadpool; default 10000
Should I add these settings into atlas-application.properties? And should I start HiveServer2 and the Hive metastore service?
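If adding them is appropriate, I assume the overrides would look like this in atlas-application.properties (the values below are just the defaults documented in the guide):

```properties
######### Hive Hook Configs (optional overrides; values are the defaults) #########
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.minThreads=5
atlas.hook.hive.maxThreads=5
atlas.hook.hive.keepAliveTime=10
atlas.hook.hive.queueSize=10000
```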
Created 06-27-2016 04:44 AM
The issue seems to be a missing gson.jar in the AUX path. Please check, and you can download the gson jar from the link below:
http://www.java2s.com/Code/Jar/g/Downloadgson222jar.htm
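Alternatively, before downloading anything, it is worth checking whether a gson jar is already on the machine. A quick sketch (the paths below are examples and may differ on your machine):

```shell
# The aux path Hive loads hook jars from (hypothetical default if unset).
AUX_PATH=${HIVE_AUX_JARS_PATH:-/usr/local/apache-atlas-0.7-incubating-SNAPSHOT/hook/hive}

# Look for an existing gson jar under common lib directories.
find /usr/local /usr/hdp -name 'gson-*.jar' 2>/dev/null || true

# If one turns up, copy it into the aux path instead of downloading, e.g.:
#   cp /path/to/gson-2.2.2.jar "$AUX_PATH/"
echo "aux path checked: $AUX_PATH"
```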
Hope this helps.
Thanks and Regards,
Sindhu
Created 06-27-2016 11:39 AM
Have you run import-hive.sh? Where were the tables created from?
Created 06-29-2016 05:13 AM
The tables were created in the Hive CLI, and I ran import-hive.sh from the $HIVE_HOME/bin directory.
But there is another, more important question:
In the link above, the result showed an error:
File does not exist: hdfs://localhost:9000/usr/local/data-governance/apache-atlas-0.7-incubating-SNAPSHOT/hook/hive/atlas-client-0.7-incubating-SNAPSHOT.jar
Could you help me find the file atlas-client-0.7-incubating-SNAPSHOT.jar?
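From what I can tell, import-hive.sh seems to resolve the hook directory against the default filesystem (HDFS) rather than the local disk. If the jar does exist locally under the hook/hive directory, one workaround I am considering is mirroring that directory into HDFS at the same path. A sketch, using the path from the error message (the hdfs call is guarded so it only runs where the CLI exists):

```shell
# Path taken from the error message above; the jar should exist here locally.
ATLAS_HOOK_DIR=/usr/local/data-governance/apache-atlas-0.7-incubating-SNAPSHOT/hook/hive

if command -v hdfs >/dev/null 2>&1; then
  # Mirror the local hook jars into HDFS at the same absolute path.
  hdfs dfs -mkdir -p "$ATLAS_HOOK_DIR"
  hdfs dfs -put -f "$ATLAS_HOOK_DIR"/*.jar "$ATLAS_HOOK_DIR"/
else
  echo "hdfs CLI not found; run this on the cluster node"
fi
```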
Created 06-27-2016 04:51 AM
I think 2.3.1 is the latest gson version: gsonlink
Created 06-27-2016 10:52 AM
Thank you very much. After I downloaded the gson jar, it works.
But I am wondering why this jar was missing. Did the problem happen during the Maven build of Atlas?
Also, after importing the metadata, the Atlas web UI shows that there is no lineage data. Should I run HiveServer2?
Created 06-27-2016 06:36 AM
Ideally you should find that jar already available in your HDP installation, at locations like:
/usr/hdp/2.4.2.0-258/hadoop/lib/gson-2.2.4.jar
/usr/hdp/2.4.2.0-258/hadoop/client/gson-2.2.4.jar
So it would be good either to use the jar that ships with the HDP installation, or to check your classpath to see why that jar is not added to your process.
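As a rough way to check, you can split the process classpath on ':' and search it for the jar. The classpath string below is only a sample standing in for the real one; in practice you would take it from the running Hive process (e.g. via ps -ef | grep Hive):

```shell
# Sample classpath; substitute the real one from your Hive process.
SAMPLE_CLASSPATH="/usr/hdp/2.4.2.0-258/hadoop/lib/gson-2.2.4.jar:/usr/local/hive/lib/hive-exec.jar"

# Split on ':' and look for a gson entry.
echo "$SAMPLE_CLASSPATH" | tr ':' '\n' | grep 'gson'
```

If grep prints nothing, the jar never made it onto the classpath, which matches the NoClassDefFoundError.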
Created 06-27-2016 10:19 AM
I haven't downloaded HDP. Because I want to learn how to configure and use Atlas on its own, I compiled Atlas 0.7 with Maven and downloaded Hive separately.
So I am wondering how to solve this other than downloading the jar manually. Or is the reason that the Maven build didn't include this jar?