Hive Hook Error: NoClassDefFoundError: com/google/gson/GsonBuilder


I want to use the Hive Hook to import metadata automatically. So I set up hive-site.xml, exported HIVE_AUX_JARS_PATH, and copied atlas-application.properties into the Hive conf directory, following the official Atlas guide: http://atlas.apache.org/Bridge-Hive.html.
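For reference, my setup steps look roughly like the following. The paths are my assumptions about a typical source build, so adjust them to your install:

```shell
# Assumed install locations (adjust to your environment)
export ATLAS_HOME=/usr/local/apache-atlas-0.7-incubating-SNAPSHOT
export HIVE_HOME=/usr/local/hive

# Hook jars must be on Hive's aux path before starting the CLI
export HIVE_AUX_JARS_PATH="$ATLAS_HOME/hook/hive"

# Atlas client config must be visible to Hive
cp "$ATLAS_HOME/conf/atlas-application.properties" "$HIVE_HOME/conf/"

# hive-site.xml additionally needs the post-execution hook:
#   <property>
#     <name>hive.exec.post.hooks</name>
#     <value>org.apache.atlas.hive.hook.HiveHook</value>
#   </property>
```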

But when I entered the Hive CLI and ran "show tables;" (or any other command), it failed with:

NoClassDefFoundError: com/google/gson/GsonBuilder

I want to know how to fix this.

In my <atlas-conf>/atlas-application.properties, most of the settings are defaults; I never changed them. The file is as follows:

#########  Graph Database Configs  #########
# Graph Storage
#atlas.graph.storage.backend=berkeleyje
#atlas.graph.storage.directory=${sys:atlas.home}/data/berkley

#HBase as storage backend
atlas.graph.storage.backend=hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=localhost
atlas.graph.storage.hbase.regions-per-server=1
atlas.graph.storage.lock.wait-time=10000

#Solr
#atlas.graph.index.search.backend=solr

# Solr cloud mode properties
#atlas.graph.index.search.solr.mode=cloud
#atlas.graph.index.search.solr.zookeeper-url=localhost:2181

#Solr http mode properties
#atlas.graph.index.search.solr.mode=http
#atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr

# Graph Search Index
#ElasticSearch
atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.directory=${sys:atlas.home}/data/es
atlas.graph.index.search.elasticsearch.client-only=false
atlas.graph.index.search.elasticsearch.local-mode=true
atlas.graph.index.search.elasticsearch.create.sleep=2000


#########  Notification Configs  #########
atlas.notification.embedded=true
atlas.kafka.data=${sys:atlas.home}/data/kafka
atlas.kafka.zookeeper.connect=localhost:9026
atlas.kafka.bootstrap.servers=localhost:9027
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.auto.offset.reset=smallest
atlas.kafka.hook.group.id=atlas


#########  Hive Lineage Configs  #########
# This models reflects the base super types for Data and Process
#atlas.lineage.hive.table.type.name=DataSet
#atlas.lineage.hive.process.type.name=Process
#atlas.lineage.hive.process.inputs.name=inputs
#atlas.lineage.hive.process.outputs.name=outputs

## Schema
atlas.lineage.hive.table.schema.query.hive_table=hive_table where name='%s'\, columns
atlas.lineage.hive.table.schema.query.Table=Table where name='%s'\, columns

## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443

#########  Security Properties  #########

# SSL config
atlas.enableTLS=false

#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks

#following only required for 2-way SSL
#keystore.file=/path/to/keystore.jks

# Authentication config

# enabled:  true or false
atlas.http.authentication.enabled=false
# type:  simple or kerberos
atlas.http.authentication.type=simple

#########  Server Properties  #########
atlas.rest.address=http://localhost:21000
# If enabled and set to true, this will run setup steps when the server starts
#atlas.server.run.setup.on.start=false

#########  Entity Audit Configs  #########
atlas.audit.hbase.tablename=ATLAS_ENTITY_AUDIT_EVENTS
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=localhost:2181

#########  High Availability Configuration ########
atlas.server.ha.enabled=false
#### Enabled the configs below as per need if HA is enabled #####
#atlas.server.ids=id1
#atlas.server.address.id1=localhost:21000
#atlas.server.ha.zookeeper.connect=localhost:2181
#atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
#atlas.server.ha.zookeeper.num.retries=3
#atlas.server.ha.zookeeper.session.timeout.ms=20000
## if ACLs need to be set on the created nodes, uncomment these lines and set the values ##
#atlas.server.ha.zookeeper.acl=<scheme>:<id>
#atlas.server.ha.zookeeper.auth=<scheme>:<authinfo>


#### atlas.login.method {FILE,LDAP,AD} ####
atlas.login.method=FILE

### File path of users-credentials
atlas.login.credentials.file=${sys:atlas.home}/conf/users-credentials.properties

Lastly, I noticed these settings in the official guide:

atlas.hook.hive.synchronous - boolean, true to run the hook synchronously. default false
atlas.hook.hive.numRetries - number of retries for notification failure. default 3
atlas.hook.hive.minThreads - core number of threads. default 5
atlas.hook.hive.maxThreads - maximum number of threads. default 5
atlas.hook.hive.keepAliveTime - keep alive time in msecs. default 10
atlas.hook.hive.queueSize - queue size for the threadpool. default 10000
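For reference, if I added these to atlas-application.properties they would look like the following (the values shown are just the documented defaults):

```properties
# Optional Hive hook tuning (defaults from the Atlas guide)
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.minThreads=5
atlas.hook.hive.maxThreads=5
atlas.hook.hive.keepAliveTime=10
atlas.hook.hive.queueSize=10000
```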

Should I add these settings to atlas-application.properties? And do I need to start HiveServer2 and the Hive metastore service?

1 ACCEPTED SOLUTION


@Ethan Hsieh

The issue seems to be a missing gson jar in the AUX path. Please check, and download the gson jar from the link below:

http://www.java2s.com/Code/Jar/g/Downloadgson222jar.htm
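Once downloaded, the jar has to end up on the path the Hive CLI actually loads. A minimal sketch, assuming Atlas lives under $ATLAS_HOME and the jar was saved to the home directory (adjust both paths to your install):

```shell
# Put the downloaded gson jar next to the other hook jars
cp ~/gson-2.2.2.jar "$ATLAS_HOME/hook/hive/"

# Make sure the hook directory is on Hive's aux path before starting the CLI
export HIVE_AUX_JARS_PATH="$ATLAS_HOME/hook/hive"

# Optional sanity check: the class the error complains about should be in the jar
unzip -l "$ATLAS_HOME/hook/hive/gson-2.2.2.jar" | grep GsonBuilder
```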

Hope this helps.

Thanks and Regards,

Sindhu


8 REPLIES



@Sindhu

Thank you very much.

After I imported the metadata from Hive into Atlas, the Atlas Web UI showed "No lineage data was found".

In general, it should show the lineage for the Hive tables.

Should I run HiveServer2, or change atlas-application.properties?
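For what it's worth, to my understanding the hook only emits lineage for queries that derive one dataset from another; import-hive.sh and plain CREATE TABLE statements register entities without any lineage edges. Something like the following, run through the Hive CLI with the hook enabled (table names here are made up), would be expected to produce lineage:

```shell
# Hypothetical example: lineage in Atlas comes from queries that derive one
# table from another (CTAS / INSERT ... SELECT) run with the hook enabled;
# import-hive.sh and plain CREATE TABLE only register entities, no lineage.
hive -e "CREATE TABLE sales_summary AS SELECT region, SUM(amount) AS total FROM sales GROUP BY region;"
```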

@Ethan Hsieh

Have you run import-hive.sh? From where were the tables created?
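For reference, the bridge script is typically run from the Atlas package with the Hive configuration exported; a sketch, with assumed paths (the script's location can vary by Atlas version):

```shell
# Assumed locations; adjust to your build/install
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR="$HIVE_HOME/conf"
export ATLAS_HOME=/usr/local/apache-atlas-0.7-incubating-SNAPSHOT

# Imports existing Hive metadata (databases, tables, columns) into Atlas
"$ATLAS_HOME/bin/import-hive.sh"
```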


@Sindhu

The tables were created in the Hive CLI, and I ran import-hive.sh from $HIVE_HOME/bin.

But there is another, more important question:

https://community.hortonworks.com/questions/41898/using-hive-hook-file-does-not-exist-atlas-client-0...

In the link above, the result showed an error:

 File does not exist: hdfs://localhost:9000/usr/local/data-governance/apache-atlas-0.7-incubating-SNAPSHOT/hook/hive/atlas-client-0.7-incubating-SNAPSHOT.jar

Could you help me find the file atlas-client-0.7-incubating-SNAPSHOT.jar?

@Ethan Hsieh

I think gson 2.3.1 is the latest version.


@Divakar Annapureddy

Thank you very much. After I downloaded the gson jar, it works.

But I am wondering why this jar was missing. Did this happen during the Maven build of Atlas?

And after importing the metadata, the Atlas Web UI shows there is no lineage data. Should I run HiveServer2?


@Ethan Hsieh

Ideally you should find that jar already available in your HDP installation, at locations like the following:

/usr/hdp/2.4.2.0-258/hadoop/lib/gson-2.2.4.jar 
/usr/hdp/2.4.2.0-258/hadoop/client/gson-2.2.4.jar

So it would be best either to use the jar that ships with the HDP installation, or to check your classpath to see why that jar is not being added to your process.
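A quick way to do that check is to split the aux path on colons and look for a gson entry. A small sketch (the example path below is made up; substitute your real $HIVE_AUX_JARS_PATH):

```shell
# has_jar CLASSPATH PATTERN -> exit 0 if any colon-separated entry matches
has_jar() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -q "$2"
}

# Example: check an aux path for a gson jar (paths are illustrative)
AUX="/opt/atlas/hook/hive/atlas-client.jar:/opt/atlas/hook/hive/gson-2.2.4.jar"
if has_jar "$AUX" 'gson-.*\.jar'; then
  echo "gson jar found on aux path"
else
  echo "gson jar missing"
fi
# prints: gson jar found on aux path
```

In a real session you would run `has_jar "$HIVE_AUX_JARS_PATH" 'gson-.*\.jar'` before starting the Hive CLI.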

@Joy

I haven't downloaded HDP, because I want to learn how to configure and use Atlas on its own. I compiled Atlas 0.7 with Maven and downloaded Hive separately.

So I am wondering how to solve this other than downloading the jar file manually. Is the reason that the Maven build didn't include this jar?