
Error with "Make sure HIVE_CONF_DIR is set correctly"

Rising Star

Hi Cloudera community! Happy to join you! I'm a sysadmin who loves his job and likes working with new technology, so here I am on Cloudera! For testing, we created a 3-node cluster in a lab: one node for Cloudera Manager, one node as NameNode and DataNode, and the last one as a DataNode only. It's a lab to discover the new Cloudera 5.5 release, so it's just for tests, not for production! We installed these services: HDFS, Hive, Hue, Impala, Oozie, ZooKeeper, MapReduce2 (YARN), Sqoop 1.

One of our developers tried to import some data into Hive, but we got an error. Here is the command line used by our developer:

sqoop import --connect jdbc:mysql://our.database.url/database --username user --password passwordtest --table table_product --target-dir /path/to/db --split-by product_id --hive-import --hive-overwrite --hive-table table_product

The command starts successfully and we see the mappers run up to 100%, but when the job finishes, we get an error:

16/02/12 15:37:57 WARN hive.TableDefWriter: Column last_updated had to be cast to a less precise type in Hive
16/02/12 15:37:57 INFO hive.HiveImport: Loading uploaded data into Hive
16/02/12 15:37:57 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
16/02/12 15:37:57 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
        at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
        at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
        at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
        ....

I searched the configuration files for anything wrong with HIVE_CONF_DIR and didn't find anything weird. I also searched in the Cloudera Manager configuration and on the web, without success, so I'm stuck and our developer can't continue his tests. Do you have any idea about this? Thanks a lot for your help!

1 ACCEPTED SOLUTION

Rising Star

It works!

Looking at the output log, we can see the HADOOP_CLASSPATH variable, and it doesn't contain any path to the libs in the Hive directory...

I first tried adding the Hive lib folder itself to HADOOP_CLASSPATH, but that doesn't work.

The solution is to add the folder with /* appended, so that all the JARs are picked up...

So I added this line to my .bash_profile:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/hive/lib/*

Then:

source ~/.bash_profile

And now it works. Data was imported into Hive!

Now we can continue our labs with Cloudera 5!

Thanks!

 

 

View solution in original post

8 REPLIES

Explorer

Did you try passing the --config option?

Also, please make sure you have the hive-site.xml file located inside the Hive conf directory.

Could you please put your command line in a code section like the one below? You will find an icon like this in the edit window: {i}. Click on it and paste your code into the pop-up window.

Thanks

Put your command line here for readability

 

Rising Star

Hello MattSun,

Thanks for your help!

I talked with my dev about trying the --config option with this command:

sqoop import --connect jdbc:mysql://our.database.url/database --username user --password passwordtest --table table_product --target-dir /path/to/db --split-by product_id --hive-import --hive-overwrite --hive-table table_product

But the --config option is not available.

I checked the Sqoop documentation website and found nothing about this option:

https://sqoop.apache.org/docs/1.4.6/

 

The /etc/hive/conf/hive-site.xml is present. The hive-env.sh too.
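
For the record, a quick way to double-check that both files are really there (paths per the standard package install):

ls -l /etc/hive/conf/hive-site.xml /etc/hive/conf/hive-env.sh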

 

To add more information about my cluster, here are the versions of our installed tools:

Parquet	     1.5.0+cdh5.5.1+176
Impala	     2.3.0+cdh5.5.1+0
YARN	     2.6.0+cdh5.5.1+924
spark	     1.5.0+cdh5.5.1+94
HDFS	     2.6.0+cdh5.5.1+924
hue-common   3.9.0+cdh5.5.1+333
hadoop-kms   2.6.0+cdh5.5.1+924
Sqoop	     1.4.6+cdh5.5.1+29
Oozie	     4.1.0+cdh5.5.1+223
Zookeeper    3.4.5+cdh5.5.1+91
Hue	     3.9.0+cdh5.5.1+333
MapReduce 1  2.6.0+cdh5.5.1+924
Hadoop	     2.6.0+cdh5.5.1+924
Hive	     1.1.0+cdh5.5.1+327
HCatalog     1.1.0+cdh5.5.1+327
MapReduce2   2.6.0+cdh5.5.1+924
Java 6	     JAVA_HOME=/usr/java/jdk1.6.0_31 java version "1.6.0_31" Java(TM) SE Runtime Environment (build 1.6.0_31-b04) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
Java 7	     JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera java version "1.7.0_67" Java(TM) SE Runtime Environment (build 1.7.0_67-b01) Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

Thanks for your help.

Rising Star

I tried stopping all the services in the cluster and restarting them.

I used this documentation for the stop/start order:

Cloudera 5 documentation for stop/start order

Rising Star

To reproduce the problem, I installed hadoop-client and sqoop on my own machine.

Same error here...

The job starts and the data import to HDFS completes successfully (in Hue I can see the job status, and the data is in HDFS):

16/02/18 17:01:15 INFO mapreduce.Job: Running job: job_1455812803225_0020
16/02/18 17:01:24 INFO mapreduce.Job: Job job_1455812803225_0020 running in uber mode : false
16/02/18 17:01:24 INFO mapreduce.Job:  map 0% reduce 0%
16/02/18 17:01:33 INFO mapreduce.Job:  map 25% reduce 0%
16/02/18 17:01:34 INFO mapreduce.Job:  map 50% reduce 0%
16/02/18 17:01:41 INFO mapreduce.Job:  map 100% reduce 0%
16/02/18 17:01:41 INFO mapreduce.Job: Job job_1455812803225_0020 completed successfully
16/02/18 17:01:41 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=555640
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=473
                HDFS: Number of bytes written=8432
                HDFS: Number of read operations=16
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters 
                Launched map tasks=4
                Other local map tasks=4
                Total time spent by all maps in occupied slots (ms)=25664
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=25664
                Total vcore-seconds taken by all map tasks=25664
                Total megabyte-seconds taken by all map tasks=26279936
        Map-Reduce Framework
                Map input records=91
                Map output records=91
                Input split bytes=473
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=351
                CPU time spent (ms)=4830
                Physical memory (bytes) snapshot=802369536
                Virtual memory (bytes) snapshot=6319828992
                Total committed heap usage (bytes)=887095296
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=8432
16/02/18 17:01:41 INFO mapreduce.ImportJobBase: Transferred 8,2344 KB in 30,7491 seconds (274,219 bytes/sec)
16/02/18 17:01:41 INFO mapreduce.ImportJobBase: Retrieved 91 records.

but when the import into Hive starts:

 

16/02/18 17:01:41 WARN hive.TableDefWriter: Column last_updated had to be cast to a less precise type in Hive
16/02/18 17:01:41 INFO hive.HiveImport: Loading uploaded data into Hive
16/02/18 17:01:41 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
16/02/18 17:01:41 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
        at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
        at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
        at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
        at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:195)
        at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
        ... 12 more

I tried a few things (sketched below):

- Added the variable HIVE_CONF_DIR=/etc/hive/conf to my .bash_profile file: no success

- Added the same variable to /usr/lib/hive/conf/hive-env.sh: no success

- Copied /usr/lib/sqoop/conf/sqoop-env-template.sh and added the variable inside: no success
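
Concretely, this is roughly what I tried (the sqoop-env.sh destination is my assumption of where the copied template goes; paths are the standard CDH package layout):

# 1) in ~/.bash_profile (and the same line appended to /usr/lib/hive/conf/hive-env.sh)
export HIVE_CONF_DIR=/etc/hive/conf

# 2) copy the Sqoop template and set the variable in the copy
sudo cp /usr/lib/sqoop/conf/sqoop-env-template.sh /usr/lib/sqoop/conf/sqoop-env.sh
echo 'export HIVE_CONF_DIR=/etc/hive/conf' | sudo tee -a /usr/lib/sqoop/conf/sqoop-env.sh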

 

Hope somebody has an idea to help us!

 

 

Explorer

I am not getting it.

A ClassNotFoundException sounds more like a missing JAR to me.

I use the versions below, and it works fine:

 

Sqoop 1.4.4-cdh5.0.0
Hive 0.12.0-cdh5.0.0

 

Will dig more and let you know if I come up with anything. Sorry.

Rising Star

Hi Matt,

Thanks. But my problem is still present...

Maybe someone else with a fresh install of Cloudera Manager/CDH 5.5 has the same problem.

As a test, I did a fresh install on another single machine. Same error!

So maybe the problem comes from the client configuration.

For the installation I use our Cloudera Manager/CDH repositories, which are synced every day.

So I used the packages, not the parcels, during installation.

My test VMs run CentOS 6.6, a supported version.

I launch the command from my own machine (Ubuntu).

I installed these packages to make it work:

sudo apt-get install hadoop-client hive oozie-client sqoop

I added these variables in my ".bash_profile":

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/share/java/slf4j-simple.jar
export HIVE_HOME=/usr/lib/hive
export PATH=$PATH:$HIVE_HOME/bin

Then I did an "scp" to recover the "/etc/sqoop", "/etc/hive", and "/etc/hadoop" directories from the cluster.
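
Something like this (cluster-node is just a placeholder for one of our cluster hosts):

# copy the client configs from a cluster node; "cluster-node" is a placeholder hostname
sudo scp -r root@cluster-node:/etc/sqoop /etc/
sudo scp -r root@cluster-node:/etc/hive /etc/
sudo scp -r root@cluster-node:/etc/hadoop /etc/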

So the configuration seems to be OK; if it weren't, the command wouldn't even start.

 

I tried adding the HIVE_CONF_DIR variable to different files:

- sqoop-env.sh

- hadoop-env.sh

- hive-env.sh

Without any success. The process starts, but the error is still present.

 

Hope somebody can help me!

 

Rising Star

On my client machine, I tried to find the class, and I found it inside a JAR file.

I went to the /usr/lib/hive/lib folder and looked inside hive-common.jar with this command:

jar tf hive-common.jar

Near the end of the output, I can see this line:

 

 

org/apache/hadoop/hive/conf/HiveConf.class

So the class is present. Why can't it be found when the import starts?

HIVE_HOME is set to /usr/lib/hive, so the path is valid...

I'll keep searching, but maybe this gives you more information on why, and how to solve it!
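
In case it helps, here's a quick loop to find which JAR under /usr/lib/hive/lib contains the class (assuming the JDK's jar tool is on the PATH):

for j in /usr/lib/hive/lib/*.jar; do
    # print the JAR name if it contains the HiveConf class
    if jar tf "$j" | grep -q 'org/apache/hadoop/hive/conf/HiveConf.class'; then
        echo "$j"
    fi
done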

Rising Star

It works!

Looking at the output log, we can see the HADOOP_CLASSPATH variable, and it doesn't contain any path to the libs in the Hive directory...

I first tried adding the Hive lib folder itself to HADOOP_CLASSPATH, but that doesn't work.

The solution is to add the folder with /* appended, so that all the JARs are picked up...

So I added this line to my .bash_profile:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/lib/hive/lib/*

Then:

source ~/.bash_profile

And now it works. Data was imported into Hive!
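
To double-check, you can print the classpath the hadoop wrapper will actually use and look for the Hive entry (assuming the hadoop command is on the PATH):

hadoop classpath | tr ':' '\n' | grep -i hive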

 

Now we can continue our labs with Cloudera 5! 

 

Thanks!