Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Exercise 1 error

Frequent Visitor

Hi, I got the following error when running:

sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import

INFO hive.metastore: Connected to metastore.

15/07/24 14:49:16 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset: default.categories

org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset: default.categories

at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:51)

at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:77)


I suspect the problem was triggered by my previous run of the following command:


sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-avrodatafile \
    --warehouse-dir=/user/hive/warehouse


I have deleted /user/hive/warehouse, but it didn't help. 

Any hints on how to solve this? Thanks!


Hadoop321

1 ACCEPTED SOLUTION

Guru

/user/hive/warehouse stores the data files, but the metadata (information about the structure and location of the data files) is managed by Hive. Connect to either Impala or Hive (you'll find instructions for doing so later in Tutorial Exercise 1 or in Tutorial Exercise 2, depending on which version of the tutorial you're using). Once connected, run 'show tables;' and you'll see a list of the tables it has metadata for. For each of these tables (assuming there isn't other data, unrelated to the tutorial, already stored there), run 'drop table <table_name>;'. When none of the tables from retail_db appear when you run 'show tables;', the Sqoop job should be able to succeed.
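To spell that out as commands, here is a sketch. The table list below is an assumption based on the tables the tutorial's retail_db import creates; replace it with whatever 'show tables;' actually reports on your VM before running anything.

```shell
# Generate DROP TABLE statements for the tables the tutorial imports.
# Assumed table list -- substitute the real output of 'show tables;'.
for t in categories customers departments order_items orders products; do
  echo "DROP TABLE IF EXISTS $t;"
done > drop_retail_tables.sql

# Review the generated file before running it, e.g. via Beeline:
#   beeline -u jdbc:hive2://quickstart.cloudera:10000 -f drop_retail_tables.sql
cat drop_retail_tables.sql
```

'DROP TABLE IF EXISTS' is used so the script is safe to re-run even after some tables are already gone.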


Replies


Guru

According to the Sqoop documentation, the --hive-overwrite option should also let you do this without manually dropping the tables first, but I haven't tested that myself.
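For reference, a sketch of what the re-run could look like with --hive-overwrite added to the original command (untested, as noted above; the command is written to a script file here so it can be reviewed before running it on the QuickStart VM):

```shell
# Untested sketch: the original import command with --hive-overwrite added.
# Written to a script so it can be inspected before actually running it.
cat > reimport_retail_db.sh <<'EOF'
sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import \
    --hive-overwrite
EOF
chmod +x reimport_retail_db.sh
```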

Frequent Visitor

It works perfectly. Thanks a lot for the tip!