Exercise 1 error


New Contributor

Hi, I got the following error when running:

sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import

INFO hive.metastore: Connected to metastore.

15/07/24 14:49:16 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset: default.categories

org.kitesdk.data.DatasetExistsException: Metadata already exists for dataset: default.categories

at org.kitesdk.data.spi.hive.HiveManagedMetadataProvider.create(HiveManagedMetadataProvider.java:51)

at org.kitesdk.data.spi.hive.HiveManagedDatasetRepository.create(HiveManagedDatasetRepository.java:77)

 

I guess the problem might have been triggered by my earlier run of the following command:

 

sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-avrodatafile \
    --warehouse-dir=/user/hive/warehouse

 

I have deleted /user/hive/warehouse, but it didn't help. 
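In case it matters, I removed the directory with the usual HDFS command, roughly:

hdfs dfs -rm -r /user/hive/warehouse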

Any hint how to solve this? Thanks!

 

Hadoop321

1 ACCEPTED SOLUTION


Re: Exercise 1 error

Master Collaborator

/user/hive/warehouse stores the data files, but the metadata (information about the structure and location of the data files) is managed by Hive. Connect to either Impala or Hive (you'll find instructions for doing so later in Tutorial Exercise 1 or Tutorial Exercise 2, depending on which version of the tutorial you're using). Once connected, run 'show tables;' and you'll see a list of the tables it has metadata for. For each of these tables (assuming there isn't other data, unrelated to the tutorial, already stored there), run 'drop table <table_name>;'. When none of the tables from retail_db are shown by 'show tables;', the Sqoop job should be able to succeed.
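As a concrete sketch, the cleanup from impala-shell (or the Hive shell) would look like the following. The table names here are assumed from the standard retail_db tutorial schema; drop whichever tables 'show tables;' actually reports on your system.

show tables;
drop table categories;
drop table customers;
drop table departments;
drop table order_items;
drop table orders;
drop table products;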

3 REPLIES

Re: Exercise 1 error

Master Collaborator

According to the Sqoop documentation, the --hive-overwrite option should also allow you to do this without manually dropping the tables first, but I haven't tested that myself.
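For reference, that would look something like the command from the original post with the option added (a sketch only, untested; the flag is as documented for Sqoop):

sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import \
    --hive-overwrite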

Re: Exercise 1 error

New Contributor

It works perfectly. Thanks a lot for the tip!
