Support Questions

Quickstart VM - Exercise 1 Sqoop import fails... maybe?


I'm working on the first import in the tutorial, and when I run the following command:

[cloudera@quickstart ~]$ sqoop import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import

I get the following error:

Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/07/12 08:34:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.16.2
19/07/12 08:34:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/07/12 08:34:51 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
19/07/12 08:34:51 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
19/07/12 08:34:51 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
19/07/12 08:34:51 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
19/07/12 08:34:51 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
19/07/12 08:34:51 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
19/07/12 08:34:51 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
19/07/12 08:34:51 WARN tool.BaseSqoopTool: case that you will detect any issues.
19/07/12 08:34:51 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
19/07/12 08:34:51 INFO tool.CodeGenTool: Beginning code generation
19/07/12 08:34:51 INFO tool.CodeGenTool: Will generate java class as codegen_categories
19/07/12 08:34:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
19/07/12 08:34:51 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
19/07/12 08:34:51 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/b3174bb11ab99f147ebcad20da408f9a/codegen_categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/07/12 08:34:52 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/b3174bb11ab99f147ebcad20da408f9a/codegen_categories.jar
19/07/12 08:34:52 WARN manager.MySQLManager: It looks like you are importing from mysql.
19/07/12 08:34:52 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
19/07/12 08:34:52 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
19/07/12 08:34:52 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
19/07/12 08:34:52 INFO mapreduce.ImportJobBase: Beginning import of categories
19/07/12 08:34:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/07/12 08:34:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
19/07/12 08:34:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
19/07/12 08:34:54 INFO hive.metastore: Trying to connect to metastore with URI thrift://quickstart.cloudera:9083
19/07/12 08:34:54 INFO hive.metastore: Opened a connection to metastore, current connections: 1
19/07/12 08:34:54 INFO hive.metastore: Connected to metastore.
19/07/12 08:35:14 INFO hive.metastore: Closed a connection to metastore, current connections: 0
19/07/12 08:35:14 INFO hive.metastore: Trying to connect to metastore with URI thrift://quickstart.cloudera:9083
19/07/12 08:35:14 INFO hive.metastore: Opened a connection to metastore, current connections: 1
19/07/12 08:35:14 INFO hive.metastore: Connected to metastore.
19/07/12 08:35:34 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetOperationException: Hive MetaStore exception
org.kitesdk.data.DatasetOperationException: Hive MetaStore exception
	at org.kitesdk.data.spi.hive.MetaStoreUtil.tableExists(MetaStoreUtil.java:190)
	at org.kitesdk.data.spi.hive.MetaStoreUtil.exists(MetaStoreUtil.java:396)
	at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:270)
	at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:255)
	at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.exists(HiveAbstractMetadataProvider.java:159)
	at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.exists(FileSystemDatasetRepository.java:262)
	at org.kitesdk.data.Datasets.exists(Datasets.java:629)
	at org.kitesdk.data.Datasets.exists(Datasets.java:646)
	at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:117)
	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:267)
	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:691)
	at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:513)
	at org.apache.sqoop.tool.ImportAllTablesTool.run(ImportAllTablesTool.java:105)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: MetaException(message:Exception thrown when executing query)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:37244)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:37221)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result.read(ThriftHiveMetastore.java:37152)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1294)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1280)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.tableExists(HiveMetaStoreClient.java:1359)
	at org.kitesdk.data.spi.hive.MetaStoreUtil$2.call(MetaStoreUtil.java:181)
	at org.kitesdk.data.spi.hive.MetaStoreUtil$2.call(MetaStoreUtil.java:178)
	at org.kitesdk.data.spi.hive.MetaStoreUtil.doWithRetry(MetaStoreUtil.java:70)
	at org.kitesdk.data.spi.hive.MetaStoreUtil.tableExists(MetaStoreUtil.java:186)
	... 19 more
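One thing that stands out in the output above is the BaseSqoopTool warning that a Hive import directly into the default warehouse directory is not supported. I don't know whether that's related to the metastore exception, but for reference, the same import with `--warehouse-dir` dropped (as the warning suggests) would be:

```shell
sqoop import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --hive-import
```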

I've followed the advice from a few threads and restarted the services and the VM multiple times; everything shows green in Cloudera Manager. Interestingly, one thread I came across said to start the hive-metastore service, and when I try, it fails immediately:

[cloudera@quickstart ~]$ sudo service hive-metastore start
Starting Hive Metastore (hive-metastore):                  [  OK  ]
[cloudera@quickstart ~]$ sudo service hive-metastore status
Hive Metastore is dead and pid file exists                 [FAILED]
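Since the metastore dies immediately even though the init script reports OK, the underlying cause presumably shows up in the metastore log (on the QuickStart VM this should be somewhere under /var/log/hive/; the exact filename is an assumption). A minimal sketch of pulling the root-cause lines out, using an excerpt of the error above as stand-in log content:

```shell
# Stand-in for the real metastore log; on the VM you would point grep at
# something like /var/log/hive/hive-metastore.log instead (path assumed).
cat > /tmp/metastore-excerpt.log <<'EOF'
19/07/12 08:35:34 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetOperationException: Hive MetaStore exception
Caused by: MetaException(message:Exception thrown when executing query)
EOF
# Show only ERROR lines and the root cause ("Caused by"), with line numbers.
grep -nE 'ERROR|Caused by' /tmp/metastore-excerpt.log
```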

However, I am able to query the MySQL database directly, and it looks like this:

[cloudera@quickstart ~]$ sqoop list-tables --connect jdbc:mysql://quickstart.cloudera/retail_db --username retail_dba --password cloudera
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/07/12 08:46:09 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.16.2
19/07/12 08:46:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/07/12 08:46:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
categories
customers
departments
order_items
orders
products

Is this a change in the software and now the expected result, or does the tutorial need to be updated? The "/user/hive" directories are not being created, so I'm at a standstill. I'm very confused, and any advice would be appreciated. Thanks.
