Created 01-19-2016 09:01 AM
I've got a problem. I'm trying to work through the exercises to get to know Hadoop and Cloudera, but my VM is already failing at the first task, when I try to execute:
$ sqoop import-all-tables \
    -m 1 \
    --connect jdbc:mysql://quickstart:3306/retail_db \
    --username=retail_dba \
    --password=cloudera \
    --compression-codec=snappy \
    --as-parquetfile \
    --warehouse-dir=/user/hive/warehouse \
    --hive-import
I then get the following error message:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/01/19 08:54:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.0
16/01/19 08:54:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/01/19 08:54:17 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/01/19 08:54:17 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/01/19 08:54:17 WARN tool.BaseSqoopTool: It seems that you're doing hive import directly into default
16/01/19 08:54:17 WARN tool.BaseSqoopTool: hive warehouse directory which is not supported. Sqoop is
16/01/19 08:54:17 WARN tool.BaseSqoopTool: firstly importing data into separate directory and then
16/01/19 08:54:17 WARN tool.BaseSqoopTool: inserting data into hive. Please consider removing
16/01/19 08:54:17 WARN tool.BaseSqoopTool: --target-dir or --warehouse-dir into /user/hive/warehouse in
16/01/19 08:54:17 WARN tool.BaseSqoopTool: case that you will detect any issues.
16/01/19 08:54:17 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/01/19 08:54:18 INFO tool.CodeGenTool: Beginning code generation
16/01/19 08:54:18 INFO tool.CodeGenTool: Will generate java class as codegen_categories
16/01/19 08:54:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/01/19 08:54:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/01/19 08:54:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/6c71f3454000819b9873b7b398482ec4/codegen_categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/01/19 08:54:21 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/6c71f3454000819b9873b7b398482ec4/codegen_categories.jar
16/01/19 08:54:21 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/01/19 08:54:21 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/01/19 08:54:21 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/01/19 08:54:21 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/01/19 08:54:21 INFO mapreduce.ImportJobBase: Beginning import of categories
16/01/19 08:54:22 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/01/19 08:54:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/01/19 08:54:23 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
16/01/19 08:54:26 INFO hive.metastore: Trying to connect to metastore with URI thrift://quickstart.cloudera:9083
16/01/19 08:54:26 INFO hive.metastore: Opened a connection to metastore, current connections: 1
16/01/19 08:54:26 INFO hive.metastore: Connected to metastore.
16/01/19 08:54:26 WARN mapreduce.DataDrivenImportJob: Target Hive table 'categories' exists! Sqoop will append data into the existing Hive table. Consider using --hive-overwrite, if you do NOT intend to do appending.
16/01/19 08:54:29 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetIOException: Could not read schema
org.kitesdk.data.DatasetIOException: Could not read schema
    at org.kitesdk.data.spi.hive.HiveUtils.descriptorForTable(HiveUtils.java:152)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:104)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:197)
    at org.kitesdk.data.Datasets.load(Datasets.java:108)
    at org.kitesdk.data.Datasets.load(Datasets.java:165)
    at org.kitesdk.data.Datasets.load(Datasets.java:187)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:111)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:130)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportAllTablesTool.run(ImportAllTablesTool.java:111)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/categories/.metadata/schemas/1.avsc
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1260)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1245)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1233)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:302)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:268)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:260)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1564)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:308)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:304)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:775)
    at org.kitesdk.data.spi.Schemas.open(Schemas.java:210)
    at org.kitesdk.data.spi.Schemas.fromAvsc(Schemas.java:71)
    at org.kitesdk.data.DatasetDescriptor$Builder.schemaUri(DatasetDescriptor.java:436)
    at org.kitesdk.data.spi.hive.HiveUtils.descriptorForTable(HiveUtils.java:150)
    ... 18 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /user/hive/warehouse/categories/.metadata/schemas/1.avsc
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:254)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1258)
    ... 33 more
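If I'm reading the trace correctly, the Hive table 'categories' already exists (the WARN just before the ERROR says Sqoop will append to it), but the Kite metadata file it points to (/user/hive/warehouse/categories/.metadata/schemas/1.avsc) is missing, presumably left behind by an earlier run that didn't finish cleanly. This is how I would check whether the directory and its metadata are actually there (the path is taken straight from the trace, so I'm only guessing this is the right thing to look at):

$ hadoop fs -ls /user/hive/warehouse/categories
$ hadoop fs -ls /user/hive/warehouse/categories/.metadata/schemas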
I've searched the forums, but have not found anything.
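Would it be safe to just drop the half-imported Hive tables and remove the leftover warehouse directories, then re-run the import? Something like the sketch below is what I had in mind, assuming the retail_db table names (categories, customers, departments, orders, order_items, products) and that there is nothing on this fresh VM I need to keep:

# Drop each stale Hive table and delete its leftover warehouse directory,
# so the next sqoop run starts from a clean state.
for t in categories customers departments orders order_items products; do
  hive -e "DROP TABLE IF EXISTS $t;"
  hadoop fs -rm -r -f /user/hive/warehouse/$t
done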
Any help would be very appreciated.