Member since 03-13-2015 | 7 Posts | 1 Kudos Received | 0 Solutions
03-13-2015
11:08 AM
Perhaps it is, but as I showed in the post above, Hive works correctly while Impala fails.

Thanks,
Morgan
03-13-2015
11:07 AM
ER, I am fairly new to this also. I started with the VirtualBox quickstart VM running on a Windows host. FWIW, here is what I get when I run the same command:

[cloudera@quickstart morgan]$ sqoop import-all-tables \
> -m 1 \
> --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
> --username=retail_dba \
> --password=cloudera \
> --compression-codec=snappy \
> --as-avrodatafile \
> --warehouse-dir=/user/hive/warehouse
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/03/13 13:47:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.0
15/03/13 13:47:04 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/03/13 13:47:04 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
15/03/13 13:47:05 INFO tool.CodeGenTool: Beginning code generation
15/03/13 13:47:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 13:47:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 13:47:05 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/034c37aed57826a53538f7603ccaa6c1/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/03/13 13:47:09 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-cloudera/compile/034c37aed57826a53538f7603ccaa6c1/categories.jar
15/03/13 13:47:09 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/03/13 13:47:09 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/03/13 13:47:09 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/03/13 13:47:09 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/03/13 13:47:09 INFO mapreduce.ImportJobBase: Beginning import of categories
15/03/13 13:47:09 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
15/03/13 13:47:09 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/03/13 13:47:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/03/13 13:47:12 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-cloudera/compile/034c37aed57826a53538f7603ccaa6c1/sqoop_import_categories.avsc
15/03/13 13:47:12 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/03/13 13:47:12 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/03/13 13:47:15 INFO db.DBInputFormat: Using read commited transaction isolation
15/03/13 13:47:15 INFO mapreduce.JobSubmitter: number of splits:1
15/03/13 13:47:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1425573450783_0059
15/03/13 13:47:16 INFO impl.YarnClientImpl: Submitted application application_1425573450783_0059
15/03/13 13:47:16 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1425573450783_0059/
15/03/13 13:47:16 INFO mapreduce.Job: Running job: job_1425573450783_0059
15/03/13 13:47:29 INFO mapreduce.Job: Job job_1425573450783_0059 running in uber mode : false
15/03/13 13:47:29 INFO mapreduce.Job: map 0% reduce 0%
15/03/13 13:47:41 INFO mapreduce.Job: map 100% reduce 0%
15/03/13 13:47:41 INFO mapreduce.Job: Job job_1425573450783_0059 completed successfully
15/03/13 13:47:41 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=131709
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=87
        HDFS: Number of bytes written=1344
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=9535
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=9535
        Total vcore-seconds taken by all map tasks=9535
        Total megabyte-seconds taken by all map tasks=9763840
    Map-Reduce Framework
        Map input records=58
        Map output records=58
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=118
        CPU time spent (ms)=1430
        Physical memory (bytes) snapshot=118579200
        Virtual memory (bytes) snapshot=856969216
        Total committed heap usage (bytes)=60751872
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=1344
15/03/13 13:47:41 INFO mapreduce.ImportJobBase: Transferred 1.3125 KB in 29.6161 seconds (45.3808 bytes/sec)
15/03/13 13:47:41 INFO mapreduce.ImportJobBase: Retrieved 58 records.

I'm not sure why you are getting this error:

Retrying connect to server: localhost/127.0.0.1:8021

In fact, in my VM I don't have a listener on port 8021, but I do have one on 8020 (a quick way to check is sketched below). Maybe someone more knowledgeable can address that? Have you tried restarting the VM? If you do, give it some time for all the background processes to fire up before you try Sqoop again.

Morgan
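A minimal sketch for checking which of those ports has a listener (it assumes the net-tools netstat shipped in the CDH 5 quickstart VM; the port meanings below are CDH defaults, not something stated in the thread):

# Show TCP listeners and keep only the Hadoop ports in question.
# CDH defaults assumed: 8020 = NameNode IPC, 8032 = YARN
# ResourceManager, 8021 = the legacy MRv1 JobTracker port, so a
# client retrying 8021 often points at an MRv1-style
# mapred.job.tracker setting on a cluster that actually runs YARN.
sudo netstat -tlnp | grep -E ':(8020|8021|8032) '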
03-13-2015
08:16 AM
Perhaps see if you can connect directly to the MySQL database from the command line. Here is how it looks for me in the quickstart VM:

[cloudera@quickstart ~]$ mysql --user=retail_dba --password=cloudera
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 26097
Server version: 5.1.66 Source distribution

Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use retail_db
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+---------------------+
| Tables_in_retail_db |
+---------------------+
| categories          |
| customers           |
| departments         |
| order_items         |
| orders              |
| products            |
+---------------------+
6 rows in set (0.00 sec)

mysql>

If that fails, then perhaps MySQL is not running?

Morgan
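If the mysql client connects but Sqoop still fails, one further check is to push a trivial query through the same JDBC URL that Sqoop will use. A minimal sketch with sqoop eval, reusing the quickstart VM connect string and credentials from above (adjust for your own setup):

# Run a throwaway query through Sqoop's JDBC connection. If this
# succeeds, the JDBC URL, driver, and credentials are all fine and
# any remaining import failure is on the Hadoop/MapReduce side.
sqoop eval \
  --connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
  --username retail_dba --password cloudera \
  --query "SELECT COUNT(*) FROM categories"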
03-13-2015
08:06 AM
Is this a known issue with Impala? unix_timestamp is apparently not working properly for dates beyond the year 2038. Hive, in contrast, does not appear to have this problem.

[cloudera@quickstart ~]$ impala-shell
Starting Impala Shell without Kerberos authentication
Connected to quickstart.cloudera:21000
Server version: impalad version 2.1.0-cdh5 RELEASE (build e48c2b48c53ea9601b8f47a39373aa83ff7ca6e2)
Welcome to the Impala shell. Press TAB twice to see a list of available commands.

Copyright (c) 2012 Cloudera, Inc. All rights reserved.

(Shell build version: Impala Shell v2.1.0-cdh5 (e48c2b4) built on Tue Dec 16 19:00:35 PST 2014)
[quickstart.cloudera:21000] > select to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
> to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
> to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c;
Query: select to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a, to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b, to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c
+------------+------------+------------+
| a          | b          | c          |
+------------+------------+------------+
| 2036-06-14 | 2037-06-14 | 1902-05-08 |
+------------+------------+------------+
Fetched 1 row(s) in 0.41s
[quickstart.cloudera:21000] >

Here is the same query in Hive:

[cloudera@quickstart ~]$ hive
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
hive> select to_date( from_unixtime( unix_timestamp( '2036-06-14', 'yyyy-MM-dd' ))) as a,
> to_date( from_unixtime( unix_timestamp( '2037-06-14', 'yyyy-MM-dd' ))) as b,
> to_date( from_unixtime( unix_timestamp( '2038-06-14', 'yyyy-MM-dd' ))) as c;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1425573450783_0058, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1425573450783_0058/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1425573450783_0058
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2015-03-13 11:03:43,809 Stage-1 map = 0%, reduce = 0%
2015-03-13 11:03:55,265 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.71 sec
MapReduce Total cumulative CPU time: 1 seconds 710 msec
Ended Job = job_1425573450783_0058
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1   Cumulative CPU: 1.71 sec   HDFS Read: 284 HDFS Write: 33 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 710 msec
OK
2036-06-14    2037-06-14    2038-06-14
Time taken: 29.0 seconds, Fetched: 1 row(s)

Thanks,
Morgan
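The wrapped result above looks exactly like the classic Y2038 limit: epoch seconds stored in a signed 32-bit integer overflow after 2038-01-19 03:14:07 UTC. A quick sanity check of that arithmetic from the shell (a sketch assuming GNU date on a 64-bit system) reproduces Impala's 1902-05-08 exactly:

# Epoch seconds for 2038-06-14 00:00:00 UTC; this exceeds the
# signed 32-bit maximum of 2147483647:
date -u -d '2038-06-14 00:00:00' +%s    # prints 2160086400

# Wrap that value into a signed 32-bit integer:
echo $(( 2160086400 - 2**32 ))          # prints -2134880896

# Read the wrapped value back as epoch seconds; it lands on the
# same date Impala returned:
date -u -d @-2134880896 +%F             # prints 1902-05-08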