Created on 06-08-2015 01:11 AM - edited 09-16-2022 02:30 AM
Hi there,
I've been following exercise one. I ran the sqoop command to import all tables into Hive, but it only imported the 'categories' table and none of the others.
I'm using the QuickStart VM hosted on GoGrid that includes the Tableau software.
Here is the full log:
[root@mb2d0-cldramaster-01 ~]# sqoop import-all-tables \
> -m 12 \
> --connect jdbc:mysql://216.121.94.146:3306/retail_db \
> --username=retail_dba \
> --password=cloudera \
> --compression-codec=snappy \
> --as-avrodatafile \
> --warehouse-dir=/user/hive/warehouse
Warning: /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/06/08 00:37:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.2.0
15/06/08 00:37:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/06/08 00:37:30 INFO manager.SqlManager: Using default fetchSize of 1000
15/06/08 00:37:30 INFO tool.CodeGenTool: Beginning code generation
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/06/08 00:37:32 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.jar
15/06/08 00:37:32 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/06/08 00:37:32 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/06/08 00:37:32 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/06/08 00:37:32 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/06/08 00:37:32 INFO mapreduce.ImportJobBase: Beginning import of categories
15/06/08 00:37:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/06/08 00:39:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:39:07 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/sqoop_import_categories.avsc
15/06/08 00:39:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/06/08 00:39:07 INFO client.RMProxy: Connecting to ResourceManager at mb2d0-cldramaster-01/10.104.23.2:8032
15/06/08 00:39:09 INFO db.DBInputFormat: Using read commited transaction isolation
15/06/08 00:39:09 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`category_id`), MAX(`category_id`) FROM `categories`
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: number of splits:12
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1433738552738_0001
15/06/08 00:39:10 INFO impl.YarnClientImpl: Submitted application application_1433738552738_0001
15/06/08 00:39:10 INFO mapreduce.Job: The url to track the job: http://mb2d0-cldramaster-01:8088/proxy/application_1433738552738_0001/
15/06/08 00:39:10 INFO mapreduce.Job: Running job: job_1433738552738_0001
15/06/08 00:39:23 INFO mapreduce.Job: Job job_1433738552738_0001 running in uber mode : false
15/06/08 00:39:23 INFO mapreduce.Job: map 0% reduce 0%
15/06/08 00:39:38 INFO mapreduce.Job: map 25% reduce 0%
15/06/08 00:39:44 INFO mapreduce.Job: map 50% reduce 0%
15/06/08 00:39:49 INFO mapreduce.Job: map 75% reduce 0%
15/06/08 00:39:54 INFO mapreduce.Job: map 92% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job: map 100% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job: Job job_1433738552738_0001 completed successfully
15/06/08 00:41:02 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=1568938
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1414
                HDFS: Number of bytes written=6868
                HDFS: Number of read operations=48
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=24
        Job Counters
                Launched map tasks=12
                Other local map tasks=12
                Total time spent by all maps in occupied slots (ms)=205375
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=205375
                Total vcore-seconds taken by all map tasks=205375
                Total megabyte-seconds taken by all map tasks=210304000
        Map-Reduce Framework
                Map input records=58
                Map output records=58
                Input split bytes=1414
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=585
                CPU time spent (ms)=20150
                Physical memory (bytes) snapshot=2704818176
                Virtual memory (bytes) snapshot=18778705920
                Total committed heap usage (bytes)=3667918848
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=6868
15/06/08 00:41:03 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:04 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
15/06/08 00:41:05 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.io.IOException: Job status not available
You have new mail in /var/spool/mail/root
[root@mb2d0-cldramaster-01 ~]# hadoop fs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x   - root hive          0 2015-06-08 00:39 /user/hive/warehouse/categories
You can see that when I run the last command it finds only one item; per the tutorial it should show six (one directory per retail_db table).
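From the log, the categories job itself finished (FinalApplicationStatus=SUCCEEDED), but the client then failed to reach the job history server and import-all-tables aborted with "Job status not available" before starting on the remaining tables. Is the JobHistory Server the right thing to check? As a sketch of what I would try, assuming a package-based CDH install (on a Cloudera-Manager-managed VM like this one the service is controlled from CM instead, so the service name may not apply):

sudo service hadoop-mapreduce-historyserver status
grep -A 1 mapreduce.jobhistory.address /etc/hadoop/conf/mapred-site.xml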
Thanks so much for your help.
Regards,
Gaj
Created on 06-08-2015 05:44 AM - edited 06-08-2015 05:47 AM
Hi,
I managed to get past this by deleting the old directories with this command:
sudo -u hdfs hadoop fs -rm -r '/user/hive/warehouse/*'
I then re-ran the import command and it worked. I'm not sure why it failed the first time; my guess is that the leftover categories directory from the partial run would have blocked a straight re-run, since the import's target directories must not already exist.
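For anyone hitting the same thing, here is the full sequence I ended up running (connection details exactly as in my first post):

sudo -u hdfs hadoop fs -rm -r '/user/hive/warehouse/*'

sqoop import-all-tables \
  -m 12 \
  --connect jdbc:mysql://216.121.94.146:3306/retail_db \
  --username=retail_dba \
  --password=cloudera \
  --compression-codec=snappy \
  --as-avrodatafile \
  --warehouse-dir=/user/hive/warehouse

hadoop fs -ls /user/hive/warehouse

The last command should now list one directory per table rather than just categories.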
Thanks,
Gaj