Member since
06-08-2015
06-08-2015 06:46 AM
The solution from Roberto worked brilliantly for me. Thanks, Gaj Ember
06-08-2015 05:44 AM
Hi,

I managed to get past this. I deleted the old directories using this command:

sudo -u hdfs hadoop fs -rm -r /user/hive/warehouse/\*

I then re-ran the import command and it seemed to work. I don't know why it did not work the first time.

Thanks,
Gaj
Ember Software
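
For anyone repeating this, here is a minimal sketch of the two steps (clear the warehouse directory, then re-run the import) as one bash script. The function names `run` and `reimport` and the `DRY_RUN` switch are made up for illustration; the commands and flags are the ones from this thread, except that `-P` replaces `--password=cloudera`, as the Sqoop warning in the log suggests. By default it only prints the commands, since the delete removes everything under the warehouse directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the two steps described in the post.
reimport() {
  local warehouse=/user/hive/warehouse

  # Step 1: delete the directories left over from the failed import.
  # The quoted * is passed through to hadoop, which expands it itself.
  run sudo -u hdfs hadoop fs -rm -r "${warehouse}/*"

  # Step 2: re-run the import. -P prompts for the password instead of
  # putting it on the command line.
  run sqoop import-all-tables \
    -m 12 \
    --connect jdbc:mysql://216.121.94.146:3306/retail_db \
    --username=retail_dba \
    -P \
    --compression-codec=snappy \
    --as-avrodatafile \
    --warehouse-dir="${warehouse}"
}
```

Calling `reimport` shows what would be executed; `DRY_RUN=0 reimport` actually runs both commands.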
06-08-2015 01:11 AM
Hi there,

I've been following exercise one and have run the sqoop command to import all tables into Hive. It imported the 'categories' table, but it has not imported the other tables. I'm using the QuickStart VM hosted on GoGrid that includes the Tableau software.

Here is the full log:

[root@mb2d0-cldramaster-01 ~]# sqoop import-all-tables \
> -m 12 \
> --connect jdbc:mysql://216.121.94.146:3306/retail_db \
> --username=retail_dba \
> --password=cloudera \
> --compression-codec=snappy \
> --as-avrodatafile \
> --warehouse-dir=/user/hive/warehouse
Warning: /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/06/08 00:37:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.2.0
15/06/08 00:37:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/06/08 00:37:30 INFO manager.SqlManager: Using default fetchSize of 1000
15/06/08 00:37:30 INFO tool.CodeGenTool: Beginning code generation
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/06/08 00:37:32 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.jar
15/06/08 00:37:32 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/06/08 00:37:32 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/06/08 00:37:32 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/06/08 00:37:32 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/06/08 00:37:32 INFO mapreduce.ImportJobBase: Beginning import of categories
15/06/08 00:37:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/06/08 00:39:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:39:07 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/sqoop_import_categories.avsc
15/06/08 00:39:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/06/08 00:39:07 INFO client.RMProxy: Connecting to ResourceManager at mb2d0-cldramaster-01/10.104.23.2:8032
15/06/08 00:39:09 INFO db.DBInputFormat: Using read commited transaction isolation
15/06/08 00:39:09 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`category_id`), MAX(`category_id`) FROM `categories`
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: number of splits:12
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1433738552738_0001
15/06/08 00:39:10 INFO impl.YarnClientImpl: Submitted application application_1433738552738_0001
15/06/08 00:39:10 INFO mapreduce.Job: The url to track the job: http://mb2d0-cldramaster-01:8088/proxy/application_1433738552738_0001/
15/06/08 00:39:10 INFO mapreduce.Job: Running job: job_1433738552738_0001
15/06/08 00:39:23 INFO mapreduce.Job: Job job_1433738552738_0001 running in uber mode : false
15/06/08 00:39:23 INFO mapreduce.Job: map 0% reduce 0%
15/06/08 00:39:38 INFO mapreduce.Job: map 25% reduce 0%
15/06/08 00:39:44 INFO mapreduce.Job: map 50% reduce 0%
15/06/08 00:39:49 INFO mapreduce.Job: map 75% reduce 0%
15/06/08 00:39:54 INFO mapreduce.Job: map 92% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job: map 100% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job: Job job_1433738552738_0001 completed successfully
15/06/08 00:41:02 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=1568938
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1414
HDFS: Number of bytes written=6868
HDFS: Number of read operations=48
HDFS: Number of large read operations=0
HDFS: Number of write operations=24
Job Counters
Launched map tasks=12
Other local map tasks=12
Total time spent by all maps in occupied slots (ms)=205375
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=205375
Total vcore-seconds taken by all map tasks=205375
Total megabyte-seconds taken by all map tasks=210304000
Map-Reduce Framework
Map input records=58
Map output records=58
Input split bytes=1414
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=585
CPU time spent (ms)=20150
Physical memory (bytes) snapshot=2704818176
Virtual memory (bytes) snapshot=18778705920
Total committed heap usage (bytes)=3667918848
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=6868
15/06/08 00:41:03 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:04 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
15/06/08 00:41:05 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.io.IOException: Job status not available
You have new mail in /var/spool/mail/root
[root@mb2d0-cldramaster-01 ~]# hadoop fs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x - root hive 0 2015-06-08 00:39 /user/hive/warehouse/categories

As you can see, the last command finds only 1 item. According to the tutorial, it should show 6 items.

Thanks so much for your help.

Regards,
Gaj
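
The listing check above could also be scripted, so that it reports which tables are missing rather than just a count. This is only a sketch: `missing_tables` is a hypothetical helper, not part of Hadoop or Sqoop, and it assumes each imported table shows up as a line ending in `/<table name>` in the `hadoop fs -ls` output, as `categories` does above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a `hadoop fs -ls` listing on stdin and prints
# every table name given as an argument that has no matching directory.
missing_tables() {
  local listing name
  listing=$(cat)
  for name in "$@"; do
    # A directory line ends with ".../<name>", so anchor the match there.
    if ! grep -q "/${name}\$" <<<"$listing"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Usage, assuming the table names of the tutorial's retail_db (only `categories` is confirmed by the log in this post; the other names are an assumption):

hadoop fs -ls /user/hive/warehouse | missing_tables categories customers departments order_items orders products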