<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Tutorial exercise 1: Problem ingesting structured data using sqoop in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Tutorial-exercise-1-Problem-ingesting-structured-data-using/m-p/28279#M6187</link>
    <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've been following exercise one and have run the sqoop command to import all tables into Hive. &amp;nbsp;It imported the 'categories' table but none of the other tables.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using the quickstart VM hosted on GoGrid that includes the Tableau software.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the full log:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;[root@mb2d0-cldramaster-01 ~]# sqoop import-all-tables \
&amp;gt;     -m 12 \
&amp;gt;     --connect jdbc:mysql://216.121.94.146:3306/retail_db \
&amp;gt;     --username=retail_dba \
&amp;gt;     --password=cloudera \
&amp;gt;     --compression-codec=snappy \
&amp;gt;     --as-avrodatafile \
&amp;gt;     --warehouse-dir=/user/hive/warehouse
Warning: /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/06/08 00:37:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.2.0
15/06/08 00:37:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/06/08 00:37:30 INFO manager.SqlManager: Using default fetchSize of 1000
15/06/08 00:37:30 INFO tool.CodeGenTool: Beginning code generation
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/06/08 00:37:32 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.jar
15/06/08 00:37:32 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/06/08 00:37:32 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/06/08 00:37:32 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/06/08 00:37:32 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/06/08 00:37:32 INFO mapreduce.ImportJobBase: Beginning import of categories
15/06/08 00:37:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/06/08 00:39:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:39:07 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/sqoop_import_categories.avsc
15/06/08 00:39:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/06/08 00:39:07 INFO client.RMProxy: Connecting to ResourceManager at mb2d0-cldramaster-01/10.104.23.2:8032
15/06/08 00:39:09 INFO db.DBInputFormat: Using read commited transaction isolation
15/06/08 00:39:09 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`category_id`), MAX(`category_id`) FROM `categories`
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: number of splits:12
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1433738552738_0001
15/06/08 00:39:10 INFO impl.YarnClientImpl: Submitted application application_1433738552738_0001
15/06/08 00:39:10 INFO mapreduce.Job: The url to track the job: http://mb2d0-cldramaster-01:8088/proxy/application_1433738552738_0001/
15/06/08 00:39:10 INFO mapreduce.Job: Running job: job_1433738552738_0001
15/06/08 00:39:23 INFO mapreduce.Job: Job job_1433738552738_0001 running in uber mode : false
15/06/08 00:39:23 INFO mapreduce.Job:  map 0% reduce 0%
15/06/08 00:39:38 INFO mapreduce.Job:  map 25% reduce 0%
15/06/08 00:39:44 INFO mapreduce.Job:  map 50% reduce 0%
15/06/08 00:39:49 INFO mapreduce.Job:  map 75% reduce 0%
15/06/08 00:39:54 INFO mapreduce.Job:  map 92% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job:  map 100% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job: Job job_1433738552738_0001 completed successfully
15/06/08 00:41:02 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=1568938
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=1414
		HDFS: Number of bytes written=6868
		HDFS: Number of read operations=48
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=24
	Job Counters 
		Launched map tasks=12
		Other local map tasks=12
		Total time spent by all maps in occupied slots (ms)=205375
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=205375
		Total vcore-seconds taken by all map tasks=205375
		Total megabyte-seconds taken by all map tasks=210304000
	Map-Reduce Framework
		Map input records=58
		Map output records=58
		Input split bytes=1414
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=585
		CPU time spent (ms)=20150
		Physical memory (bytes) snapshot=2704818176
		Virtual memory (bytes) snapshot=18778705920
		Total committed heap usage (bytes)=3667918848
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=6868
15/06/08 00:41:03 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:04 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
15/06/08 00:41:05 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.io.IOException: Job status not available 
You have new mail in /var/spool/mail/root
[root@mb2d0-cldramaster-01 ~]# hadoop fs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x   - root hive          0 2015-06-08 00:39 /user/hive/warehouse/categories&lt;/PRE&gt;&lt;P&gt;As you can see, the last command finds only 1 item, whereas the tutorial says it should show 6.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks so much for your help.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Gaj&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 09:30:57 GMT</pubDate>
    <dc:creator>Gaj</dc:creator>
    <dc:date>2022-09-16T09:30:57Z</dc:date>
    <item>
      <title>Tutorial exercise 1: Problem ingesting structured data using sqoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Tutorial-exercise-1-Problem-ingesting-structured-data-using/m-p/28279#M6187</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've been following exercise one and have run the sqoop command to import all tables into Hive. &amp;nbsp;It imported the 'categories' table but none of the other tables.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using the quickstart VM hosted on GoGrid that includes the Tableau software.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the full log:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;[root@mb2d0-cldramaster-01 ~]# sqoop import-all-tables \
&amp;gt;     -m 12 \
&amp;gt;     --connect jdbc:mysql://216.121.94.146:3306/retail_db \
&amp;gt;     --username=retail_dba \
&amp;gt;     --password=cloudera \
&amp;gt;     --compression-codec=snappy \
&amp;gt;     --as-avrodatafile \
&amp;gt;     --warehouse-dir=/user/hive/warehouse
Warning: /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
15/06/08 00:37:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.2.0
15/06/08 00:37:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/06/08 00:37:30 INFO manager.SqlManager: Using default fetchSize of 1000
15/06/08 00:37:30 INFO tool.CodeGenTool: Beginning code generation
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:37:30 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
15/06/08 00:37:32 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/categories.jar
15/06/08 00:37:32 WARN manager.MySQLManager: It looks like you are importing from mysql.
15/06/08 00:37:32 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
15/06/08 00:37:32 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
15/06/08 00:37:32 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
15/06/08 00:37:32 INFO mapreduce.ImportJobBase: Beginning import of categories
15/06/08 00:37:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
15/06/08 00:39:06 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `categories` AS t LIMIT 1
15/06/08 00:39:07 INFO mapreduce.DataDrivenImportJob: Writing Avro schema file: /tmp/sqoop-root/compile/6f9632f206ce58bd0d42187391fced45/sqoop_import_categories.avsc
15/06/08 00:39:07 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
15/06/08 00:39:07 INFO client.RMProxy: Connecting to ResourceManager at mb2d0-cldramaster-01/10.104.23.2:8032
15/06/08 00:39:09 INFO db.DBInputFormat: Using read commited transaction isolation
15/06/08 00:39:09 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`category_id`), MAX(`category_id`) FROM `categories`
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: number of splits:12
15/06/08 00:39:09 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1433738552738_0001
15/06/08 00:39:10 INFO impl.YarnClientImpl: Submitted application application_1433738552738_0001
15/06/08 00:39:10 INFO mapreduce.Job: The url to track the job: http://mb2d0-cldramaster-01:8088/proxy/application_1433738552738_0001/
15/06/08 00:39:10 INFO mapreduce.Job: Running job: job_1433738552738_0001
15/06/08 00:39:23 INFO mapreduce.Job: Job job_1433738552738_0001 running in uber mode : false
15/06/08 00:39:23 INFO mapreduce.Job:  map 0% reduce 0%
15/06/08 00:39:38 INFO mapreduce.Job:  map 25% reduce 0%
15/06/08 00:39:44 INFO mapreduce.Job:  map 50% reduce 0%
15/06/08 00:39:49 INFO mapreduce.Job:  map 75% reduce 0%
15/06/08 00:39:54 INFO mapreduce.Job:  map 92% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job:  map 100% reduce 0%
15/06/08 00:39:59 INFO mapreduce.Job: Job job_1433738552738_0001 completed successfully
15/06/08 00:41:02 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=1568938
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=1414
		HDFS: Number of bytes written=6868
		HDFS: Number of read operations=48
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=24
	Job Counters 
		Launched map tasks=12
		Other local map tasks=12
		Total time spent by all maps in occupied slots (ms)=205375
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=205375
		Total vcore-seconds taken by all map tasks=205375
		Total megabyte-seconds taken by all map tasks=210304000
	Map-Reduce Framework
		Map input records=58
		Map output records=58
		Input split bytes=1414
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=585
		CPU time spent (ms)=20150
		Physical memory (bytes) snapshot=2704818176
		Virtual memory (bytes) snapshot=18778705920
		Total committed heap usage (bytes)=3667918848
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=6868
15/06/08 00:41:03 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:04 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO ipc.Client: Retrying connect to server: mb2d0-cldraagent-01/10.104.23.3:55949. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
15/06/08 00:41:05 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
15/06/08 00:41:05 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: java.io.IOException: Job status not available 
You have new mail in /var/spool/mail/root
[root@mb2d0-cldramaster-01 ~]# hadoop fs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x   - root hive          0 2015-06-08 00:39 /user/hive/warehouse/categories&lt;/PRE&gt;&lt;P&gt;As you can see, the last command finds only 1 item, whereas the tutorial says it should show 6.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks so much for your help.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Gaj&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:30:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Tutorial-exercise-1-Problem-ingesting-structured-data-using/m-p/28279#M6187</guid>
      <dc:creator>Gaj</dc:creator>
      <dc:date>2022-09-16T09:30:57Z</dc:date>
    </item>
    <item>
      <title>Re: Tutorial exercise 1: Problem ingesting structured data using sqoop</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Tutorial-exercise-1-Problem-ingesting-structured-data-using/m-p/28285#M6188</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I managed to get past this. &amp;nbsp;I deleted the old directories with this command:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sudo -u hdfs hadoop fs -rm -r /user/hive/warehouse/\*&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I then re-ran the import command and it worked. &amp;nbsp;I don't know why it failed the first time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Gaj&lt;/P&gt;&lt;P&gt;&lt;A href="http://embersoftware.com.au/" target="_blank"&gt;Ember Software&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 08 Jun 2015 12:47:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Tutorial-exercise-1-Problem-ingesting-structured-data-using/m-p/28285#M6188</guid>
      <dc:creator>Gaj</dc:creator>
      <dc:date>2015-06-08T12:47:10Z</dc:date>
    </item>
  </channel>
</rss>