Created on 09-23-2020 09:11 PM - last edited on 09-27-2020 10:12 PM by VidyaSargur
I am having an issue with Sqoop. My Cloudera 6.3.1 is installed locally on my Windows 10 computer with VM Workstation. When I run Sqoop from Hue, the job hangs at 50%, and when I check it through YARN, it shows the following:
sqoop import --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/hive_test --table test --username test --P -m 1 --target-dir /user/hive/warehouse/hive_test/
Showing 4096 bytes.
jdbc:mysql://xxx.xxx.xxx.xxx:3306/hive_test
             --table
             test
             --username
             test
             --password
             ********
             -m
             1
             --target-dir
             /user/hive/warehouse/hive_test/
Fetching child yarn jobs
tag id : oozie-97e7d55f5648d0d51cc52269c0464e1a
23:53:20.138 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at node1/192.168.0.191:8032
No child applications found
=================================================================
>>> Invoking Sqoop command line now >>>
23:53:20.584 [main] WARN  org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
23:53:20.647 [main] INFO  org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.7-cdh6.3.2
23:53:20.668 [main] WARN  org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
23:53:20.684 [main] WARN  org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
23:53:20.727 [main] INFO  org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
23:53:20.728 [main] INFO  org.apache.sqoop.tool.CodeGenTool - Beginning code generation
23:54:24.267 [main] INFO  org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
23:54:24.338 [main] INFO  org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
23:54:24.348 [main] INFO  org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hadoop-mapreduce
23:54:28.326 [main] INFO  org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/5ddd0257f3ea6a74ba13546270630642/test.jar
23:54:28.335 [main] WARN  org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
23:54:28.335 [main] WARN  org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
23:54:28.335 [main] WARN  org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
23:54:28.335 [main] INFO  org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
23:54:28.343 [main] INFO  org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of test
23:54:28.361 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.jar is deprecated. Instead, use mapreduce.job.jar
23:54:28.409 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
23:54:28.411 [main] WARN  org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
23:54:28.484 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at node1/192.168.0.191:8032
23:54:28.777 [main] INFO  org.apache.hadoop.mapreduce.JobResourceUploader - Disabling Erasure Coding for path: /user/admin/.staging/job_1600918207170_0002
23:54:29.228 [main] INFO  org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
23:54:29.545 [main] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
23:54:29.596 [main] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1600918207170_0002
23:54:29.599 [main] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Executing with tokens: [Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 1 cluster_timestamp: 1600918207170 } attemptId: 1 } keyId: 2000384367)]
23:54:31.211 [main] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1600918207170_0002
23:54:31.324 [main] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://node1:8088/proxy/application_1600918207170_0002/
23:54:31.327 [main] INFO  org.apache.hadoop.mapreduce.Job - Running job: job_1600918207170_0002
I tried changing the YARN resource pool configuration, but the job still hangs in the same way. However, when I run the same sqoop command from the command line, it successfully pulls the data from MySQL.
Please help, I really don't know where the issue is.
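For anyone reproducing this, the state of the submitted application can be inspected from a shell on a cluster node. The following is only a sketch: the application ID is copied from the log above, while `run`, `inspect_app`, and the `APPLY` switch are hypothetical helper names introduced here. It prints the commands by default and only executes them when `APPLY=1` is set.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed. Set APPLY=1 to run them.
APP_ID="application_1600918207170_0002"   # taken from the log in this post

run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

inspect_app() {
  # Report the state (for example ACCEPTED or RUNNING) of the submitted job.
  run yarn application -status "$APP_ID"
  # List everything currently waiting for or holding containers.
  run yarn application -list -appStates ACCEPTED,RUNNING
  # Fetch container logs; these may only be available once log aggregation has collected them.
  run yarn logs -applicationId "$APP_ID"
}

inspect_app
```

If the import job stays in the ACCEPTED state while the launcher that Hue started is RUNNING, that would suggest the job is waiting for resources rather than failing, but this is an assumption to verify, not a confirmed diagnosis.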
Created 09-24-2020 06:01 AM
@wenfeng Check that the user hue is able to write to the target directory. You may need to create a hue user directory with proper permissions if the hue user cannot write to the hive user directory.
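A minimal sketch of that check, assuming a non-Kerberized cluster where `sudo -u hdfs` gives HDFS superuser access. The target directory comes from the sqoop command in the question; `JOB_USER`, `run`, `check_target_dir`, and `APPLY` are hypothetical names used only for illustration. Note that the log above stages the job under /user/admin/.staging, so the effective user may be the logged-in Hue user rather than `hue`; override `JOB_USER` accordingly. The script prints the commands by default and only executes them when `APPLY=1` is set.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed. Set APPLY=1 to run them.
TARGET_DIR="/user/hive/warehouse/hive_test"   # --target-dir from the sqoop command
JOB_USER="${JOB_USER:-hue}"                   # assumption: user the Hue-launched job runs as

run() {
  echo "+ $*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

check_target_dir() {
  # Show owner, group, and mode of the target directory itself.
  run hdfs dfs -ls -d "$TARGET_DIR"
  # Create a home directory for the job user if it does not exist yet.
  run sudo -u hdfs hdfs dfs -mkdir -p "/user/$JOB_USER"
  # Hand ownership of that home directory to the job user.
  run sudo -u hdfs hdfs dfs -chown "$JOB_USER" "/user/$JOB_USER"
}

check_target_dir
```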
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Thanks,
Steven
Created 09-26-2020 09:08 PM
Hi Steven,
I am able to upload a file to /user/hive/warehouse/ods.db/hive_test/ or /user/hive/warehouse/ods.db/hive_test, but Hue is still unable to import the data. The same command ran successfully from the Linux prompt, so I am really confused about why it doesn't work from Hue.
Regards,
wenfeng
Created 10-26-2022 06:37 PM
Hello,
I am now having the same problem as you.
Did you ever solve it?
I've been struggling with this problem for several days.