Created on 09-23-2020 09:11 PM - last edited on 09-27-2020 10:12 PM by VidyaSargur
I am having an issue with Sqoop on my Cloudera 6.3.1 cluster, installed locally on my Windows 10 computer with VMware Workstation. When I run a Sqoop import from Hue, it hangs at 50%, and when I check the job through YARN, the log shows the following:
sqoop import --connect jdbc:mysql://xxx.xxx.xxx.xxx:3306/hive_test --table test --username test --P -m 1 --target-dir /user/hive/warehouse/hive_test/
Showing 4096 bytes.
jdbc:mysql://xxx.xxx.xxx.xxx:3306/hive_test --table test --username test --password ******** -m 1 --target-dir /user/hive/warehouse/hive_test/
Fetching child yarn jobs tag id : oozie-97e7d55f5648d0d51cc52269c0464e1a
23:53:20.138 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at node1/192.168.0.191:8032
No child applications found
=================================================================
>>> Invoking Sqoop command line now >>>
23:53:20.584 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
23:53:20.647 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.7-cdh6.3.2
23:53:20.668 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
23:53:20.684 [main] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
23:53:20.727 [main] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
23:53:20.728 [main] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
23:54:24.267 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
23:54:24.338 [main] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
23:54:24.348 [main] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hadoop-mapreduce
23:54:28.326 [main] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/5ddd0257f3ea6a74ba13546270630642/test.jar
23:54:28.335 [main] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
23:54:28.335 [main] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
23:54:28.335 [main] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
23:54:28.335 [main] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
23:54:28.343 [main] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of test
23:54:28.361 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.jar is deprecated. Instead, use mapreduce.job.jar
23:54:28.409 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
23:54:28.411 [main] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
23:54:28.484 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at node1/192.168.0.191:8032
23:54:28.777 [main] INFO org.apache.hadoop.mapreduce.JobResourceUploader - Disabling Erasure Coding for path: /user/admin/.staging/job_1600918207170_0002
23:54:29.228 [main] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
23:54:29.545 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
23:54:29.596 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1600918207170_0002
23:54:29.599 [main] INFO org.apache.hadoop.mapreduce.JobSubmitter - Executing with tokens: [Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 1 cluster_timestamp: 1600918207170 } attemptId: 1 } keyId: 2000384367)]
23:54:31.211 [main] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1600918207170_0002
23:54:31.324 [main] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://node1:8088/proxy/application_1600918207170_0002/
23:54:31.327 [main] INFO org.apache.hadoop.mapreduce.Job - Running job: job_1600918207170_0002
I tried changing the YARN resource pool configuration, but it shows the same behavior. However, when I run Sqoop with the same command from the command line, it successfully pulls the data from MySQL.
Please help, I really don't know where the issue is.
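One common cause of an Oozie-launched Sqoop action hanging at 50% on a single-node VM is that the Oozie launcher job occupies the cluster's available containers, so the child MapReduce job stays stuck in the ACCEPTED state. A quick way to check this from the node's shell is sketched below (standard `yarn` CLI commands; the queue name `default` is an assumption about your scheduler setup):

```shell
# List running and pending applications. If the Oozie launcher is RUNNING
# while the Sqoop child job sits in ACCEPTED, YARN likely has no free
# containers for the second job.
yarn application -list -appStates RUNNING,ACCEPTED

# Show the resource usage of the queue the jobs were submitted to
# ("default" assumed here; substitute your actual queue name).
yarn queue -status default
```

If the child job is pending, increasing `yarn.nodemanager.resource.memory-mb` and the per-queue limits (or reducing the launcher's memory request) usually lets both jobs run concurrently.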
Created 09-24-2020 06:01 AM
@wenfeng Check that the hue user is able to write to the target directory. You may need to create a hue user directory with the proper permissions if the hue user cannot write to the hive user directory.
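A minimal sketch of creating that user directory with the proper ownership. Note the staging path in your log (`/user/admin/.staging`) suggests the workflow actually submits as `admin`, so the user name below is an assumption to adjust; the commands are run as the `hdfs` superuser:

```shell
# Create an HDFS home directory for the submitting user and hand it over.
# Substitute the user Hue/Oozie actually submits jobs as ("admin" here is
# inferred from the staging path in the log, not confirmed).
sudo -u hdfs hdfs dfs -mkdir -p /user/admin
sudo -u hdfs hdfs dfs -chown admin:admin /user/admin

# Verify that user can reach the parent of the Sqoop target directory.
sudo -u hdfs hdfs dfs -ls /user/hive/warehouse
```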
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Thanks,
Steven
Created 09-26-2020 09:08 PM
Hi Steven,
I am able to upload files to /user/hive/warehouse/ods.db/hive_test/ or /user/hive/warehouse/ods.db/hive_test, but Hue is still unable to import the data. The same command runs successfully from the Linux prompt, so I am really confused about why it doesn't work from Hue.
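Since the CLI run succeeds, it may be worth confirming that the user Oozie runs the action as (rather than your interactive shell user) can write to the target path. A hedged sketch, assuming `admin` is the submitting user as the earlier log's staging path suggests:

```shell
# Check ownership and permissions on the target directory.
hdfs dfs -ls /user/hive/warehouse/ods.db

# Repeat the write test as the workflow's submitting user, not your
# shell user ("admin" is an assumption based on /user/admin/.staging
# in the log). Clean up the marker file afterwards.
sudo -u admin hdfs dfs -touchz /user/hive/warehouse/ods.db/hive_test/_perm_test
sudo -u admin hdfs dfs -rm /user/hive/warehouse/ods.db/hive_test/_perm_test
```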
regards,
wenfeng
Created 10-26-2022 06:37 PM
Hello~
I am now facing the same problem as you.
Did you ever solve it?
I've been struggling with this problem for several days.