Member since: 05-19-2016
Posts: 216
Kudos Received: 20
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4193 | 05-29-2018 11:56 PM
 | 7031 | 07-06-2017 02:50 AM
 | 3768 | 10-09-2016 12:51 AM
 | 3541 | 05-13-2016 04:17 AM
07-03-2016
10:09 PM
Thanks. I feel a bit relaxed after hearing something on this. I am using CM, and I tried setting uber mode off; it still has the same problem. 😞 I haven't seen any of my merge jobs succeed. I have tried it with different data from different databases as well, but it has never worked. @yshi
07-02-2016
05:40 AM
I have asked this question repeatedly and am starting to believe I am missing something really basic here. Not many people seem to have come across this, and I am really stuck on this one: I get this error when I specify the --merge-key argument with an incremental lastmodified import in Sqoop. If I run the job from the command line, it works fine, including the merge step, but not when I submit it through Oozie. I am not sure whether the problem is Oozie or Hue, but the Sqoop job itself is not, since it works when executed from the command line. My Sqoop job looks like this:
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop
--create test_table -- import --driver com.mysql.jdbc.Driver --connect
jdbc:mysql://IP/DB?zeroDateTimeBehavior=convertToNull --username
USER_NAME --password 'PASSWORD' --table test_table --merge-key id --
split-by id --target-dir LOCATION --incremental lastmodified
--last-value 0 --check-column updated_at
The first import works fine. Starting with the second import, I get the errors below. I created a small test table with an int, a datetime, and a varchar column, without any NULLs or invalid characters in the data, and yet I faced the same issue:
# id, updated_at, name
'1', '2016-07-02 17:16:53', 'l'
'3', '2016-06-29 14:12:53', 'f'
There were only 2 rows in the data, and yet I got this:
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 3 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 1 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
I get this error only in Oozie; I submit the job through Hue. It works just fine, including the merge MapReduce job, when I run the Sqoop job from the command line. Taken from the Oozie launcher, this is what my MapReduce job logs look like:
>>> Invoking Sqoop command line now >>>
5373 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5407 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.7.0
5702 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
5715 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5740 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
5754 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000
5754 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
6091 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
6098 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
6118 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/lib/hadoop-mapreduce
8173 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/454902ac78d49b783a1f51b7bfe0a2be/test_table.jar
8185 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Incremental import based on column updated_at
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Lower bound value: '2016-07-02 17:13:24.0'
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Upper bound value: '2016-07-02 17:16:56.0'
8194 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of test_table
8214 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
8230 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
8716 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
8717 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat - BoundingValsQuery: SELECT MIN(id), MAX(id) FROM test_table WHERE ( updated_at >= '2016-07-02 17:13:24.0' AND updated_at < '2016-07-02 17:16:56.0' )
8721 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.IntegerSplitter - Split size: 0; Num splits: 4 from: 1 to: 1
25461 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Transferred 26 bytes in 17.2192 seconds (1.5099 bytes/sec)
25471 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Retrieved 1 records.
25536 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.ExportJobBase - IOException checking input file header: java.io.EOFException
25550 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
Heart beat
Heart beat
70628 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.ImportTool - Merge MapReduce job failed!
70628 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Saving incremental import state to the metastore
70831 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Updated data for job: test_table
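For completeness, this is how I execute the saved job from the shell, where it runs fine, merge step included; FQDN is the same placeholder as above:
# run the saved job against the shared Sqoop metastore
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --exec test_table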
05-21-2016
03:45 AM
I have installed the Cloudera Express version on a CentOS machine with Cloudera Manager. I have previously worked with Oozie and have written workflows manually, without using any editor. Hue looks great, but for now I just want to deploy my workflow.xml/coordinator.xml and job.properties files so that I can launch an Oozie job from what I already have, rather than designing it in Hue. Previously I have had users created under the /home directory on my CentOS machine. Although I did NOT install Cloudera in single-user mode, I still do not see those users created under the home directory, so I am not able to log in as a different user with a command like su oozie. Also, if I want to check the Hadoop file system, I certainly cannot do that from the root user using the hadoop fs -ls command. What is the alternative for that? I am one day old with Cloudera and find it much more stable than the rest of the available distributions, so thank you for the great work. So:
1. On running a Sqoop job, I get java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver. Which directory do I add the driver to?
2. How do I deploy already-written workflows in Oozie? (A sketch of how I used to do this is below.)
3. Why are users like hdfs, oozie, and sqoop not created under the /home directory? Why am I not able to use su hdfs to log in as hdfs? I assumed these users would ideally have been created.
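For context, this is roughly how I deployed workflows on my previous, non-CM cluster; the HDFS path and the Oozie URL here are placeholders, not my actual setup:
# copy the workflow application (workflow.xml plus any libs) into HDFS
hdfs dfs -mkdir -p /user/me/apps/my-workflow
hdfs dfs -put workflow.xml /user/me/apps/my-workflow/
# job.properties stays local and points oozie.wf.application.path at that HDFS dir
oozie job -oozie http://OOZIE_HOST:11000/oozie -config job.properties -run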
05-14-2016
02:25 PM
@Nilesh: I have the exact same problem. How did you solve this?
05-14-2016
02:18 PM
@Neeraj Sabharwal: But before the permission-denied message, the error also gives the following. I see this person had the same problem: https://community.hortonworks.com/questions/18261/tez-session-is-already-shutdown-failed-2-times-due.html. I would rather not switch the engine to MR.
Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
I have checked, and Tez is running. What could be the problem here? I checked the YARN logs for the application ID (the exact command is below the trace) and got this:
ERROR org.apache.sqoop.tool.ImportTool - Encountered IOException running import job: java.io.IOException: Hive exited with status 1
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:394)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:344)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:245)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
2016-05-14 19:45:56,806 ERROR [main] tool.ImportTool (ImportTool.java:run(613)) - Encountered IOException running import job: java.io.IOException: Hive exited with status 1
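For reference, this is how I pulled those YARN logs; the application ID is the one from the diagnostics above:
# fetch the aggregated container logs for the failed application
yarn logs -applicationId application_1463224371637_0088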
05-14-2016
01:54 PM
@Vinayak Agrawal: Mind sharing your solution? I have exactly the same problem.
05-14-2016
01:51 PM
Yes, of course. I am using Ambari, and it did indeed create the rest of the users, but not yarn. I created the user and assigned it to the hdfs group. In fact, I checked (the exact commands are at the end of this post) and found that the permissions there look like this: drwxrwxrwx - yarn hdfs 0 2016-05-14 19:14 /tmp/hive/yarn. I am wondering why it stores the session in /tmp/hive/yarn. Shouldn't it be /tmp/hive/oozie when I run the job through Oozie? I am still getting this: @Neeraj Sabharwal
Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
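For completeness, these are the commands I used to inspect the permissions; the session-directory name is copied from the diagnostics above, and I run them as the hdfs superuser:
# list the Hive scratch dir and the Tez session dir that the oozie user cannot read
sudo -u hdfs hadoop fs -ls /tmp/hive
sudo -u hdfs hadoop fs -ls /tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1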
05-14-2016
12:42 PM
@Neeraj Sabharwal: Thanks. I just realized that the yarn user is missing from HDFS. I can create the user, but I would like to know the usual reasons why users sometimes aren't created properly. It has happened before as well.
05-14-2016
12:38 PM
You're right. It's not that I have skipped the reading part, but I would appreciate it if you could suggest a fairly straightforward source. Also, I see the problem here is with permissions. Could you please check the update? @Artem Ervits