Member since: 05-19-2016
Posts: 216
Kudos Received: 20
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4193 | 05-29-2018 11:56 PM
 | 7031 | 07-06-2017 02:50 AM
 | 3768 | 10-09-2016 12:51 AM
 | 3541 | 05-13-2016 04:17 AM
07-03-2016
10:09 PM
Thanks. I feel a bit relaxed after hearing something on this. I am using CM, and I tried setting uber mode off; it still has the same problem. 😞 I haven't seen any of my merge jobs succeed. I have tried it with different data from different databases as well, but it has never worked. @yshi
07-02-2016
05:40 AM
I have asked this question repeatedly and am starting to believe I am missing something really basic here. Not many people seem to have come across this, and I am really stuck on this one: I get this error when I specify the --merge-key argument with an incremental lastmodified import in Sqoop. If I run the job from the command line, it works fine, including the merge step, but not when I submit it through Oozie. I am not sure whether the problem is Oozie or Hue, but the Sqoop job itself is not, since it works when executed from the command line. My Sqoop job looks like this:
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop
--create test_table -- import --driver com.mysql.jdbc.Driver --connect
jdbc:mysql://IP/DB?zeroDateTimeBehavior=convertToNull --username
USER_NAME --password 'PASSWORD' --table test_table --merge-key id --
split-by id --target-dir LOCATION --incremental lastmodified
--last-value 0 --check-column updated_at
The first import works fine. Starting with the second import, I get the errors below. I created a small test table with an int, a datetime, and a varchar column, without any NULLs or invalid characters in the data, and yet I faced the same issue:
# id, updated_at, name
'1', '2016-07-02 17:16:53', 'l'
'3', '2016-06-29 14:12:53', 'f'
There were only 2 rows in the data, and yet I got this:
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 3 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 1 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
I get this error only in Oozie; I submit the job through Hue. It works just fine, including the merge MapReduce job, when I run the Sqoop job from the command line. Taken from the Oozie launcher, this is what my MapReduce job logs look like:
>>> Invoking Sqoop command line now >>>
5373 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5407 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.7.0
5702 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
5715 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5740 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
5754 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000
5754 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
6091 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
6098 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
6118 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/lib/hadoop-mapreduce
8173 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/454902ac78d49b783a1f51b7bfe0a2be/test_table.jar
8185 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Incremental import based on column updated_at
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Lower bound value: '2016-07-02 17:13:24.0'
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Upper bound value: '2016-07-02 17:16:56.0'
8194 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of test_table
8214 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
8230 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
8716 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
8717 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat - BoundingValsQuery: SELECT MIN(id), MAX(id) FROM test_table WHERE ( updated_at >= '2016-07-02 17:13:24.0' AND updated_at < '2016-07-02 17:16:56.0' )
8721 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.IntegerSplitter - Split size: 0; Num splits: 4 from: 1 to: 1
25461 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Transferred 26 bytes in 17.2192 seconds (1.5099 bytes/sec)
25471 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Retrieved 1 records.
25536 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.ExportJobBase - IOException checking input file header: java.io.EOFException
25550 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
Heart beat
Heart beat
70628 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.ImportTool - Merge MapReduce job failed!
70628 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Saving incremental import state to the metastore
70831 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Updated data for job: test_table
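For completeness, this is how I execute the saved job from the shell, where it runs fine, merge step included; FQDN is the same placeholder as above:
# run the saved job against the shared Sqoop metastore
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --exec test_table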
05-21-2016
03:45 AM
I have installed the Cloudera Express version on a CentOS machine with Cloudera Manager. I have previously worked with Oozie and have written workflows manually, without using any editor. Hue looks great, but for now I just want to deploy my workflow.xml/coordinator.xml and job.properties files so that I can launch an Oozie job from what I already have, rather than designing it in Hue. Previously I have had users created under the /home directory on my CentOS machine. Although I did NOT install Cloudera in single-user mode, I still do not see those users created under the home directory, so I am not able to log in as a different user with a command like su oozie. Also, if I want to check the Hadoop file system, I certainly cannot do that from the root user using the hadoop fs -ls command. What is the alternative for that? I am one day old with Cloudera and find it much more stable than the rest of the available distributions, so thank you for the great work. So:
1. On running a Sqoop job, I get java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver. Which directory do I add the driver to?
2. How do I deploy already-written workflows in Oozie? (A sketch of how I used to do this is below.)
3. Why are users like hdfs, oozie, and sqoop not created under the /home directory? Why am I not able to use su hdfs to log in as hdfs? I assumed these users would ideally have been created.
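For context, this is roughly how I deployed workflows on my previous, non-CM cluster; the HDFS path and the Oozie URL here are placeholders, not my actual setup:
# copy the workflow application (workflow.xml plus any libs) into HDFS
hdfs dfs -mkdir -p /user/me/apps/my-workflow
hdfs dfs -put workflow.xml /user/me/apps/my-workflow/
# job.properties stays local and points oozie.wf.application.path at that HDFS dir
oozie job -oozie http://OOZIE_HOST:11000/oozie -config job.properties -run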
05-14-2016
02:25 PM
@Nilesh: I have the exact same problem. How did you solve this?
05-14-2016
02:18 PM
@Neeraj Sabharwal: But before the permission-denied message, the error also gives the following. I see this person had the same problem: https://community.hortonworks.com/questions/18261/tez-session-is-already-shutdown-failed-2-times-due.html. I would rather not switch the engine to MR.
Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
I have checked, and Tez is running. What could be the problem here? I checked the YARN logs for the application ID (the exact command is below the trace) and got this:
ERROR org.apache.sqoop.tool.ImportTool - Encountered IOException running import job: java.io.IOException: Hive exited with status 1
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:394)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:344)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:245)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:177)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:241)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
2016-05-14 19:45:56,806 ERROR [main] tool.ImportTool (ImportTool.java:run(613)) - Encountered IOException running import job: java.io.IOException: Hive exited with status 1
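For reference, this is how I pulled those YARN logs; the application ID is the one from the diagnostics above:
# fetch the aggregated container logs for the failed application
yarn logs -applicationId application_1463224371637_0088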
05-14-2016
01:54 PM
@Vinayak Agrawal: Mind sharing your solution? I have exactly the same problem.
05-14-2016
01:51 PM
Yes, of course. I am using Ambari, and it did indeed create the rest of the users, but not yarn. I created the user and assigned it to the hdfs group. In fact, I checked (the exact commands are at the end of this post) and found that the permissions there look like this: drwxrwxrwx - yarn hdfs 0 2016-05-14 19:14 /tmp/hive/yarn. I am wondering why it stores the session in /tmp/hive/yarn. Shouldn't it be /tmp/hive/oozie when I run the job through Oozie? I am still getting this: @Neeraj Sabharwal
Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Exception in thread "main" java.lang.RuntimeException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1463224371637_0088 failed 2 times due to AM Container for appattempt_1463224371637_0088_000002 exited with exitCode: -1000
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - For more detailed output, check application tracking page:http://warehouse.swtched.com:8088/cluster/app/application_1463224371637_0088Then, click on links to logs of each attempt.
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
2016-05-14 19:45:56,425 INFO [Thread-30] hive.HiveImport (LoggingAsyncSink.java:run(85)) - Diagnostics: Permission denied: user=oozie, access=EXECUTE, inode="/tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1/hive-hcatalog-core.jar":yarn:hdfs:drwx------
41399 [Thread-30] INFO org.apache.sqoop.hive.HiveImport - at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
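For completeness, these are the commands I used to inspect the permissions; the session-directory name is copied from the diagnostics above, and I run them as the hdfs superuser:
# list the Hive scratch dir and the Tez session dir that the oozie user cannot read
sudo -u hdfs hadoop fs -ls /tmp/hive
sudo -u hdfs hadoop fs -ls /tmp/hive/yarn/_tez_session_dir/f1b2db7d-0836-4330-849c-dc3e4d6dc2d1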
05-14-2016
12:42 PM
@Neeraj Sabharwal: Thanks. I just realized that the yarn user is missing from HDFS. I can create the user, but I would like to know the usual reasons why users sometimes aren't created properly. It has happened before as well.
05-14-2016
12:38 PM
You're right. It's not that I have skipped the reading part, but I would appreciate it if you could suggest a fairly straightforward source. Also, I see the problem here is with permissions. Could you please check the update? @Artem Ervits