Member since 05-16-2016
270 Posts
18 Kudos Received
4 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1732 | 07-23-2016 11:36 AM
 | 3084 | 07-23-2016 11:35 AM
 | 1579 | 06-05-2016 10:41 AM
 | 1168 | 06-05-2016 10:37 AM
07-12-2016
01:44 PM
@bpreachuk Oh no, that was a typo. I changed it to TO_DATE(SFO.created_at) <= '2016-05-31' and it still returns an empty result set.
07-12-2016
04:59 AM
The IN operator in Hive works just fine in this case, but NOT IN doesn't:
select DISTINCT SF.customer_email
FROM
Magento.sales_flat_order SF
WHERE
YEAR(TO_DATE(SF.created_at)) = '2016'
AND
MONTH(TO_DATE(SF.created_at)) = '6'
AND
SF.customer_email
NOT IN (
select SFO.customer_email FROM Magento.Sales_flat_order SFO
WHERE
TO_DATE(SFO.created_at) >= '2016-05-31'
)
If I replace NOT IN with the IN operator, the query works. In fact, NOT IN with an explicit list of strings works too, but it somehow does not work with a select statement. How do I get this working?
Labels: Apache Hive
07-07-2016
10:03 AM
I need to find new email IDs added in 06-2016 that never existed in the database before that. I wrote the query using the NOT IN operator:
select DISTINCT SF.customer_email
FROM
Magento.sales_flat_order SF
WHERE
YEAR(TO_DATE(SF.created_at)) = '2016'
AND
MONTH(TO_DATE(SF.created_at)) = '6'
AND
SF.customer_email
NOT IN (
select SFO.customer_email FROM Magento.Sales_flat_order SFO
WHERE
TO_DATE(SFO.created_at) <= '2016-05-31'
)
I have verified that there are new email IDs in that period, but the query returns an empty result set. Why is that? In fact, when I replace the NOT IN operator with IN, it does return the common ones, but somehow NOT IN is behaving erratically. Is there an alternative way I can do it?
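One common reason a NOT IN subquery returns nothing is a NULL among the values the subquery produces: under SQL's three-valued logic, `x NOT IN (..., NULL)` can never evaluate to TRUE. Whether customer_email actually contains NULLs in this table is an assumption, since the post does not say. The sketch below reproduces the effect with Python's built-in sqlite3 module as a stand-in for Hive; the table contents are made up, and the queries were not run against Hive itself.

```python
import sqlite3

# Hypothetical stand-in for Magento.sales_flat_order; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_flat_order (customer_email TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO sales_flat_order VALUES (?, ?)",
    [
        ("old@example.com", "2016-03-10"),
        (None, "2016-04-02"),               # an older order with a NULL email
        ("old@example.com", "2016-06-05"),
        ("new@example.com", "2016-06-12"),
    ],
)


def emails(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# 1) Plain NOT IN: the subquery yields a NULL, so no row passes the filter.
not_in_raw = emails("""
    SELECT DISTINCT sf.customer_email
    FROM sales_flat_order sf
    WHERE sf.created_at >= '2016-06-01' AND sf.created_at < '2016-07-01'
      AND sf.customer_email NOT IN (
          SELECT sfo.customer_email FROM sales_flat_order sfo
          WHERE sfo.created_at <= '2016-05-31')
""")

# 2) Same query, but NULLs are filtered out of the subquery.
not_in_filtered = emails("""
    SELECT DISTINCT sf.customer_email
    FROM sales_flat_order sf
    WHERE sf.created_at >= '2016-06-01' AND sf.created_at < '2016-07-01'
      AND sf.customer_email NOT IN (
          SELECT sfo.customer_email FROM sales_flat_order sfo
          WHERE sfo.created_at <= '2016-05-31'
            AND sfo.customer_email IS NOT NULL)
""")

# 3) Alternative without NOT IN: LEFT JOIN and keep the unmatched rows.
left_join = emails("""
    SELECT DISTINCT sf.customer_email
    FROM sales_flat_order sf
    LEFT JOIN (
        SELECT DISTINCT customer_email FROM sales_flat_order
        WHERE created_at <= '2016-05-31' AND customer_email IS NOT NULL
    ) prev ON sf.customer_email = prev.customer_email
    WHERE sf.created_at >= '2016-06-01' AND sf.created_at < '2016-07-01'
      AND prev.customer_email IS NULL
""")

print(not_in_raw)
print(not_in_filtered)
print(left_join)
```

If NULL emails turn out to be the cause, adding an IS NOT NULL condition inside the subquery, or switching to the LEFT JOIN form, would be the corresponding change to try in the Hive query.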
Labels: Apache Hive
07-07-2016
07:22 AM
I have a subquery like this:
select SF.customer_email , SF.created_at
FROM
Table1 SF
WHERE
YEAR(TO_DATE(SF.created_at)) = '2016'
AND
MONTH(TO_DATE(SF.created_at)) = '6'
AND
SF.customer_email
NOT IN (
select SFO.customer_email FROM Table1 SFO
WHERE
TO_DATE(SFO.created_at) < '2016-05-31'
)
I have checked manually, and the query should return results, but it returns an empty result set. Note: I am using the same table in the subquery as well, just with a different condition on the date column.
Labels: Apache Hive
07-05-2016
05:14 AM
Hey, thanks, but it has always worked through the CLI and never from Oozie. Should I still create a new issue for that? @Predrag Minovic
07-02-2016
01:34 PM
@Predrag Minovic: Let me know if you find anything. It looks like a bug to me.
07-02-2016
12:14 PM
Hey, I just realized that it happens only when I submit the job to Oozie. It works fine when I run the Sqoop jobs from the command line. I am submitting my jobs using Hue. What could be the reason? @Predrag Minovic
07-02-2016
11:41 AM
@Predrag Minovic: I have updated the question. Please have a look. The problem arises only when I specify a merge key. Incremental append works alright, but for incremental lastmodified I have to specify a merge key for the updated rows, and that is exactly what isn't working. Also, if I run the Sqoop job from the command line, it works just fine. Please note that I get this problem only when I run the job through Oozie.
07-02-2016
11:11 AM
@Predrag Minovic: I did try dropping --hive-drop-import-delims. Also, id is a primary key and an auto-incrementing integer. I am out of ideas about what the problem could really be.
07-02-2016
10:22 AM
I have asked this question over and over and am starting to believe that I am missing something really basic here. Not a lot of people seem to have come across this, and I am really stuck on this one:
I get this error when I specify the --merge-key argument with an incremental lastmodified import.
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --create JOB_NAME -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://IP/erp?zeroDateTimeBehavior=convertToNull --username USERNAME --password 'PASSWORD' --table TABLE_NAME --merge-key id --split-by id --incremental lastmodified --check-column update_date --last-value 0 --null-string '\\N' --null-non-string '\\N' --fields-terminated-by '\001' --hive-drop-import-delims
The first import works alright. Starting with the second import, I get:
Error: java.io.IOException: Illegal partition for 1 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 67204 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 201610 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Why is this happening? What is really the solution? In the error line "Illegal partition for 1", 1 is a value in the column specified in the --merge-key argument.
As suggested, I finally created a small test table with an int, a datetime and a varchar column, without any NULLs or invalid characters in the data, and yet I faced the same issue:
sqoop job --meta-connect jdbc:hsqldb:hsql://FQDN:16000/sqoop --create test_table -- import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://IP/DB?zeroDateTimeBehavior=convertToNull --username USER_NAME --password 'PASSWORD' --table test_table --merge-key id --split-by id --target-dir LOCATION --incremental lastmodified --last-value 0 --check-column updated_at
There were only 2 rows in the data, and yet I got this:
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 3 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.lang.IllegalArgumentException
at java.nio.ByteBuffer.allocate(ByteBuffer.java:330)
at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:51)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1848)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1508)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:723)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Error: java.io.IOException: Illegal partition for 1 (-2)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1083)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:82)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Here's what my data looks like:
# id, updated_at, name
'1', '2016-06-29 14:08:55', 'c'
'3', '2016-06-29 14:12:53', 'f'
I missed a really important detail here: I get this error only in Oozie, when I submit the job through Hue. Everything works just fine, including the merge MapReduce job, when I run the Sqoop job from the command line.
Taken from the Oozie launcher, this is what my MapReduce job logs look like:
>>> Invoking Sqoop command line now >>>
5373 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5407 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.7.0
5702 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
5715 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5740 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
5754 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000
5754 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
6091 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
6098 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
6118 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/lib/hadoop-mapreduce
8173 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/454902ac78d49b783a1f51b7bfe0a2be/test_table.jar
8185 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Incremental import based on column updated_at
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Lower bound value: '2016-07-02 17:13:24.0'
8192 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Upper bound value: '2016-07-02 17:16:56.0'
8194 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of test_table
8214 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM test_table AS t WHERE 1=0
8230 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
8716 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
8717 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat - BoundingValsQuery: SELECT MIN(id), MAX(id) FROM test_table WHERE ( updated_at >= '2016-07-02 17:13:24.0' AND updated_at < '2016-07-02 17:16:56.0' )
8721 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.IntegerSplitter - Split size: 0; Num splits: 4 from: 1 to: 1
25461 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Transferred 26 bytes in 17.2192 seconds (1.5099 bytes/sec)
25471 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Retrieved 1 records.
25536 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.ExportJobBase - IOException checking input file header: java.io.EOFException
25550 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
Heart beat
Heart beat
70628 [uber-SubtaskRunner] ERROR org.apache.sqoop.tool.ImportTool - Merge MapReduce job failed!
70628 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Saving incremental import state to the metastore
70831 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.ImportTool - Updated data for job: test_table
Labels: Apache Sqoop