This error happens on big tables where a map join is not possible and the optimizer chooses a merge join. To reproduce it I have given the steps below; I am using Hive version 1.2.1. The issue only happens in a "tez" session and is not reproducible in "mr".

set hive.execution.engine=tez;

create table default.asim_test1 (
  col1 STRING,
  col2 STRING,
  col3 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';

insert into default.asim_test1 values ('105','test','2017-01-01 12:30:30');

-- To force the optimizer to choose a merge join I execute the command below.
-- I don't need this in the real case: the table is too big to fit in memory,
-- so the optimizer chooses a merge join instead of a map join on its own.
set hive.auto.convert.join=false;

-- Below is the query which errors out.
select A.col1, A.col2, A.lastmodifiedtimestamp current_modified,
       B.col2, B.lastmodifiedtimestamp hist_modified
from (
  select * from (
    select col1, col2,
           FROM_UTC_TIMESTAMP(CAST(from_unixtime(unix_timestamp(regexp_replace(substr(col3,1,19),'T',' '))) AS TIMESTAMP),'CST') lastmodifiedtimestamp,
           ROW_NUMBER() over (PARTITION BY col1 ORDER BY col3 desc) rnm
    from default.asim_test1
  ) A
  WHERE rnm = 1
) A
INNER JOIN (
  select * from (
    select col1, col2,
           FROM_UTC_TIMESTAMP(CAST(from_unixtime(unix_timestamp(regexp_replace(substr(col3,1,19),'T',' '))) AS TIMESTAMP),'CST') lastmodifiedtimestamp,
           ROW_NUMBER() over (PARTITION BY col1 ORDER BY col3 desc) rnm
    from default.asim_test1
  ) A
  WHERE rnm = 1
) B
ON A.col1 = B.col1
limit 10;

Below is the error I am getting:

Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: org.apache.hadoop.hive.serde2.io.TimestampWritable cannot be cast to java.sql.Timestamp
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:313)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)

Any advice will be appreciated.
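For clarity, the nested string-to-timestamp expression in the query normalizes col3 by taking the first 19 characters, replacing a possible ISO 'T' separator with a space, and parsing the result. A minimal Python sketch of that string transformation (not Hive code; the final FROM_UTC_TIMESTAMP(..., 'CST') zone shift is omitted):

```python
from datetime import datetime

def to_timestamp(col3: str) -> datetime:
    """Mimics substr(col3,1,19) + regexp_replace('T',' ') + unix_timestamp parse.

    The FROM_UTC_TIMESTAMP(..., 'CST') step in the Hive query, which shifts
    the parsed value from UTC into CST, is intentionally left out here.
    """
    s = col3[:19].replace('T', ' ')                    # substr(col3, 1, 19), 'T' -> ' '
    return datetime.strptime(s, '%Y-%m-%d %H:%M:%S')   # unix_timestamp default pattern

# Both ISO-style and space-separated inputs normalize to the same value.
print(to_timestamp('2017-01-01T12:30:30'))
print(to_timestamp('2017-01-01 12:30:30.123456'))
```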
There is a bug in Apache Sqoop, described in the link below: the column label (TITLE) is evaluated first, so the actual column name is ignored. As a result, when we use the generic JDBC driver (no specific connection manager), the sqoop import fails with a "column not found" error.

https://issues.apache.org/jira/browse/SQOOP-585

I am seeing the same type of issue on Hortonworks as well; is there any chance of a patch for this?

Steps to reproduce the error:

1. In Teradata, create a table that has a TITLE on a column, using a different name for the TITLE (it must not be the same as the column name).
2. Try a sqoop import without specifying a connection manager, so that it falls back to the generic JDBC connection.
3. You will then see the "Column not found" error, because the Java code evaluates the TITLE before the column name: it treats the TITLE as the column name, and the query fails.

I understand this error does not happen when the Teradata connection manager is used, but how can we solve it without using the Teradata connection manager?
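To make the failure mode concrete, here is a small Python simulation of the label-versus-name mix-up described above. This is purely illustrative (the column name and TITLE value are hypothetical, and real Sqoop reads this metadata via JDBC's ResultSetMetaData, not a dict):

```python
# Hypothetical illustration of the SQOOP-585 failure mode: the metadata
# carries both a real column NAME and a display TITLE (label), and the
# buggy code uses the label as if it were the column name.
table_columns = {"emp_id": [101, 102]}                            # real column in the table
column_metadata = {"name": "emp_id", "label": "Employee Number"}  # TITLE differs from name

def read_column(meta, columns):
    # Buggy behaviour: prefer the label (TITLE) over the real column name.
    key = meta["label"] or meta["name"]
    if key not in columns:
        raise KeyError(f"Column not found: {key}")
    return columns[key]

lookup_failed = False
try:
    read_column(column_metadata, table_columns)
except KeyError as e:
    lookup_failed = True          # the TITLE is not a real column, so this fails
    print(e)

# Fixed behaviour: always resolve by the real column name.
fixed = table_columns[column_metadata["name"]]
```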