Created on 02-20-2017 02:59 PM - edited 08-19-2019 03:56 AM
Hello,
I recently started working on sqoop and HDFS related stuffs and I faced an issue by using --map-column hive
The datatype of the specified column doesn't change and below is the sqoop commmand to hdfs as parquet file,
sqoop import -Dmapreduce.job.queuename=xxxxx--connect jdbc:oracle:thin:@xxxx:1521:xxx--username xxxx--password xxx --query "SELECT * FROM imdb_pub_f_initiatives_v WHERE F_Ini_CouNTRY_CODE IN ('FR','XGL') AND \$CONDITIONS" --split-by f_ini_code -m 20 --map-column-hive "F_INI_APPEARANCE_DATE=date" --as-parquetfile --delete-target-dir --target-dir /user/ramaga09/gpub/FR/initiative/imdb_pub_f_initiatives_v2I am running Sqoop version: 1.4.6-cdh5.5.2m, when I checked in the _metadata files in HDFS and I could see as "long" changes were not reflecting.
Please correct me if am doing any wrong.
Many Thanks,
Ganeshbabu R
Created 02-20-2017 03:18 PM
@Ganeshbabu Ramamoorthy can you remove the double quotes and try again? Also, you'd mentioned CDH, what version of Hive is it? Date type was only introduced in Hive 0.12, chance is CDH does not include it? https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Date/TimeT...
--map-column-hive F_INI_APPEARANCE_DATE=date
I also recommend looking at the Java class code generated after you run this job in the directory you execute sqoop from, in there you should be able to find the type it generates.
Created 02-20-2017 04:03 PM
@Artem Ervits I removed those double quotes and tried again with the same sqoop command but still gettting the same datatype was not changed.
we are using Hive version is 1.1.0
I checked the java class code in the sqoop execution path but couldn't see any changes related to that command and below is the warning logs showed during sqoop,
Warning: /opt/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p0.4/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. 17/02/20 10:58:44 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.2 17/02/20 10:58:44 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 17/02/20 10:58:44 WARN tool.BaseSqoopTool: It seems that you've specified at least one of following: 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --hive-home 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --hive-overwrite 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --create-hive-table 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --hive-table 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --hive-partition-key 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --hive-partition-value 17/02/20 10:58:44 WARN tool.BaseSqoopTool: --map-column-hive 17/02/20 10:58:44 WARN tool.BaseSqoopTool: Without specifying parameter --hive-import. Please note that 17/02/20 10:58:44 WARN tool.BaseSqoopTool: those arguments will not be used in this session. Either 17/02/20 10:58:44 WARN tool.BaseSqoopTool: specify --hive-import to apply them correctly or remove them 17/02/20 10:58:44 WARN tool.BaseSqoopTool: from command line to remove this warning. 17/02/20 10:58:44 INFO tool.BaseSqoopTool: Please note that --hive-home, --hive-partition-key, 17/02/20 10:58:44 INFO tool.BaseSqoopTool: hive-partition-value and --map-column-hive options are 17/02/20 10:58:44 INFO tool.BaseSqoopTool: are also valid for HCatalog imports and exports 17/02/20 10:58:44 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled. 17/02/20 10:58:44 INFO manager.SqlManager: Using default fetchSize of 1000 17/02/20 10:58:44 INFO tool.CodeGenTool: Beginning code generation
Not sure we need to include we need to add those parameter..
Created on 02-20-2017 04:24 PM - edited 08-19-2019 03:56 AM
Yes I could see the java class changes related the date field (F_INI_APPEARANCE_DATE) and below is the code generated in the class file,
Let me know your thoughts on this..
But why type is not showing in the .metadata file, attached the sample image,
sample code:-
 private java.sql.Timestamp F_INI_APPEARANCE_DATE;
  public java.sql.Timestamp get_F_INI_APPEARANCE_DATE() {
    return F_INI_APPEARANCE_DATE;
  }
public void set_F_INI_APPEARANCE_DATE(java.sql.Timestamp F_INI_APPEARANCE_DATE) {
    this.F_INI_APPEARANCE_DATE = F_INI_APPEARANCE_DATE;
  }
  public QueryResult with_F_INI_APPEARANCE_DATE(java.sql.Timestamp F_INI_APPEARANCE_DATE) {
    this.F_INI_APPEARANCE_DATE = F_INI_APPEARANCE_DATE;
    return this;
  }
					
				
			
			
				
			
			
			
			
			
			
			
		Created 02-20-2017 05:02 PM
The avro in Hive 1.1 is Avro 1.7.4, it does not support date types, those were introduced in Avro 1.8 that explains why they are not showing. You can confirm by looking in your Hive lib directory for specific avro version.
https://issues.apache.org/jira/browse/AVRO-739
I haven't tried it myself but if you can coerce sqoop into using avro 1.8.x maybe you can achieve what you want with date type, maybe try adding it to sqoop lib folder? If my answers were at all useful, please consider accepting the answer as best.
Created 02-20-2017 06:36 PM
@Artem Ervits I will do the changes in our sqoop lib folder and will post my response back to you 🙂
Created 01-14-2020 11:22 PM
--as-textfile
Maps all the columns to their respective datatypes mentioned in –map-column-hive
--as-parquet-file
Does not change any datatype for the columns mentioned in –map-column-hive
Please reply on this, if you have got answer
 
					
				
				
			
		
