Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

--map-column hive not creating specific column dataype

avatar

Hello,

I recently started working on sqoop and HDFS related stuffs and I faced an issue by using --map-column hive

The datatype of the specified column doesn't change and below is the sqoop commmand to hdfs as parquet file,

sqoop import -Dmapreduce.job.queuename=xxxxx--connect jdbc:oracle:thin:@xxxx:1521:xxx--username xxxx--password xxx --query "SELECT * FROM imdb_pub_f_initiatives_v WHERE F_Ini_CouNTRY_CODE IN ('FR','XGL') AND \$CONDITIONS" --split-by f_ini_code -m 20 --map-column-hive "F_INI_APPEARANCE_DATE=date" --as-parquetfile --delete-target-dir --target-dir /user/ramaga09/gpub/FR/initiative/imdb_pub_f_initiatives_v2

I am running Sqoop version: 1.4.6-cdh5.5.2m, when I checked in the _metadata files in HDFS and I could see as "long" changes were not reflecting.

Please correct me if am doing any wrong.

12688-sqoop-map-column-hive-date-attribute-issue.png

Many Thanks,

Ganeshbabu R

6 REPLIES 6

avatar
Master Mentor

@Ganeshbabu Ramamoorthy can you remove the double quotes and try again? Also, you'd mentioned CDH, what version of Hive is it? Date type was only introduced in Hive 0.12, chance is CDH does not include it? https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-Date/TimeT...

--map-column-hive F_INI_APPEARANCE_DATE=date

I also recommend looking at the Java class code generated after you run this job in the directory you execute sqoop from, in there you should be able to find the type it generates.

avatar

@Artem Ervits I removed those double quotes and tried again with the same sqoop command but still gettting the same datatype was not changed.

we are using Hive version is 1.1.0

I checked the java class code in the sqoop execution path but couldn't see any changes related to that command and below is the warning logs showed during sqoop,

Warning: /opt/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p0.4/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/02/20 10:58:44 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.2
17/02/20 10:58:44 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/02/20 10:58:44 WARN tool.BaseSqoopTool: It seems that you've specified at least one of following:
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --hive-home
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --hive-overwrite
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --create-hive-table
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --hive-table
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --hive-partition-key
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --hive-partition-value
17/02/20 10:58:44 WARN tool.BaseSqoopTool:      --map-column-hive
17/02/20 10:58:44 WARN tool.BaseSqoopTool: Without specifying parameter --hive-import. Please note that
17/02/20 10:58:44 WARN tool.BaseSqoopTool: those arguments will not be used in this session. Either
17/02/20 10:58:44 WARN tool.BaseSqoopTool: specify --hive-import to apply them correctly or remove them
17/02/20 10:58:44 WARN tool.BaseSqoopTool: from command line to remove this warning.
17/02/20 10:58:44 INFO tool.BaseSqoopTool: Please note that --hive-home, --hive-partition-key,
17/02/20 10:58:44 INFO tool.BaseSqoopTool:       hive-partition-value and --map-column-hive options are
17/02/20 10:58:44 INFO tool.BaseSqoopTool:       are also valid for HCatalog imports and exports
17/02/20 10:58:44 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
17/02/20 10:58:44 INFO manager.SqlManager: Using default fetchSize of 1000
17/02/20 10:58:44 INFO tool.CodeGenTool: Beginning code generation

Not sure we need to include we need to add those parameter..

avatar

Hi @Artem Ervits

Yes I could see the java class changes related the date field (F_INI_APPEARANCE_DATE) and below is the code generated in the class file,

Let me know your thoughts on this..

But why type is not showing in the .metadata file, attached the sample image,

12689-sqoop-map-column-hive-date-attribute-issue.png

sample code:-

 private java.sql.Timestamp F_INI_APPEARANCE_DATE;
  public java.sql.Timestamp get_F_INI_APPEARANCE_DATE() {
    return F_INI_APPEARANCE_DATE;
  }

public void set_F_INI_APPEARANCE_DATE(java.sql.Timestamp F_INI_APPEARANCE_DATE) {
    this.F_INI_APPEARANCE_DATE = F_INI_APPEARANCE_DATE;
  }
  public QueryResult with_F_INI_APPEARANCE_DATE(java.sql.Timestamp F_INI_APPEARANCE_DATE) {
    this.F_INI_APPEARANCE_DATE = F_INI_APPEARANCE_DATE;
    return this;
  }

avatar
Master Mentor

The avro in Hive 1.1 is Avro 1.7.4, it does not support date types, those were introduced in Avro 1.8 that explains why they are not showing. You can confirm by looking in your Hive lib directory for specific avro version.

https://issues.apache.org/jira/browse/AVRO-739

I haven't tried it myself but if you can coerce sqoop into using avro 1.8.x maybe you can achieve what you want with date type, maybe try adding it to sqoop lib folder? If my answers were at all useful, please consider accepting the answer as best.

avatar

@Artem Ervits I will do the changes in our sqoop lib folder and will post my response back to you 🙂

avatar
New Contributor

--as-textfile
Maps all the columns to their respective datatypes mentioned in –map-column-hive
--as-parquet-file
Does not change any datatype for the columns mentioned in –map-column-hive

 

Please reply on this, if you have got answer