Explorer
Posts: 7
Registered: ‎11-11-2014

Importing data from oracle using sqoop into a partitioned hive table as parquet

Hi 

 

I'm trying to import data from Oracle into Hive as a Parquet file. The import works fine when the Hive table is not partitioned, but the same import fails when I use the --hive-partition-key option.

 

Works fine

==========

sqoop import --connect jdbc:oracle:thin:@//xxxx --username xxx --password xxxx --table xxxx --columns "col1,col2,...,colx" -m 1 --hive-import --hive-database sandbox --hive-table parq_test --as-parquetfile --null-string '\\N' --null-non-string '\\N' --hive-drop-import-delims --target-dir /tmp/sqp_xxxx --verbose

 

 

Fails

====

sqoop import --connect jdbc:oracle:thin:@//xxxxx --username xxxxx --password xxxxx --table xxxx --columns "col1,col2,...,coln" -m 1 --hive-import --hive-database xxx --hive-table parq_test_partitions --hive-partition-key run_id --hive-partition-value "111" --as-parquetfile --null-string '\\N' --null-non-string '\\N' --hive-drop-import-delims --target-dir /tmp/sqp_xxx --verbose

 

Error message

 

Error: java.lang.IllegalArgumentException: Cannot construct key, missing provided value: run_id
at org.kitesdk.shaded.com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at org.kitesdk.data.spi.EntityAccessor.partitionValue(EntityAccessor.java:128)
at org.kitesdk.data.spi.EntityAccessor.keyFor(EntityAccessor.java:111)
at org.kitesdk.data.spi.filesystem.PartitionedDatasetWriter.write(PartitionedDatasetWriter.java:158)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:325)
at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:304)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

 

 

Can someone help me with this?

 

Regards

Suresh

New Contributor
Posts: 1
Registered: ‎03-20-2017

Re: Importing data from oracle using sqoop into a partitioned hive table as parquet

I have encountered the same issue. It seems that --hive-partition-key doesn't work together with --as-parquetfile. Does anyone know why, and how to fix it?
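Judging by the stack trace, Kite's PartitionedDatasetWriter expects the partition column (run_id) to be a field of every imported record, but the Oracle source table has no run_id column, hence "Cannot construct key, missing provided value: run_id". A workaround some people use is a two-step load: import as Parquet into a non-partitioned staging table (the variant that already works), then let Hive write the static partition. This is only a sketch; all host, database, and table names below are placeholders:

```shell
# Step 1: import the Oracle table as Parquet into a plain (non-partitioned)
# staging table -- no --hive-partition-key, so Kite never needs run_id.
sqoop import \
  --connect jdbc:oracle:thin:@//host/service \
  --username user --password pass \
  --table SRC_TABLE -m 1 \
  --hive-import --hive-database sandbox --hive-table parq_test_staging \
  --as-parquetfile \
  --null-string '\\N' --null-non-string '\\N' \
  --hive-drop-import-delims \
  --target-dir /tmp/sqp_staging

# Step 2: copy the staged rows into the partitioned Parquet table with a
# static partition value, letting Hive (not Sqoop/Kite) create the partition.
hive -e "
  INSERT INTO TABLE sandbox.parq_test_partitions PARTITION (run_id='111')
  SELECT * FROM sandbox.parq_test_staging;
"
```

The staging table can be truncated or dropped after each run; the partition value plays the role of the --hive-partition-value from the failing command.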

New Contributor
Posts: 2
Registered: ‎09-27-2017

Re: Importing data from oracle using sqoop into a partitioned hive table as parquet

I'm facing the same problem and cannot fix it.
