
Sqoop import into Hive as Parquet fails for decimal type

New Contributor

Hello,

I am trying to import a table from MS SQL Server into Hive as Parquet, and one of the columns is a decimal type. By default, Sqoop maps the decimal column to a double, but unfortunately that causes precision issues for some of our calculations, which is why I am enabling the Avro decimal logical type in the command below.
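To make the precision concern concrete, here is a small, self-contained Java sketch (the class name and the sample value are illustrative only; the real column is presumably DECIMAL(19,6) on the SQL Server side, given the precision and scale in the Avro schema from the error below) showing that a round trip through double is lossy:

import java.math.BigDecimal;

public class DecimalPrecisionDemo {
    public static void main(String[] args) {
        // A value that is exact as DECIMAL(19,6) but carries more significant
        // decimal digits (19) than a double can represent (roughly 15-16).
        BigDecimal exact = new BigDecimal("1234567890123.123456");

        // What we effectively get when the column is mapped to a double.
        BigDecimal viaDouble = new BigDecimal(exact.doubleValue());

        System.out.println("exact      = " + exact);
        System.out.println("via double = " + viaDouble);
        // Prints false: the round trip through double changes the value.
        System.out.println("lossless?    " + (exact.compareTo(viaDouble) == 0));
    }
}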

Right now, I am getting the following error running in an HDP 2.4 sandbox:

Import command:

[root@sandbox sqoop]# sqoop import \
    -Dsqoop.avro.logical_types.decimal.enable=true \
    --hive-import --num-mappers 1 \
    --connect "jdbc:sqlserver://<conn_string>" \
    --username uname --password pass \
    --hive-overwrite --hive-database default \
    --table SqoopDecimalTest \
    --driver com.microsoft.sqlserver.jdbc.SQLServerDriver \
    --null-string '\\N' \
    --as-parquetfile

Error: org.kitesdk.data.DatasetOperationException: Failed to append {"id": 1, "price": 19.123450} to ParquetAppender{path=hdfs://sandbox.hortonworks.com:8020/tmp/default/.temp/job_1514513583437_0001/mr/attempt_1514513583437_0001_m_000000_0/.6b8d110f-6d1a-450c-93e4-c3db1a421476.parquet.tmp, schema={"type":"record","name":"SqoopDecimalTest","doc":"Sqoop import of SqoopDecimalTest","fields":[{"name":"id","type":["null","int"],"default":null,"columnName":"id","sqlType":"4"},{"name":"price","type":["null",{"type":"bytes","logicalType":"decimal","precision":19,"scale":6}],"default":null,"columnName":"price","sqlType":"3"}],"tableName":"SqoopDecimalTest"}, fileSystem=DFS[DFSClient[clientName=DFSClient_attempt_1514513583437_0001_m_000000_0_1859161154_1, ugi=root (auth:SIMPLE)]], avroParquetWriter=org.apache.parquet.avro.AvroParquetWriter@f60f96b}
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:194)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:326)
	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat$DatasetRecordWriter.write(DatasetKeyOutputFormat.java:305)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:658)
	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:70)
	at org.apache.sqoop.mapreduce.ParquetImportMapper.map(ParquetImportMapper.java:39)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

Caused by: java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.nio.ByteBuffer
	at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:257)
	at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:167)
	at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:142)
	at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:121)
	at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:288)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:74)
	at org.kitesdk.data.spi.filesystem.ParquetAppender.append(ParquetAppender.java:35)
	at org.kitesdk.data.spi.filesystem.FileSystemWriter.write(FileSystemWriter.java:188)

I am running Sqoop 1.4.7 built against Kite 1.1.1-SNAPSHOT (the master branch), because the current Kite release, 1.0.0, uses parquet-avro 1.6.0 and I thought that building against parquet-avro 1.8.1 might help. I get the same error with both versions.
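If it helps the diagnosis, the ClassCastException looks to me like the writer is handed the raw java.math.BigDecimal while its Avro data model has no conversion registered for the decimal logical type, so the value falls through to the plain bytes path and the cast to ByteBuffer fails. Below is a minimal standalone sketch, outside of Sqoop/Kite, of how I believe the same record can be written with parquet-avro 1.8.x and Avro 1.8.x once a DecimalConversion is registered on the data model (the class name, output path, and sample value are placeholders of my own; I do not know whether Sqoop/Kite exposes a way to plug in such a model, which is really my question):

import java.math.BigDecimal;

import org.apache.avro.Conversions;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class DecimalParquetWriteSketch {
    public static void main(String[] args) throws Exception {
        // Same shape as the Sqoop-generated schema in the error above:
        // "price" is bytes with the decimal(19,6) logical type.
        Schema schema = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"SqoopDecimalTest\",\"fields\":["
          + "{\"name\":\"id\",\"type\":[\"null\",\"int\"],\"default\":null},"
          + "{\"name\":\"price\",\"type\":[\"null\",{\"type\":\"bytes\","
          + "\"logicalType\":\"decimal\",\"precision\":19,\"scale\":6}],"
          + "\"default\":null}]}");

        // Registering the decimal conversion is what lets the writer turn a
        // BigDecimal into the bytes the schema expects; without it the value
        // reaches AvroWriteSupport unconverted and the cast to ByteBuffer fails.
        GenericData model = new GenericData();
        model.addLogicalTypeConversion(new Conversions.DecimalConversion());

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 1);
        record.put("price", new BigDecimal("19.123450")); // scale 6, as in the schema

        // Placeholder output path for the test file.
        ParquetWriter<GenericRecord> writer =
            AvroParquetWriter.<GenericRecord>builder(new Path("/tmp/decimal-test.parquet"))
                .withSchema(schema)
                .withDataModel(model)
                .build();
        try {
            writer.write(record);
        } finally {
            writer.close();
        }
    }
}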

Does anyone know what might be wrong? Or, is the answer that this is simply not supported in Sqoop? Any ideas would be greatly appreciated!

Thank you,

Subhash


Re: Sqoop import into Hive as Parquet fails for decimal type

New Contributor

Hello @subhash_sriram,

I am encountering the same issue. Did you find a solution?

Re: Sqoop import into Hive as Parquet fails for decimal type

Community Manager

@ou As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. 

 


Regards,

Vidya Sargur,
Community Manager

