Member since: 10-13-2016 · Posts: 9 · Kudos Received: 2 · Solutions: 0
02-08-2017 03:15 AM · 1 Kudo
1) The data is stored in Hadoop in ORC format.
2) CONTENT is one of the columns of the table; it is a CLOB in Oracle and a STRING in Hive. The following query is not working and fails with the error message shown further down:
SELECT * FROM database.table WHERE length(CONTENT) = 142418026;
3) From my investigation, rows whose CLOB value is larger than roughly 500 MB cannot be read in Hive.
4) I also tried reading the data through Spark RDDs, and that fails as well (see the Spark sketch after the stack trace).
Kindly let me know if there is any other procedure we can follow.
Hive Error: Task with the most failures(4):
-----
Task ID:
task_1485019777366_228562_m_000040
URL:
http://hpchdd2.hpc.ford.com:8088/taskdetails.jsp?jobid=job_1485019777366_228562&tipid=task_1485019777366_228562_m_000040
-----
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
... 11 more
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
at com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.<init>(OrcProto.java:1317)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics.<init>(OrcProto.java:1281)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1374)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics$1.parsePartialFrom(OrcProto.java:1369)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.<init>(OrcProto.java:4897)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.<init>(OrcProto.java:4813)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:5005)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics$1.parsePartialFrom(OrcProto.java:5000)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.<init>(OrcProto.java:6796)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.<init>(OrcProto.java:6722)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry$1.parsePartialFrom(OrcProto.java:6837)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry$1.parsePartialFrom(OrcProto.java:6832)
at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.<init>(OrcProto.java:7446)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.<init>(OrcProto.java:7393)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex$1.parsePartialFrom(OrcProto.java:7482)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex$1.parsePartialFrom(OrcProto.java:7477)
at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:217)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:223)
at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.parseFrom(OrcProto.java:7593)
at org.apache.hadoop.hive.ql.io.orc.MetadataReader.readRowIndex(MetadataReader.java:88)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readRowIndex(RecordReaderImpl.java:1194)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readRowIndex(RecordReaderImpl.java:1179)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:776)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:803)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1013)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1046)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:222)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:87)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:176)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createVectorizedReader(OrcInputFormat.java:1239)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1252)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
... 16 more
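For reference, here is a minimal Spark sketch of the kind of alternative read described in point 4, assuming a Spark 1.x HiveContext and the same table and query as above. Both config keys are real, but treating them as a way around this particular ORC protobuf size limit is an assumption, not a verified fix.

// Sketch only: the guess is that the failure happens while the ORC reader parses the
// row-index string statistics of the oversized CONTENT column, so disabling predicate /
// row-index filtering may keep that code path from being exercised.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object OrcClobReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("orc-clob-read-sketch"))
    val hc = new HiveContext(sc)

    hc.setConf("hive.optimize.index.filter", "false")   // skip ORC row-group (SARG) filtering on the Hive side
    hc.setConf("spark.sql.orc.filterPushdown", "false") // keep Spark's ORC reader from pushing the filter down

    val df = hc.sql("SELECT * FROM database.table WHERE length(CONTENT) = 142418026")
    println(df.count())
  }
}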
Labels: Apache Hive
11-19-2016 04:12 AM
Thank you staca, it's working fine. But the output XML is saved across 10 files, and I need it in a single file. How can we do that?
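A common way to end up with a single output file is to collapse the result to one partition before writing; everything then flows through a single task, so this is only practical when the result fits on one executor. Below is a minimal sketch, assuming the working job writes with the spark-xml data source; resultDf, the row tag, and the output path are hypothetical placeholders.

import org.apache.spark.sql.DataFrame

// Sketch only: coalesce(1) produces a single part file inside the output directory,
// which can then be renamed or moved if a bare file is needed.
def writeSingleXml(resultDf: DataFrame, outPath: String): Unit = {
  resultDf
    .coalesce(1)                            // one partition -> one part file
    .write
    .format("com.databricks.spark.xml")     // spark-xml data source
    .option("rowTag", "row")                // hypothetical row tag
    .save(outPath)
}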
11-16-2016 01:32 AM · 1 Kudo
How can we convert Spark DataFrames into XML files in Scala? We need to move a very large volume of data from an Oracle database into XML files; how can we do that?
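A minimal sketch of one way this is commonly done, assuming the spark-xml package is on the classpath and using hypothetical connection and table names: read the Oracle table over JDBC into a DataFrame, then write it out with the spark-xml data source.

// Sketch only, not a verified pipeline; the Oracle JDBC driver jar must be available to the executors.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object OracleToXmlSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("oracle-to-xml-sketch"))
    val sqlContext = new SQLContext(sc)

    // Hypothetical JDBC connection details and source table.
    val oracleDf = sqlContext.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/service")
      .option("dbtable", "SCHEMA.SOURCE_TABLE")
      .option("user", "username")
      .option("password", "password")
      .load()

    // One XML element per row, written with the spark-xml data source.
    oracleDf.write
      .format("com.databricks.spark.xml")
      .option("rootTag", "rows")            // hypothetical root and row tag names
      .option("rowTag", "row")
      .save("/user/hive/xml_out")
  }
}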
Labels: Apache Spark