Posted 08-16-2024 08:54 AM
Hi, I am using the following in my CDP cluster. When I create a simple Parquet table from Hue and store it as an external table on an S3 Express bucket, querying it fails with an 'EOFException'.

Environment:
1. Hive 3.1.3000.7.2.18.200-39, Hadoop 3.1.1.7.2.18.200-39
2. Data Hub CM version 7.12.0.200, CM runtime version 7.2.18-1.cdh7.2.18.p200.54625612

I have tried the following settings, but neither helped. Could anyone help here? Thank you.
1. Data cluster -> CM -> hdfs -> add fs.s3a.experimental.input.fadvise=sequential
2. Data cluster -> CM -> spark3_on_yarn -> add spark.sql.parquet.enableVectorizedReader=false

INFO : Compiling command(queryId=hive_20240816153040_ebe87870-3c90-4f07-84dc-b2b6e354520c): select * from small
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:small.id, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20240816153040_ebe87870-3c90-4f07-84dc-b2b6e354520c); Time taken: 0.173 seconds
INFO : Executing command(queryId=hive_20240816153040_ebe87870-3c90-4f07-84dc-b2b6e354520c): select * from small
INFO : Completed executing command(queryId=hive_20240816153040_ebe87870-3c90-4f07-84dc-b2b6e354520c); Time taken: 0.009 seconds
INFO : OK
ERROR : Failed with exception java.io.IOException:java.io.EOFException: Reached the end of stream with 340 bytes left to read
java.io.IOException: java.io.EOFException: Reached the end of stream with 340 bytes left to read
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:642)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:549)
at org.apache.hadoop.hive.ql.exec.FetchTask.executeInner(FetchTask.java:217)
at org.apache.hadoop.hive.ql.exec.FetchTask.execute(FetchTask.java:114)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:820)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:550)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:544)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:190)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:92)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:340)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:360)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.EOFException: Reached the end of stream with 340 bytes left to read
at org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:104)
at org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
at org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:584)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:536)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:530)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:478)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:462)
at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getParquetMetadata(ParquetRecordReaderBase.java:181)
at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.setupMetadataAndParquetSplit(ParquetRecordReaderBase.java:87)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:59)
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:93)
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:789)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:353)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:580)
... 21 more
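For context, the schema in the log above (table small with a single string column id) corresponds to an external Parquet table along these lines. This is a minimal sketch of the setup being described; the S3 location is a hypothetical placeholder, not the actual bucket path.

```sql
-- Minimal sketch of the table setup; the table name and column come from
-- the log output, but the S3 location below is a hypothetical placeholder.
CREATE EXTERNAL TABLE small (id STRING)
STORED AS PARQUET
LOCATION 's3a://example-express-bucket/warehouse/small/';
```

The EOFException is thrown while reading the Parquet footer (ParquetFileReader.readFooter), i.e. the reader reaches end-of-stream 340 bytes before the footer it expected, which points at the read path against the S3 Express bucket rather than at the query itself.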