Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Hello, I created an external Hive table on a Parquet file, and when I run a select *, I see the error below.

I am running Hive on an EMR cluster (Hive 2.3.2-amzn-2). I verified that all the fields exist in the Parquet file.

Has anyone encountered this issue?

Any suggestions would be appreciated.

does not contain requested field: optional boolean prepaidFlag:25:24', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:499', 'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:307', 'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:878', 'sun.reflect.GeneratedMethodAccessor15:invoke::-1', 'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43', 'java.lang.reflect.Method:invoke:Method.java:498', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78', 'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36', 'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63', 'java.security.AccessController:doPrivileged:AccessController.java:-2', 'javax.security.auth.Subject:doAs:Subject.java:422', 'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1836', 'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59', 'com.sun.proxy.$Proxy35:fetchResults::-1', 'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:559', 'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:751', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1717', 'org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1702', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149', 
'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624', 'java.lang.Thread:run:Thread.java:748', '*java.io.IOException:java.lang.IllegalStateException: Group type

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Please share the Create statement from hive table.

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Hi Venkat, below is the create table statement.

The Parquet file has all the columns listed, and the data types match the schema.

Create external table Hive_Parquet_Test( 
  statement_Id int,
  statement_MessageId string,
  prepaidFlag boolean,
  item_Count int,
  first_Name string,
  last_Name string
  )
STORED AS PARQUET
LOCATION 's3://bucket_name/hive_parq_test'

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Is the source file also in Parquet format?

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Yes, it is Parquet. Here is some additional information.

Environment:

  • Running hive on AWS EMR (emr-5.13.0) cluster - Hive 2.3.2-amzn-2.
  • Verified that all the fields exist in the parquet file using parquet tools.
  • Parquet file is generated from nested JSON using the fastparquet Python library.

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Get the schema by running parquet-tools schema on the file in HDFS, use that schema to build the external table in Hive, and specify the Parquet SerDe when you create the table.
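The suggestion above (derive the table definition from the file's own schema) could be sketched roughly as follows. This assumes the schema text looks like the usual `message schema { optional int32 col; ... }` output of parquet-tools; the function name `hive_ddl_from_schema` and the type mapping are illustrative assumptions, not part of Hive or parquet-tools. Nested groups are skipped rather than translated, which matters here because the file was generated from nested JSON.

```python
import re

# Assumed mapping from Parquet primitive types to Hive column types.
PARQUET_TO_HIVE = {
    "boolean": "boolean",
    "int32": "int",
    "int64": "bigint",
    "float": "float",
    "double": "double",
    "binary": "string",
}

# Matches a primitive field line such as "optional binary first_Name (UTF8);"
FIELD_RE = re.compile(
    r"^(?:optional|required)\s+(\w+)\s+(\w+)(?:\s+\(.*\))?\s*;$"
)


def hive_ddl_from_schema(schema_text, table, location):
    """Build a CREATE EXTERNAL TABLE statement from parquet-tools schema text.

    Hypothetical helper: only top-level primitive fields are translated.
    """
    columns = []
    depth = 0
    for raw in schema_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("{"):   # "message schema {" or a nested group
            depth += 1
            continue
        if line == "}":
            depth -= 1
            continue
        if depth != 1:           # field inside a nested group: skipped
            continue
        match = FIELD_RE.match(line)
        if match is None or match.group(1) not in PARQUET_TO_HIVE:
            raise ValueError(f"unsupported field: {line}")
        parquet_type, name = match.groups()
        columns.append(f"  {name} {PARQUET_TO_HIVE[parquet_type]}")
    cols = ",\n".join(columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n{cols}\n)\n"
        f"STORED AS PARQUET\nLOCATION '{location}';"
    )


if __name__ == "__main__":
    sample = """message schema {
      optional int32 statement_Id;
      optional boolean prepaidFlag;
    }"""
    print(hive_ddl_from_schema(sample, "Hive_Parquet_Test",
                               "s3://bucket_name/hive_parq_test"))
```

Comparing the generated column list with the hand-written one is a quick way to spot a field that actually lives inside a nested group rather than at the top level.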

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

Thank you again for your input.

Yes, I extracted the schema from the Parquet file and created the external table from it.

I am not clear on your comment "use parquet serde at time of creation of hive table." Based on the Hive documentation, I am using STORED AS PARQUET (Hive 2.3.2-amzn-2).

Also, I am not sure whether the conversion with the fastparquet Python library is causing this or whether it is a bug in Hive.

Re: Hive - Error when selecting data from external hive table on parquet file - does not contain requested field: optional boolean

I suggested creating the Hive table with the SerDe specified explicitly.

Syntax:

create table <dbname>.<tablename>(a string,b string,c string,d string,e string,f string,g string,h string,i string,j string) ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat" OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat";
