Please check that it is a valid Parquet file. This error can also occur due to stale metadata. If you believe this is a valid Parquet file, try running "refresh jmaster.hivetesttable".

Explorer

Hello,

I just installed a new Hadoop server that runs both Hive and Impala.

Cloudera Runtime 7.1.8
Server version: impalad version 4.0.0.7.1.8.0-801 RELEASE (build a3b56f90d9c31ebfa5ce3c266700284a420db28f)

However, I get an error when I try to import data from a .csv file using the Impala shell.

The error is:

Query: INSERT OVERWRITE TABLE ATABLE SELECT * FROM JMASTER.ATABLE
Query submitted at: 2024-02-14 14:00:53 (Coordinator: http://hiveservername.domainame.com:25000)
Query progress can be monitored at: http://hiveservername.domainame.com:25000/query_plan?query_id=0248306127fdcbd1:36d35f5a00000000
ERROR: File 'hdfs://hiveservername.domainame.com:8020/user/hivetest/master_tables/hivetesttable/ATABLE.csv' has an invalid Parquet version number: 30 30 30 30 .
Please check that it is a valid Parquet file. This error can also occur due to stale metadata. If you believe this is a valid Parquet file, try running "refresh jmaster.hivetesttable".

Could not execute command: INSERT OVERWRITE TABLE ATABLE SELECT * FROM JMASTER.ATABLE
 
The same import works fine on Impala 3.x.
 
Does anyone know what the issue is?
 
Thank you!
11 REPLIES

Community Manager

@echodot Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our Impala experts @jAnshula and @ezerihun, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator



Explorer

Thanks @DianaTorres. @jAnshula @ezerihun, please let me know if you need more information.

Expert Contributor

Hi @echodot

Did you try running the refresh command on the table jmaster.hivetesttable?

Does the same error persist after running the refresh command?

If the problem still persists, follow the steps below to confirm that the file is valid.

--> Download the partition or file to local disk.
--> Run the command below to validate the file:
 parquet-tools meta <full_path_to_file_name_in_the_local_disk>
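
If parquet-tools is not available, a rough first check can be sketched in Python. A Parquet file begins and ends with the four ASCII bytes PAR1, so a file that lacks them cannot be Parquet. The function name looks_like_parquet is made up for this illustration, and the check only looks at the magic bytes; it does not validate the footer the way parquet-tools does.

```python
import os

PARQUET_MAGIC = b"PAR1"  # ASCII bytes 80, 65, 82, 49


def looks_like_parquet(path):
    """Return True if the local file starts and ends with the Parquet magic bytes.

    Hypothetical helper for illustration: it checks only the magic bytes,
    not the footer or the schema.
    """
    size = os.path.getsize(path)
    # Too short to hold both a leading and a trailing magic marker.
    if size < 2 * len(PARQUET_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(PARQUET_MAGIC))
        f.seek(-len(PARQUET_MAGIC), os.SEEK_END)
        tail = f.read(len(PARQUET_MAGIC))
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

Calling looks_like_parquet on the downloaded copy of the file returns False for plain text such as a CSV export, which would point to a file format mismatch rather than stale metadata.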

 

Explorer

Hi @jAnshula 

I have run the refresh command on the table, but it did not help.

Here is the output of the parquet-tools meta command:

parquet-tools meta /user/hivetest/master_tables/hivetesttable/create_tables.sql

java.io.IOException: Could not read footer: java.lang.RuntimeException: file:/user/hivetest/master_tables/hivetesttable/create_tables.sql is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [10, 41, 59, 10]

Please suggest what I need to do to fix the issue.

Thank you!

 

Expert Contributor

Hi @echodot 

As the source table is not in Parquet format, try the command below to create the table and load the data:

CREATE TABLE jmaster.hivetesttable STORED AS PARQUET AS SELECT * FROM JMASTER.ATABLE;

Explorer

Hi @jAnshula 

I created the table as Parquet. However, I still get the error when inserting data. Here are the queries I used to create the table and insert into it:

 

CREATE TABLE ATABLE
(
CHARCOL CHAR(10),
VCHARCOL VARCHAR(10),
DECIMALCOL DECIMAL(15,5),
NUMERICCOL DOUBLE,
SMALLCOL SMALLINT,
INTEGERCOL INT,
REALCOL FLOAT,
FLOATCOL FLOAT,
DOUBLECOL DOUBLE,
LVCOL STRING,
BITCOL BOOLEAN,
TINYINTCOL TINYINT,
BIGINTCOL BIGINT,
BINCOL STRING,
VARBINCOL STRING,
LVARBINCOL STRING,
DATECOL STRING,
TIMECOL STRING,
TSCOL TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS PARQUET;

Then I insert data into the table: INSERT OVERWRITE TABLE ATABLE SELECT * FROM JMASTER.ATABLE

and the result:

Query: INSERT OVERWRITE TABLE ATABLE SELECT * FROM JMASTER.ATABLE
Query submitted at: 2024-02-16 09:03:11 (Coordinator: http://hiveservername.domainame.com:25000)
Query progress can be monitored at: http://hiveservername.domainame.com:25000/query_plan?query_id=7042592eded60891:ee0cba7600000000
ERROR: File 'hdfs://hiveservername.domainame.com:8020/user/hivetest/master_tables/hivetesttable/ATABLE.csv' has an invalid Parquet version number: 30 30 30 30 .
Please check that it is a valid Parquet file. This error can also occur due to stale metadata. If you believe this is a valid Parquet file, try running "refresh jmaster.hivetesttable".
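
For reference, the byte values in the two error messages can be decoded to see what the tools actually read at the end of each file. The short Python sketch below assumes that the Impala values (30 30 30 30) are hexadecimal and that the parquet-tools values are decimal, as the bracketed list suggests; both helper names are invented for this illustration.

```python
def decode_hex_dump(dump):
    """Decode a space-separated hex dump such as '30 30 30 30' into text."""
    return bytes.fromhex(dump).decode("ascii", errors="replace")


def decode_decimal_bytes(values):
    """Decode a list of decimal byte values such as [80, 65, 82, 49] into text."""
    return bytes(values).decode("ascii", errors="replace")


# Bytes reported by Impala for ATABLE.csv (assumed to be hex).
print(repr(decode_hex_dump("30 30 30 30")))          # '0000'
# Magic number parquet-tools expected at the tail of the file.
print(repr(decode_decimal_bytes([80, 65, 82, 49])))  # 'PAR1'
# Bytes parquet-tools found at the tail of create_tables.sql.
print(repr(decode_decimal_bytes([10, 41, 59, 10])))  # '\n);\n'
```

Under those assumptions, both files end in ordinary text (digits in the CSV, the closing ");" of a SQL statement in the script) rather than the PAR1 marker, which is consistent with text files being read as Parquet.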

Explorer

@jAnshula Hello, does the information I provided above help?

Expert Contributor

 @echodot 

 

Try to execute the queries below and then run your query.

# alter table ATABLE set tblproperties('impala.disable.recursive.listing'='true');

# refresh ATABLE;

# check if your query works or not

Explorer

@jAnshula I am getting the error below when running your first query:

ERROR: AnalysisException: ALTER TABLE not supported on transactional (ACID) table: default.atable