Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Load hdfs table partition file to another database : Imapala

avatar
Explorer

Hello Team,

 we have requirement as below:

1. Create table db1.t1 (partition loading_date string) as select * from db2.t1 where 1=2;

   creating table as step1 or execute full create table statements with update location

2. alter table db1.t1 set location <new path name>

3. Add partition for every day

alter table db1.t1 add  partition (loading_date='20220101') location "hdfs dir path for each partition"

4. refresh table;

5. compute stats table_name;

 

Have executed above steps in impala, data not loading from partitions files.

1 REPLY 1

avatar
Master Collaborator

@krishna2023 

It seems like you're encountering issues with loading data into partitions in Impala after executing the provided steps. 

  1. Create table as select * from db2.t1 where 1=2:

    • This step creates an empty table db1.t1 based on the schema of db2.t1 without any data. Ensure that the table schema matches between db1.t1 and db2.t1.
  2. Alter table set location:

    • After creating the empty table, you're altering its location to a new path. Make sure that the specified path exists and has the necessary permissions for Impala to read and write data.
  3. Add partition for every day:

    • Adding partitions should involve specifying the loading date for each partition and its corresponding HDFS directory path. Double-check that the HDFS directory paths specified in each partition definition are correct and accessible by Impala.
  4. Refresh table:

    • The REFRESH command updates the metadata of the table to reflect changes made in the underlying data directory. After adding partitions, running REFRESH is necessary to inform Impala about the new partitions. Make sure to execute this command after adding partitions.
  5. Compute stats:

    • The COMPUTE STATS command gathers statistics about the table, which helps Impala optimize query execution. While this command is not directly related to loading data into partitions, it's good practice to run it after making significant changes to the table.

To further troubleshoot the issue, consider the following additional steps:

  • Check Impala logs for any error messages or warnings that might indicate issues with loading data or adding partitions.
  • Verify that the data files corresponding to the partitions are present in the specified HDFS directory paths.
  • Ensure that the partitioning column (loading_date) values in the data files match the partition definitions specified in the ALTER TABLE statements.

 

Regards,

Chethan YM