Created 03-22-2016 11:54 AM
Created 03-25-2016 07:48 AM
The LOAD manual makes no mention of bucketed tables; LOAD is a simple copy/move operation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
Have you seen this page? It explains the correct way to populate bucketed tables:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL+BucketedTables
The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table – only how it is read. This means that users must be careful to insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query.
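In practice this usually means loading the raw file into a plain staging table first and then letting Hive hash the rows with an INSERT ... SELECT. A minimal sketch, assuming hypothetical table names (user_info_staging, user_info_bucketed), a hypothetical input path, and a bucket count of 32:

```sql
-- Hypothetical staging table: plain (not bucketed), matching the raw file layout.
CREATE TABLE user_info_staging (user_id BIGINT, firstname STRING, lastname STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- LOAD DATA only moves the file into the table directory; no hashing takes place.
LOAD DATA INPATH '/tmp/user_info.csv' INTO TABLE user_info_staging;

-- Hypothetical bucketed target table.
CREATE TABLE user_info_bucketed (user_id BIGINT, firstname STRING, lastname STRING)
CLUSTERED BY (user_id) INTO 32 BUCKETS;

-- Needed on Hive 0.x/1.x only; per the wiki page above, not needed from Hive 2.x onward.
SET hive.enforce.bucketing = true;

-- Hive hashes user_id and writes one file per bucket.
INSERT OVERWRITE TABLE user_info_bucketed
SELECT user_id, firstname, lastname
FROM user_info_staging;
```

With hive.enforce.bucketing enabled, Hive picks the number of reducers to match the bucket count, so there is no need to set the reducer count or add CLUSTER BY by hand.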
Created 08-09-2017 10:42 PM
You can load data into a bucketed table, but you as the user have to ensure that the number of files is correct, the file naming is consistent, and the content of each file is properly hashed. If any of these is wrong, behavior is undefined when the table is used in joins that take bucketing into account.
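A rough sanity check on the file count and naming, assuming a hypothetical table user_info_bucketed declared with 32 buckets. It uses Hive's INPUT__FILE__NAME virtual column to list the files that actually back the table. It does not verify that the rows in each file are hashed correctly.

```sql
-- One output row per data file; for a 32-bucket table, expect exactly 32 rows
-- (fewer can appear if some bucket files are empty).
SELECT INPUT__FILE__NAME AS bucket_file,
       COUNT(*)          AS row_count
FROM user_info_bucketed
GROUP BY INPUT__FILE__NAME;
```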
HIVE-15148 guards against this by introducing a new parameter, hive.strict.checks.bucketing, which you can use to explicitly disable LOAD DATA on bucketed tables.