Created on 02-15-2020 01:32 AM - last edited on 02-15-2020 05:16 AM by ask_bill_brooks
Hi,
I was previously working with Hive 1.2.1000.2.6.5.0-292. When I create an ORC table like the one below:
CREATE TABLE acidtbl2 (a INT, b STRING) STORED AS orc tblproperties ("orc.compress"="SNAPPY");
And insert data into it:
INSERT INTO acidtbl2 (a,b) VALUES (100, "oranges"), (200, "apples"), (300,"bananas");
I get the following directory, which contains the data file:
hdfs dfs -ls /apps/hive/warehouse/imp_rt_test1912.db/acidtbl2/
Found 1 items
-rwxrwxrwx 3 bdauser hdfs 380 2020-02-15 10:52 /apps/hive/warehouse/imp_rt_test1912.db/acidtbl2/000000_0
Now with Hive 3.1, if I do the same, I get a delta folder, and within it I see a file named bucket_00000:
hdfs dfs -ls /warehouse/tablespace/managed/hive/hdptesta.db/acidtbl2/
Found 1 items
drwxrwx---+ - bdauser hadoop 0 2020-02-15 10:59 /warehouse/tablespace/managed/hive/hdptesta.db/acidtbl2/delta_0000001_0000001_0000
Is there a way to disable bucketing so that bucket files don't get generated?
Thanks,
Kevin
Created 01-25-2021 02:52 AM
@kevinmat0510 The Hive 3 architecture changed to support ACID v2. In Hive 3, bucket files are generated automatically, and data is split implicitly.
https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/using-hiveql/content/hive_3_internals.html
You can't disable the generation of bucket files; this is a fundamental architecture change in Hive 3. Refer to the Hive 3 ACID support details in the document linked above.
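To see why the delta directory appears, it may help to check whether the table is transactional. The statements below are only a rough sketch: the first two inspect the table from the question, and the last one shows an external table, which is an assumption on my side and not something the linked page is quoted on here. External tables are not ACID-managed in Hive 3, so inserts into them should not produce delta_*/bucket_* files. The name acidtbl2_ext is hypothetical.

```sql
-- Inspect the table type and parameters. For a managed ACID table in Hive 3,
-- the Table Parameters section is expected to include transactional=true.
DESCRIBE FORMATTED acidtbl2;
SHOW TBLPROPERTIES acidtbl2;

-- Assumption, not confirmed in this thread: an external ORC table is not
-- ACID-managed, so its data files are written without a delta directory.
-- acidtbl2_ext is a hypothetical name used only for illustration.
CREATE EXTERNAL TABLE acidtbl2_ext (a INT, b STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY");
```

Keep in mind that an external table gives up ACID features such as UPDATE and DELETE, so this only fits if you need the old file layout and not transactional behavior.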
Thanks,
Prakash
Created 01-25-2021 03:28 AM
@kevinmat0510 You have to delete the files from HDFS. Use this command: hdfs dfs -rm -r -f /user/hive/.yarn/package/LLAP