Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Truncate hive table from pig

Truncate hive table from pig

Expert Contributor

Is there a way to truncate hive table from pig?

I need to overwrite a temp table everytime I load the data into it, but I am unable to find anything about how to overwrite the contents, only way I could think of is to remove the files in that table directory manually and rewriting it using Store interface.

Is there a better way to do this? Please advise.

Thanks!

7 REPLIES 7

Re: Truncate hive table from pig

Mentor

As long as it's an hcatalog table you can access it through Hive or pig, I don't see why you couldn't truncate a table in pig script

http://hortonworks.com/hadoop-tutorial/how-to-use-hcatalog-basic-pig-hive-commands/

I have a sample pig script that executes Hive queries

https://github.com/dbist/oozie/blob/master/apps/hcatalog/sample.pig

Highlighted

Re: Truncate hive table from pig

Expert Contributor

I am able to read and write to the table, but the writes are always append.

I could not truncate the Hive table.

Re: Truncate hive table from pig

Mentor

taking my script please confirm whether you can execute the command

sql show tables
sql truncate table tablename

Re: Truncate hive table from pig

Expert Contributor

My bad, actually I didnt look at your sample script properly. I tried these commands. But only show tables work. I tried select * from table and truncate table tablename it didnt work. Please find the error below.

grunt> sql select * from temp_dim_w;
2016-05-18 20:24:13,705 [main] INFO  org.apache.pig.tools.grunt.GruntParser - Going to run hcat command: select * from temp_dim_w;
WARNING: Use "yarn jar" to launch YARN applications.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
16/05/18 20:24:15 WARN conf.HiveConf: HiveConf of name hive.metastore.pre-event.listeners does not exist
16/05/18 20:24:15 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
FAILED: SemanticException Operation not supported.
grunt> sql truncate table temp_dim_w;
2016-05-18 20:24:31,858 [main] INFO  org.apache.pig.tools.grunt.GruntParser - Going to run hcat command: truncate table temp_dim_w;
WARNING: Use "yarn jar" to launch YARN applications.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
16/05/18 20:24:33 WARN conf.HiveConf: HiveConf of name hive.metastore.pre-event.listeners does not exist
16/05/18 20:24:33 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
FAILED: SemanticException Operation not supported.

Re: Truncate hive table from pig

Mentor

is this table an HCatalog table? It won't work on non-hcat table.

Re: Truncate hive table from pig

Expert Contributor

Due to some reason, I am unable to post reply to your last comment.

yes the table is in Hcat. I was able to check hcat -e "describe table;", its just truncate thats now working.

hcat -e "truncate table tablename;" is giving the same error that I got from pig. According to https://cwiki.apache.org/confluence/display/Hive/HCatalog+CLI all commands that does not require map reduce job should run through hcat right?

It works. All hive managed tables are hcat tables right?

Re: Truncate hive table from pig

Mentor

@R M getting back to this just now as I had a similar thread open with someone else. Turns out truncate and delete is not supported by API. You can do the following though

set hcat.bin /usr/bin/hcat;
sql drop table if exists codes;
sql create table codes(code string, description string, total_emp int, salary int);
a = load 'sample_07' using org.apache.hive.hcatalog.pig.HCatLoader();
b = load 'sample_08' using org.apache.hive.hcatalog.pig.HCatLoader();
c = join b by code, a by code;
d = foreach c generate $0 as code, $1 as description, $2 as total_emp, $3 as salary;
store d into 'codes' using org.apache.hive.hcatalog.pig.HCatStorer();