Support Questions
Find answers, ask questions, and share your expertise

Truncate hive table from pig

Truncate hive table from pig

Expert Contributor

Is there a way to truncate hive table from pig?

I need to overwrite a temp table everytime I load the data into it, but I am unable to find anything about how to overwrite the contents, only way I could think of is to remove the files in that table directory manually and rewriting it using Store interface.

Is there a better way to do this? Please advise.

Thanks!

7 REPLIES 7

Re: Truncate hive table from pig

Mentor

As long as it's an hcatalog table you can access it through Hive or pig, I don't see why you couldn't truncate a table in pig script

http://hortonworks.com/hadoop-tutorial/how-to-use-hcatalog-basic-pig-hive-commands/

I have a sample pig script that executes Hive queries

https://github.com/dbist/oozie/blob/master/apps/hcatalog/sample.pig

Re: Truncate hive table from pig

Expert Contributor

I am able to read and write to the table, but the writes are always append.

I could not truncate the Hive table.

Re: Truncate hive table from pig

Mentor

taking my script please confirm whether you can execute the command

sql show tables
sql truncate table tablename

Re: Truncate hive table from pig

Expert Contributor

My bad, actually I didnt look at your sample script properly. I tried these commands. But only show tables work. I tried select * from table and truncate table tablename it didnt work. Please find the error below.

grunt> sql select * from temp_dim_w;
2016-05-18 20:24:13,705 [main] INFO  org.apache.pig.tools.grunt.GruntParser - Going to run hcat command: select * from temp_dim_w;
WARNING: Use "yarn jar" to launch YARN applications.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
16/05/18 20:24:15 WARN conf.HiveConf: HiveConf of name hive.metastore.pre-event.listeners does not exist
16/05/18 20:24:15 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
FAILED: SemanticException Operation not supported.
grunt> sql truncate table temp_dim_w;
2016-05-18 20:24:31,858 [main] INFO  org.apache.pig.tools.grunt.GruntParser - Going to run hcat command: truncate table temp_dim_w;
WARNING: Use "yarn jar" to launch YARN applications.
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
16/05/18 20:24:33 WARN conf.HiveConf: HiveConf of name hive.metastore.pre-event.listeners does not exist
16/05/18 20:24:33 WARN conf.HiveConf: HiveConf of name hive.semantic.analyzer.factory.impl does not exist
FAILED: SemanticException Operation not supported.

Re: Truncate hive table from pig

Mentor

is this table an HCatalog table? It won't work on non-hcat table.

Re: Truncate hive table from pig

Expert Contributor

Due to some reason, I am unable to post reply to your last comment.

yes the table is in Hcat. I was able to check hcat -e "describe table;", its just truncate thats now working.

hcat -e "truncate table tablename;" is giving the same error that I got from pig. According to https://cwiki.apache.org/confluence/display/Hive/HCatalog+CLI all commands that does not require map reduce job should run through hcat right?

It works. All hive managed tables are hcat tables right?

Re: Truncate hive table from pig

Mentor

@R M getting back to this just now as I had a similar thread open with someone else. Turns out truncate and delete is not supported by API. You can do the following though

set hcat.bin /usr/bin/hcat;
sql drop table if exists codes;
sql create table codes(code string, description string, total_emp int, salary int);
a = load 'sample_07' using org.apache.hive.hcatalog.pig.HCatLoader();
b = load 'sample_08' using org.apache.hive.hcatalog.pig.HCatLoader();
c = join b by code, a by code;
d = foreach c generate $0 as code, $1 as description, $2 as total_emp, $3 as salary;
store d into 'codes' using org.apache.hive.hcatalog.pig.HCatStorer();