pig and hive store

Master Collaborator

Hi:

After reading data from Hive with Pig, I am now trying to insert it with this command:

F = STORE E INTO 'journey_pig' USING org.apache.hive.hcatalog.pig.HCatStorer();

F has these records:

(STR03CON,3190,2015-12-06 00,9992,2015,12,1)
(STS01OON,3081,2015-12-06 00,9154,2015,12,1)
(VAO13MOU,3076,2015-12-06 00,9554,2015,12,1)
(VMP71MOU,9998,2015-12-06 00,0001,2015,12,11)

and the error is:

2016-02-15 17:22:42,483 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.plan.VisitorException: ERROR 1115:
<file store_journey.pig, line 36, column 4> Output Location Validation Failed for: 'journey_pig More info to follow:
Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:64)
        at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
        at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
        at org.apache.pig.PigServer$Graph.compile(PigServer.java:1767)
        at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1443)
        at org.apache.pig.PigServer.execute(PigServer.java:1356)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:749)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        at org.apache.pig.Main.run(Main.java:502)
        at org.apache.pig.Main.main(Main.java:177)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1115: Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.validateAlias(HCatBaseStorer.java:612)
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.validateSchema(HCatBaseStorer.java:514)
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.doSchemaValidations(HCatBaseStorer.java:495)
        at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:201)
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:57)
        ... 29 more


Where do I need to put the column names?

Thanks

1 ACCEPTED SOLUTION

Master Collaborator

Hi:

In the end it worked once I declared the full schema in the load statement:

e = load 'hdfs://localhost:8020/tmp/jofi_pig_temp' using PigStorage(',') AS (codtf : chararray,codnrbeenf : chararray, fechaoprcnf : chararray, codinternouof : chararray, year : chararray, month : chararray, frecuencia : int);

Many thanks
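
For readers landing here with the same ERROR 1115, the fix can be sketched end to end as below. This is only a sketch built from the statements quoted in this thread: it assumes the declared field names match the column names of the Hive table 'journey_pig' (the table definition is not shown in the thread), and that Pig is started with HCatalog support (for example with pig -useHCatalog).

```pig
-- Sketch only: path, table name, and field names are copied from the thread.
-- Assumption: the names below match the columns of the Hive table 'journey_pig'.
e = LOAD 'hdfs://localhost:8020/tmp/jofi_pig_temp' USING PigStorage(',')
    AS (codtf:chararray, codnrbeenf:chararray, fechaoprcnf:chararray,
        codinternouof:chararray, year:chararray, month:chararray, frecuencia:int);

-- STORE is a statement, not an expression, so it is not assigned to an alias.
-- Because every field of e now has a name, HCatStorer can map fields to columns.
STORE e INTO 'journey_pig' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

The AS clause is what gives each field a name; without it the fields are only addressable by position ($0, $1, ...), and HCatStorer reports that a column name is not specified.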

Master Mentor

@Roberto Sancho I have accepted this answer. Thanks for following up.

You had to define the schema as suggested in the reply.

Explorer

Actually, I declare a complete schema as early as possible.

First when I load the Hive table, then again when I transform the data.

This works fine for me when I need to clean fields with complex criteria within the same table.

A = LOAD 'hive_table' USING org.apache.hive.hcatalog.pig.HCatLoader() as (f0: chararray,f1: chararray,f2: chararray);
B = FOREACH A GENERATE $0 as (f0:chararray),$1 as (f1:chararray),REPLACE($2,'John Doe','Mr Bean') as (f2:chararray);
STORE B INTO 'hive_table' using org.apache.hive.hcatalog.pig.HCatStorer();
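
One way to check whether a relation is ready for HCatStorer is to print its schema with DESCRIBE before storing it. The snippet below is a minimal sketch, not part of the original replies: the input path /tmp/example_input and the aliases raw and named are hypothetical, and the three-field layout simply mirrors the example above.

```pig
-- Hypothetical input: a comma-separated file with at least three fields.
raw = LOAD '/tmp/example_input' USING PigStorage(',');

-- No AS clause was given above, so the fields of raw have no names.
-- Cast each positional field and give it a name and a type.
named = FOREACH raw GENERATE (chararray)$0 AS f0,
                             (chararray)$1 AS f1,
                             (chararray)$2 AS f2;

-- Print both schemas: raw has no declared schema, named has three named fields.
DESCRIBE raw;
DESCRIBE named;
```

If DESCRIBE shows a field without a name, that relation would most likely hit the same "Column name for a field is not specified" validation error when passed to HCatStorer.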