Pig illustrate command fails after upgrade from CDH4.2.1 to CDH4.6.0 parcels


I found an unexpected problem after our recent upgrade from CDH4.2.1 to CDH4.6.0.

Our Pig scripts run correctly, but running illustrate on them fails with the error output shown below.

Here are some interesting observations:

  1. The commands worked under CDH4.2.1 (Pig v0.10).
  2. If I run my scripts with normal PigStorage output, everything works.
  3. We also switched from RPMs to parcels during the upgrade (in case that makes a difference).
  4. I've tried the same thing from the Hue Pig shell, with the same result.
  5. Local mode fails the same way as mapreduce mode.
  6. The new version of Pig is v0.11.0.

 

Failure error messages when running illustrate:

2014-04-21 09:59:32,353 [main] ERROR org.apache.pig.pen.ExampleGenerator - Error reading data. Internal error creating job configuration.
java.lang.RuntimeException: Internal error creating job configuration.
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:160)
at org.apache.pig.PigServer.getExamples(PigServer.java:1182)
at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:739)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
at org.apache.pig.Main.run(Main.java:538)
at org.apache.pig.Main.main(Main.java:157)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
2014-04-21 09:59:32,358 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2997: Encountered IOException. Exception

 

I think there must be a basic configuration setting that the admins missed during the upgrade.

Update: we've isolated this to the illustrate ExampleGenerator class. We're not yet sure why it cannot read the input file.

3 REPLIES

Re: Pig illustrate command fails after upgrade from CDH4.2.1 to CDH4.6.0 parcels

I simplified the problem. Here are the steps to reproduce:

 

1. Create a file test.txt containing:

              3,4,5,6
              t,6,7,7
              7,t,r,6

2. Start Pig in local mode:   pig -x local

3. Run this command:     A = load 'test.txt' using PigStorage(',') as (f1,f2,f3,f4);

4. Then run:  illustrate A;
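
The steps above can be sketched as a small shell script, which may make it easier to rerun the test against different CDH versions. This is only a sketch under a few assumptions: the script file name repro.pig is made up for illustration, the data lines are assumed to have no leading whitespace, and pig is only invoked if it is found on the PATH.

```shell
#!/bin/sh
# Step 1: recreate the three-line input file.
cat > test.txt <<'EOF'
3,4,5,6
t,6,7,7
7,t,r,6
EOF

# Steps 3 and 4: put the two Grunt commands into a script file.
# (repro.pig is a hypothetical name chosen for this sketch.)
cat > repro.pig <<'EOF'
A = load 'test.txt' using PigStorage(',') as (f1,f2,f3,f4);
illustrate A;
EOF

# Step 2: run the script in local mode, but only if pig is installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local repro.pig
    echo "pig exit status: $?"
else
    echo "pig not found on PATH; created test.txt and repro.pig only"
fi
```

Running the same script after each parcel upgrade or downgrade should show quickly whether illustrate works on that version.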

Re: Pig illustrate command fails after upgrade from CDH4.2.1 to CDH4.6.0 parcels

Just an update:

 

I was able to replicate the entire issue in our dev cluster. 

 

I installed parcels on the v4.3.0 dev cluster. 

Retested Pig and the illustrate command worked.

Upgraded parcels to 4.6.0 (latest) 

Pig illustrate does not work.

Downgraded to v4.5.0 

Pig illustrate works.

 

I'm now starting to diff the two Pig code bases to see if I can spot the change.

I don't know how to tell whether a general config change is involved, but it looks like this is isolated to the Pig distribution in v4.6.0.

Re: Pig illustrate command fails after upgrade from CDH4.2.1 to CDH4.6.0 parcels

Another update for anyone interested in this issue.

The version of Pig in Cloudera CDH 4.5.0 also fails. The illustrate command worked, but the PigStorage loader fails if the CSV file contains a NULL field.

 

example:

1,2,3,4

3,,6

3,5,6,8
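
To check whether an input file contains rows like the second one above, a quick scan for empty fields may help. The following is a rough sketch, not anything Pig does internally: the file name nulls.txt is made up for illustration, and the check treats a field as NULL only when it is completely empty between two commas.

```shell
#!/bin/sh
# Recreate the example input (nulls.txt is a hypothetical file name).
cat > nulls.txt <<'EOF'
1,2,3,4
3,,6
3,5,6,8
EOF

# Print the line number of every row that has at least one empty field.
empty_rows=$(awk -F',' '{ for (i = 1; i <= NF; i++) if ($i == "") { print NR; next } }' nulls.txt)

echo "rows with an empty field: $empty_rows"
```

Any line numbers reported here point at rows that may trigger the failure described above on the affected versions.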

 

We've rolled back to 4.4.0 and have not found additional issues at this point.