Field with empty or no data causing error in pig

Apache Pig version 0.12.1.2.1.7.0-784

I have data in which one of the fields is empty, like this:

2015,,08
2015,,09
2015,,11
2015,,04
2015,,05

Now I run the following Pig commands:

grunt> given_input = load '/pigtest/flightdelays/' using PigStorage(',') as (year,month,day);
grunt> ori = foreach given_input generate month;
grunt> illustrate ori;

This produces the error: Caused by: java.lang.RuntimeException: No (valid) input data found!

When I replace the loader with CSVExcelStorage:

grunt> given_input = load '/pigtest/flightdelays/' using org.apache.pig.piggybank.storage.CSVExcelStorage(',') as (year,month,day);
grunt> ori = foreach given_input generate month;
grunt> illustrate ori;

I get this output:

-------------------------------------------------------------------------------
| given_input     | year:bytearray    | month:bytearray    | day:bytearray    |
-------------------------------------------------------------------------------
|                 | 2015              |                    | 05               |
-------------------------------------------------------------------------------
--------------------------------
| ori     | month:bytearray    |
--------------------------------
|         |                    |
--------------------------------

So I would like to know:

1) What is the problem with PigStorage?

2) Is it a problem with the loader or with the Pig version?

3) If I want to use PigStorage here, how should I do it?

It is not only illustrate; dump behaves the same way.

Accepted Solutions

Re: Field with empty or no data causing error in pig

Are you sure that dump behaves the same? If I run this (using your data):

a = load '/tmp/test' using PigStorage(',') as (year,month,day);

dump a;

(2015,,08)(2015,,09)...

And if I then run:

b = foreach a generate month;
dump b;

()()()

It looks to me like PigStorage works perfectly fine with dump.
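
The empty tuples `()` are consistent with PigStorage loading an empty field as null rather than failing on it. A minimal sketch to check this, assuming the same test path as above (the alias `missing_month` is a hypothetical name chosen for illustration):

```pig
-- Same load statement as above; the path is the one used in this answer.
a = load '/tmp/test' using PigStorage(',') as (year, month, day);

-- Keep only the rows whose month field came back as null,
-- which is how an empty field is expected to load.
missing_month = filter a by month is null;
dump missing_month;
```

If every row of the sample data comes back from this filter, the loader is handling the empty field as expected and the fault lies elsewhere.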

If I use illustrate, though, everything goes wrong. After using illustrate, even the dump command fails with a NullPointerException. So not only does illustrate not work correctly, it also breaks the grunt shell until I restart it.

I think the problem is the illustrate command, which is not too surprising given the warning at the top of its section in the Pig docs:

Illustrate:

(Note! This feature is NOT maintained at the moment. We are looking for someone to adopt it.)
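
One way to keep using PigStorage and still spot-check each relation is to swap illustrate for limit plus dump. This is only a sketch, assuming the path and schema from the question; the alias `sample_rows` is a hypothetical name, and the grunt shell should be restarted first if illustrate has already been run in it:

```pig
-- Load with PigStorage exactly as in the question.
given_input = load '/pigtest/flightdelays/' using PigStorage(',') as (year, month, day);
ori = foreach given_input generate month;

-- Print the schema of the relation without running a job.
describe ori;

-- Inspect a handful of rows instead of calling illustrate.
sample_rows = limit ori 5;
dump sample_rows;
```

Unlike illustrate, dump runs the full pipeline up to that relation, so keeping the limit small mainly keeps the printed output manageable.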

Re: Field with empty or no data causing error in pig

@Benjamin Leonhardi

I used dump after illustrate, which is why I got the error. So the problem is with the illustrate command.

I actually have a habit of running illustrate on every Pig statement I enter in the grunt shell to check the output.

Re: Field with empty or no data causing error in pig

It looks like a very useful command for debugging. I had never used it before. It's a shame it seems to be broken.