Field with empty or no data causes an error in Pig

Expert Contributor

Apache Pig version 0.12.1.2.1.7.0-784

I have data in which one of the fields is empty, like this:

2015,,08
2015,,09
2015,,11
2015,,04
2015,,05

Now I run the Pig commands like this:

grunt> given_input = load '/pigtest/flightdelays/' using PigStorage(',') as (year,month,day);
grunt> ori = foreach given_input generate month;
grunt> illustrate ori;

This generates an error like:  Caused by: java.lang.RuntimeException: No (valid) input data found!

When I replace the loader with CSVExcelStorage, like this:

grunt> given_input = load '/pigtest/flightdelays/' using org.apache.pig.piggybank.storage.CSVExcelStorage(',') as (year,month,day);
grunt> ori = foreach given_input generate month;
grunt> illustrate ori;

I get output like this:

-------------------------------------------------------------------------------
| given_input     | year:bytearray    | month:bytearray    | day:bytearray    |
-------------------------------------------------------------------------------
|                 | 2015              |                    | 05               |
-------------------------------------------------------------------------------
--------------------------------
| ori     | month:bytearray    |
--------------------------------
|         |                    |
--------------------------------

So, I would like to know:

1) What is the problem with PigStorage?

2) Is it a loader problem or a Pig version problem?

3) If I want to use PigStorage here, how should I do it?

It is not only illustrate; even dump behaves the same.

1 ACCEPTED SOLUTION

Master Guru

Are you sure that dump behaves the same? If I do (using your data):

a = load '/tmp/test' using PigStorage(',') as (year,month,day);

dump a;

(2015,,08)(2015,,09)...

And if I do

b = foreach a generate month;

dump b;

()()()

It looks to me like PigStorage works perfectly fine with dump.
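To make the empty field explicit when staying with PigStorage, here is a minimal sketch. As far as I know, PigStorage loads an empty field as null, which matches the empty tuples in the dump above. The chararray schema and the aliases missing_month and filled are assumptions added for illustration; the 'unknown' placeholder is arbitrary.

```pig
-- Same file as above, but with an explicit chararray schema (an assumption;
-- the original load leaves every field as bytearray).
a = load '/tmp/test' using PigStorage(',') as (year:chararray, month:chararray, day:chararray);

-- Keep only the rows whose month field was empty in the input (loaded as null).
missing_month = filter a by month is null;
dump missing_month;

-- Or keep every row and substitute a placeholder for the missing month.
filled = foreach a generate year, (month is null ? 'unknown' : month) as month, day;
dump filled;
```

Both statements run through dump only, so they do not depend on illustrate.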

If I use illustrate, everything goes wrong, though. After using illustrate, even the dump command fails with a NullPointerException. So not only does it not work correctly, it also breaks the grunt shell until I restart it.

I think the problem is the illustrate command, which is not too surprising given the warning at the top of its section in the Pig docs:

Illustrate:

(Note! This feature is NOT maintained at the moment. We are looking for someone to adopt it.)

REPLIES

Expert Contributor

@Benjamin Leonhardi

I used dump after illustrate, so I got the error. So the problem is with the "illustrate" command.

Actually, I have a habit of running illustrate after every Pig command I use in the grunt shell to check the output.

Master Guru

It looks like a very useful command for debugging. I had never used it before. It is a shame that it seems to be broken.
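For checking intermediate output without illustrate, one possible substitute is describe plus a limited dump. This is only a sketch: the alias preview and the row count of 5 are arbitrary choices, and the path is the one from the question.

```pig
given_input = load '/pigtest/flightdelays/' using PigStorage(',') as (year,month,day);
ori = foreach given_input generate month;

-- Print the schema of the relation; this does not launch a job.
describe ori;

-- Look at a handful of rows instead of the whole data set.
preview = limit ori 5;
dump preview;
```

Unlike illustrate, this does not generate example data for each step; it only shows the schema and a few real rows, so each intermediate relation has to be checked separately.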