
Pig Count Distinct Error



I’m trying to use Pig to do a group-by and count distinct on a dataset, and I am getting a Java error saying “java.lang.ClassCastException: cannot be cast to”.

Example Dataset

A    B  ids   flag  status
foo  f  1001  1     K
foo  f  1001  1     K
foo  c  1002  1     H
bar  g  1001  1     J
bar  g  1002        P
bar  g  1003  1     L

Here is an example of my code

testtable = LOAD 'landing.testtable' USING org.apache.hive.hcatalog.pig.HCatLoader();

filtertable = FILTER testtable BY flag != ' ' AND status != 'P';

grpcount = FOREACH (GROUP filtertable BY (A, B)) {
    uniqueids = DISTINCT filtertable.ids;
    GENERATE
        group.A AS A_group,
        group.B AS B_group,
        COUNT(uniqueids) AS id_count;
}

STORE grpcount INTO 'landing.grpcount' USING org.apache.hive.hcatalog.pig.HCatStorer();

This is where I get the error “java.lang.ClassCastException: cannot be cast to”. I'm not exactly sure what’s wrong here (the Hive table is properly set up with the right data types as well). I assume it's erroring out on grpcount, but I'm not exactly sure why.

But I am basically trying to duplicate this SQL Code in Pig

Select
    A AS A_group,
    B AS B_group,
    count(distinct ids) As id_count
From landing.testtable
Where flag != ' '
And status not in ('P')
Group by A, B;

I found this alternate solution here, but I'm not really sure how to implement it in my code :/. Using Pig 0.15/Hortonworks 2.2.0.


Re: Pig Count Distinct Error

Can you provide the DDL for the two Hive tables? Running SHOW CREATE TABLE table_name; would get it quickly.

Re: Pig Count Distinct Error



You may need to assign a data type to the count, for example:

1) COUNT (uniqueids) AS id_count: double

2) COUNT (uniqueids) AS id_count: chararray

I usually go with option 2, because Pig does not handle the double data type well when you want to apply a filter.

Greetings and good luck
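
A more explicit variant of the type suggestion above, as a minimal sketch rather than a confirmed fix: Pig's COUNT returns a long, so if the id_count column in landing.grpcount is declared as INT in Hive (an assumption, since the DDL was not posted in this thread), casting the count before storing may avoid the type mismatch. The sketch reuses the filtertable relation and the table names from the question, and writes the nested block with DISTINCT and a GENERATE clause.

```pig
-- Assumes filtertable is already defined as in the question.
grpcount = FOREACH (GROUP filtertable BY (A, B)) {
    -- Deduplicate the ids within each (A, B) group.
    uniqueids = DISTINCT filtertable.ids;
    GENERATE
        group.A AS A_group,
        group.B AS B_group,
        -- COUNT returns a long; the (int) cast assumes the Hive column is INT.
        (int) COUNT(uniqueids) AS id_count;
}

-- DESCRIBE prints the Pig schema so it can be compared with the Hive DDL.
DESCRIBE grpcount;

STORE grpcount INTO 'landing.grpcount' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

If the Hive column turns out to be BIGINT, the cast can be dropped; comparing the DESCRIBE output with the SHOW CREATE TABLE output should show which columns disagree.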
