Explorer
Posts: 7
Registered: ‎05-29-2014

Doc bug regarding NDV and COUNT(DISTINCT)

Hey there, I noticed a doc bug on this page:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/Impala/latest/Installing-and-Using-Im...

This text:

If you do not need precise accuracy, you can produce an estimate of the distinct values for a column by specifying COUNT(NDV(column)); a query can contain multiple instances of COUNT(NDV(column)).

should read:

If you do not need precise accuracy, you can produce an estimate of the distinct values for a column by specifying NDV(column); a query can contain multiple instances of NDV(column).

 

COUNT(NDV(column)) returns an error like this:

> select count(ndv(user_id)) from table;

Error: AnalysisException: aggregate function cannot contain aggregate parameters: count(ndv(user_id)) (state=HY000,code=0)
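
For comparison, here is a rough sketch of the form the corrected text describes. The table name `events` and the column `session_id` are made up for illustration; only `user_id` comes from the query above.

```sql
-- Hypothetical table and column names, for illustration only.

-- Approximate distinct count: NDV() is itself the aggregate,
-- so it is not wrapped in COUNT().
select ndv(user_id) from events;

-- Several NDV() calls can appear in the same query.
select ndv(user_id), ndv(session_id) from events;

-- Exact distinct count, for comparison.
select count(distinct user_id) from events;
```

The NDV() results are estimates, so they may differ slightly from what COUNT(DISTINCT) returns on the same data.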

 

Cloudera Employee
Posts: 27
Registered: ‎09-27-2013

Re: Doc bug regarding NDV and COUNT(DISTINCT)

Thanks for letting us know. This will be fixed in the next doc rev.