import ibis gives error "AttributeError: type object 'PandasUDFType' has no attribute 'GROUPED_AGG'"

New Contributor

Hi All,

 

When I try the sample code provided in the Cloudera documentation here, it fails with an AttributeError.

 

Error: 

AttributeError: type object 'PandasUDFType' has no attribute 'GROUPED_AGG'
AttributeError                            Traceback (most recent call last)
in engine
----> 1 import ibis

/home/cdsw/.local/lib/python3.6/site-packages/ibis/__init__.py in <module>()
     58 with suppress(ImportError):
     59     # pip install ibis-framework[spark]
---> 60     import ibis.spark.api as spark  # noqa: F401
     61 
     62 with suppress(ImportError):

/home/cdsw/.local/lib/python3.6/site-packages/ibis/spark/api.py in <module>()
      2 from ibis.spark.client import SparkClient
      3 from ibis.spark.compiler import dialect  # noqa: F401
----> 4 from ibis.spark.udf import udf  # noqa: F401
      5 
      6 

/home/cdsw/.local/lib/python3.6/site-packages/ibis/spark/udf.py in <module>()
    123 
    124 
--> 125 class SparkPandasAggregateUDF(SparkPandasUDF):
    126     base_class = SparkUDAFNode
    127     pandas_udf_type = f.PandasUDFType.GROUPED_AGG

/home/cdsw/.local/lib/python3.6/site-packages/ibis/spark/udf.py in SparkPandasAggregateUDF()
    125 class SparkPandasAggregateUDF(SparkPandasUDF):
    126     base_class = SparkUDAFNode
--> 127     pandas_udf_type = f.PandasUDFType.GROUPED_AGG
    128 
    129 

AttributeError: type object 'PandasUDFType' has no attribute 'GROUPED_AGG'

Any help to resolve this issue?

 

Thanks,

CRP 

2 REPLIES

New Contributor

This issue has been open for almost 1 year.

Is it possible to connect to Cloudera Impala with Cloudera tools?

How is this done?

Can anybody connect to Impala through CDSW and run SQL?

Master Guru

@cr @PowerofAI You might have to make sure that the Impala packages are installed, and then import the UDF, perhaps something like this:

 

import pandas as pd
from pyspark.sql import SparkSession, Window
from pyspark.sql.functions import pandas_udf, PandasUDFType

# Reuse the existing session if there is one (as in a CDSW engine).
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [(1, 1.0), (1, 2.0), (2, 3.0), (2, 5.0), (2, 10.0)], ("id", "v"))

# Grouped aggregate pandas UDF: takes a pandas Series, returns one scalar.
@pandas_udf("double", PandasUDFType.GROUPED_AGG)
def pandas_mean(v):
    return v.mean()

df.select(pandas_mean(df['v'])).show()
df.groupby("id").agg(pandas_mean(df['v'])).show()
df.select(pandas_mean(df['v']).over(Window.partitionBy('id'))).show()

Also, make sure that the following packages are installed, since a missing package might be causing the issue:

pip install ibis-framework
pip install impyla
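
To narrow this down, a small diagnostic along these lines may help. It is only a sketch: the function name check_environment is made up for illustration, and it rests on the assumption that the traceback comes from a PySpark older than 2.4, which to my knowledge is the release that introduced PandasUDFType.GROUPED_AGG. It looks for ibis and impyla without importing ibis, because importing ibis is what triggers the error.

```python
import importlib.util


def check_environment():
    """Collect facts relevant to the GROUPED_AGG AttributeError.

    check_environment is a hypothetical helper, not part of ibis or pyspark.
    """
    report = {}

    # Inspect pyspark directly rather than through ibis.
    try:
        import pyspark
        import pyspark.sql.functions as F
    except ImportError:
        # pyspark is not installed in this environment.
        report["pyspark_version"] = None
        report["has_grouped_agg"] = None
    else:
        report["pyspark_version"] = pyspark.__version__
        udf_type = getattr(F, "PandasUDFType", None)
        # False here would explain the AttributeError raised by ibis.
        report["has_grouped_agg"] = hasattr(udf_type, "GROUPED_AGG")

    # find_spec only locates a top-level package; it does not import it.
    report["ibis_installed"] = importlib.util.find_spec("ibis") is not None
    # impyla is imported under the module name "impala".
    report["impyla_installed"] = importlib.util.find_spec("impala") is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If has_grouped_agg comes back False, the PySpark version on the engine is probably the place to look, rather than the Impala packages.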


Cheers!