How to display a pivoted DataFrame with PySpark?
Labels: Apache Spark
Created 01-27-2017 01:04 PM
I cannot display/show/print a pivoted DataFrame with PySpark. The pivoted DataFrame is apparently created fine, but when I try to show it I get: AttributeError: 'GroupedData' object has no attribute 'show'.
Here's the code
meterdata = sqlContext.read.format("com.databricks.spark.csv") \
    .option("delimiter", ",") \
    .option("header", "false") \
    .load("/CBIES/meters/")
metercols = meterdata.groupBy("C0").pivot("C1")
metercols.show()
Output:
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8003809301447367155.py", line 239, in <module>
    eval(compiledCode)
  File "<string>", line 1, in <module>
AttributeError: 'GroupedData' object has no attribute 'show'
Created 01-27-2017 08:10 PM
After pivoting you need to run an aggregate function (e.g. sum) to get back a DataFrame/Dataset; pivot() on its own returns a GroupedData object, which has no show() method.
After aggregation you'll be able to show() the data.
You can find an excellent overview of pivoting at this website:
https://databricks.com/blog/2016/02/09/reshaping-data-with-pivot-in-apache-spark.html
