Pyspark issue AttributeError: 'DataFrame' object has no attribute 'saveAsTextFile'

Explorer

Hello community, 

 

My first post here, so please let me know if I'm not following protocol.

 

I have written a pyspark.sql query, shown below. I would like the query results to be written to a text file, but I get the error:

 

AttributeError: 'DataFrame' object has no attribute 'saveAsTextFile'

 

Can someone take a look at the code and let me know where I'm going wrong?

 

 

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession

def main():
    spark = SparkSession.builder.appName('aggs').getOrCreate()
    df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/sales_info.csv', inferSchema=True, header=True)
    df.createOrReplaceTempView('sales_info')

    example8 = spark.sql("""SELECT
        *
    FROM sales_info
    ORDER BY Sales DESC""")
    # This is the line that raises the AttributeError
    example8.saveAsTextFile("juyfd")

main()

Any help would be appreciated.

 

carlton

1 ACCEPTED SOLUTION

Contributor

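As the error message states, neither a DataFrame nor a Python list has a saveAsTextFile() method; that method belongs to RDDs.

 

Call the write on the DataFrame itself (example8 or myresults in the posted code), not on the list returned by collect(). Either df.write.save(path), which writes Parquet by default, or df.rdd.saveAsTextFile(path) should do the work. Note that toJavaRDD is the Scala/Java name; in PySpark the RDD is reached through df.rdd. You can also refer to the DataFrameWriter or RDD API: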

 

https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.sql.DataFrameWriter

https://spark.apache.org/docs/2.1.0/api/scala/index.html#org.apache.spark.rdd.RDD
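
For the PySpark case in the question, the two routes above might look roughly like the sketch below. The helper names write_csv_text and write_rows_as_text are made up for illustration and are not part of Spark; the comma separator and the "overwrite" mode are assumptions as well. Both helpers expect the DataFrame itself, not the result of collect().

```python
def write_csv_text(df, path):
    """Write a DataFrame as CSV text files under the directory `path`."""
    # df.write is a DataFrameWriter; csv() emits plain-text part files.
    # mode("overwrite") is an assumption: it replaces any existing output.
    df.write.mode("overwrite").csv(path, header=True)


def write_rows_as_text(df, path, sep=","):
    """Write one delimited line per row through the underlying RDD."""
    # In PySpark the RDD is df.rdd; saveAsTextFile() lives on the RDD.
    # Each Row is iterable, so its values can be joined into one string.
    lines = df.rdd.map(lambda row: sep.join(str(value) for value in row))
    # saveAsTextFile() creates a directory of part files and fails
    # if `path` already exists.
    lines.saveAsTextFile(path)


# Hypothetical usage with the DataFrame from the question:
#   write_csv_text(example8, "sales_sorted_csv")
#   write_rows_as_text(example8, "sales_sorted_txt")
```

Either call produces a directory of part files rather than a single file, because each partition is written separately.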


2 REPLIES

Explorer

OK, as I'm not getting much assistance with my original question, I thought I would try to figure out the problem myself. So I rewrote the pyspark.sql query as follows:

 

 
#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/Person_Person.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('Person_Person')
myresults = spark.sql("""SELECT
  PersonType
 ,COUNT(PersonType) AS `Person Count`
FROM Person_Person
GROUP BY PersonType""")
myresults.collect()
result = myresults.collect()
result
result.saveAsTextFile("test")

However, I'm now getting the following error message:

AttributeError: 'list' object has no attribute 'saveAsTextFile'
 
I think this could be an easier situation to help resolve.
 
So, if someone could help resolve this issue that would be most appreciated
 
Thanks
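
On the 'list' error: collect() pulls the rows back to the driver as an ordinary Python list, so nothing from Spark's write API is available on it. If the result is small enough to collect, one option is to write that list with plain Python file I/O, as in this minimal sketch. The helper name write_rows_locally, the comma separator, and the file name are assumptions made for illustration.

```python
def write_rows_locally(rows, path, sep=","):
    """Write collected rows to a single local text file, one row per line."""
    with open(path, "w") as handle:
        for row in rows:
            # Each collected Row behaves like a tuple of column values.
            handle.write(sep.join(str(value) for value in row) + "\n")


# Hypothetical usage with the query from the post above:
#   rows = myresults.collect()
#   write_rows_locally(rows, "test.txt")
```

This writes one file on the machine running the driver, which suits small aggregates like the GROUP BY result here but not large tables.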
 
