Member since: 08-05-2018
Posts: 73
Kudos Received: 0
Solutions: 0
08-13-2018
11:05 AM
The syntax error is with 'currentdate'
08-13-2018
11:04 AM
Sandeep, thanks for reaching out. I'm getting the following error from the import line:

  File "<ipython-input-7-3dab170099f6>", line 3
    import datetime currentdate = datetime.datetime.now().strftime("%Y-%m-%d")
    ^
SyntaxError: invalid syntax
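
The traceback suggests that `import datetime` and the `currentdate = ...` assignment ended up on the same physical line, which Python cannot parse. Below is a minimal sketch that reproduces this with the built-in `compile()`; the names `one_line`, `two_lines` and `compiles` are only for illustration and are not from the original cell.

```python
# Two statements on one physical line, as shown in the traceback.
one_line = 'import datetime currentdate = datetime.datetime.now().strftime("%Y-%m-%d")'

# The same two statements, each on its own line.
two_lines = (
    "import datetime\n"
    'currentdate = datetime.datetime.now().strftime("%Y-%m-%d")\n'
)


def compiles(source):
    """Return True if Python can parse the source, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(one_line))
print(compiles(two_lines))
```

If that is what happened in the notebook cell, putting the import and the assignment on separate lines should clear the error.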
08-13-2018
09:42 AM
Hello community,

I have created the following pyspark query:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
counts = spark.sql("""SELECT
FirstName
,LastName
,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
counts.coalesce(1).write.csv("/home/packt/Downloads/myresults3.csv")

I would like to add the current date and time to the name of the output file, myresults3. I think the code would look something like the following:

counts.coalesce(1).write.csvCONCAT("/home/packt/Downloads/'myresults3'-CURRENTDATE.csv")

I'm sure I'm way off the mark with the above attempt, but I hope you can see what I'm trying to achieve. Any help will be appreciated.

Cheers,
Carlton
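
One possible way to get a date and time into the name is to build the path as an ordinary Python string before handing it to the writer. This is only a sketch: the helper `timestamped_path` and the timestamp format are assumptions for illustration, not part of the original query.

```python
import datetime


def timestamped_path(directory, stem, now=None, extension=".csv"):
    """Build a path like <directory>/<stem>-<YYYY-MM-DD_HH-MM-SS><extension>."""
    if now is None:
        now = datetime.datetime.now()
    # Hyphens instead of colons keep the name safe on most file systems.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return "{}/{}-{}{}".format(directory.rstrip("/"), stem, stamp, extension)


output_path = timestamped_path("/home/packt/Downloads", "myresults3")
print(output_path)

# With the DataFrame from the query above, the write would then be:
# counts.coalesce(1).write.csv(output_path)
```

Note that Spark treats the path as a directory and places the part file(s) inside it, so the timestamp ends up in the directory name rather than in a single file name.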
Labels: Apache Spark
08-06-2018
09:02 PM
Is there a way to get the results with the header info?
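
For the header question, the DataFrame CSV writer accepts a `header` option. The sketch below wraps it in a small helper; the helper name `write_csv_with_header`, the output path in the usage comment, and the choice of `mode="overwrite"` are assumptions for illustration.

```python
def write_csv_with_header(df, path):
    """Write a Spark DataFrame to `path` as CSV with column names in the first row."""
    # coalesce(1) asks for a single part file; header=True writes the column names.
    # mode="overwrite" is an assumption: it replaces any existing output at `path`.
    df.coalesce(1).write.csv(path, header=True, mode="overwrite")


# Usage with the DataFrame from the earlier query (hypothetical output path, not run here):
# write_csv_with_header(myresults, "/home/packt/Downloads/myresults_with_header")
```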
08-06-2018
08:56 PM
Felix, thank you so much. It worked like a dream
08-06-2018
11:32 AM
Hello community,

The pyspark query below produces the following output. The pyspark query is as follows:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/HumanResources_vEmployeeDepartment.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('HumanResources_vEmployeeDepartment')
myresults = spark.sql("""SELECT
FirstName
,LastName
,JobTitle
FROM HumanResources_vEmployeeDepartment
ORDER BY FirstName, LastName DESC""")
myresults.show()

Can someone show me how to save the results to a text or CSV file (or any file, please)?

Thanks,
Carlton
Labels: Apache Spark
08-05-2018
05:15 PM
OK, as I'm not getting much assistance with my original question, I thought I would try to figure out the problem myself. So I rewrote the pyspark.sql query as follows:

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('ops').getOrCreate()
df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/Person_Person.csv',inferSchema=True,header=True)
df.createOrReplaceTempView('Person_Person')
myresults = spark.sql("""SELECT
PersonType
,COUNT(PersonType) AS `Person Count`
FROM Person_Person
GROUP BY PersonType""")
myresults.collect()
result = myresults.collect()
result
result.saveAsTextFile("test")

However, I'm now getting the following error message:

AttributeError: 'list' object has no attribute 'saveAsTextFile'

I think this could be an easier problem to resolve, so if someone could help, that would be most appreciated.

Thanks
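
The error arises because `collect()` returns a plain Python list of rows on the driver, and a list has no Spark methods such as `saveAsTextFile`. If the result is small enough to collect, one option is to write that list with Python's own `csv` module. The sketch below uses made-up stand-in rows and a hypothetical helper `save_rows`; a Spark `Row` behaves like a tuple, so the same helper should accept the real `result`.

```python
import csv
import os
import tempfile


def save_rows(rows, path, header=None):
    """Write an iterable of row tuples to a local CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        if header is not None:
            writer.writerow(header)
        writer.writerows(rows)


# Stand-in for `result = myresults.collect()`; the values are made up for illustration.
result = [("EM", 3), ("SC", 5)]

# A temporary directory is used here only so the sketch runs anywhere.
out_path = os.path.join(tempfile.mkdtemp(), "test.csv")
save_rows(result, out_path, header=["PersonType", "Person Count"])
```

For larger results it is usually better to skip `collect()` and let Spark write the DataFrame directly.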
08-05-2018
02:41 AM
Hello community,

My first post here, so please let me know if I'm not following protocol. I have written a pyspark.sql query as shown below. I would like the query results to be sent to a text file, but I get the error:

AttributeError: 'DataFrame' object has no attribute 'saveAsTextFile'

Can someone take a look at the code and let me know where I'm going wrong?

#%%
import findspark
findspark.init('/home/packt/spark-2.1.0-bin-hadoop2.7')
from pyspark.sql import SparkSession
def main():
    spark = SparkSession.builder.appName('aggs').getOrCreate()
    df = spark.read.csv('/home/packt/Downloads/Spark_DataFrames/sales_info.csv',inferSchema=True,header=True)
    df.createOrReplaceTempView('sales_info')
    example8 = spark.sql("""SELECT
    *
    FROM sales_info
    ORDER BY Sales DESC""")
    example8.saveAsTextFile("juyfd")
main()

Any help would be appreciated.

Carlton
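
`saveAsTextFile` is an RDD method, which is why the DataFrame rejects it. One way around this, sketched below, is to go through the DataFrame's underlying RDD and turn each row into a line of text first. The helper name `save_dataframe_as_text` and the comma separator are assumptions for illustration.

```python
def save_dataframe_as_text(df, path, sep=","):
    """Save a Spark DataFrame as text files by going through its underlying RDD."""
    # df.rdd is an RDD of Row objects; each Row is joined into one delimited line.
    lines = df.rdd.map(lambda row: sep.join(str(value) for value in row))
    # saveAsTextFile exists on RDDs and writes a directory of part files at `path`.
    lines.saveAsTextFile(path)


# Usage with the DataFrame from the post (not run here):
# save_dataframe_as_text(example8, "juyfd")
#
# The DataFrame writer is an alternative that avoids the RDD step:
# example8.write.csv("juyfd")
```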
Labels: Apache Spark
02-01-2018
10:11 PM
Hi Jay, can you please let me know why I'm suddenly not able to access the Sandbox on port 2222? I was able before, but now I can't.