Solved: Spark sort by key with descending order
Labels:
- Apache Spark
Super Collaborator
Created 10-19-2017 03:15 AM
rdd.sortByKey() sorts in ascending order.
I want to sort in descending order.
I tried rdd.sortByKey("desc") but it did not work
2 REPLIES
Guru
Created 10-19-2017 03:17 AM
Try using rdd.sortByKey(false). This will sort in descending order.
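For reference, a minimal PySpark sketch of the same idea (the app name and the tiny pair RDD here are made up for illustration; note that rdd.sortByKey(false) is the Scala spelling, while in PySpark the first argument is ascending, so you pass False or ascending=False):

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName('sortbykey_desc_example').setMaster('local[*]')
sc = SparkContext(conf=conf)

# A small made-up (key, value) pair RDD
pairs = sc.parallelize([('b', 2), ('a', 1), ('c', 3)])

# Default is ascending; pass ascending=False (or positional False) for descending
print(pairs.sortByKey().collect())                 # [('a', 1), ('b', 2), ('c', 3)]
print(pairs.sortByKey(ascending=False).collect())  # [('c', 3), ('b', 2), ('a', 1)]

sc.stop()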
Contributor
Created 10-20-2017 06:34 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try this code:

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

conf1 = SparkConf().setAppName('sort_desc')
sc1 = SparkContext(conf=conf1)
sql_context = SQLContext(sc1)

# Read the CSV and split each line into a [dept, ctc] pair
csv_file_path = 'emp.csv'
employee_rdd = sc1.textFile(csv_file_path).map(lambda line: line.split(','))
print(type(employee_rdd))

# Sort by the first element (dept) in descending order
employee_rdd_sorted = employee_rdd.sortByKey(ascending=False)

# Convert the original and the sorted RDD to DataFrames
employee_df = employee_rdd.toDF(['dept', 'ctc'])
employee_df_sorted = employee_rdd_sorted.toDF(['dept', 'ctc'])
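Since the snippet above already ends up with DataFrames, an alternative sketch is to sort with the DataFrame API instead of sorting the RDD (the rows below are made up to match the ['dept', 'ctc'] schema used above):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName('sort_desc_df').getOrCreate()

# Made-up rows matching the ['dept', 'ctc'] schema used above
employee_df = spark.createDataFrame([('IT', 100), ('HR', 80), ('Sales', 90)], ['dept', 'ctc'])

# Descending sort via the DataFrame API
employee_df.orderBy(F.col('dept').desc()).show()

spark.stop()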
