10-13-2016
06:34 PM
Can we say this difference is due only to the conversion from RDD to DataFrame? According to the Apache documentation, DataFrames have memory management and a query optimizer, so they should outperform RDDs. I believe that if the source is a JSON file, we can read it directly into a DataFrame, and it would definitely perform well compared to an RDD.

Also, why does Spark SQL perform better than the DataFrame API in the grouping test? DataFrame and Spark SQL queries should both be converted to similar RDD code and use the same optimizers.