Hi @Shreya Gupta,
Spark holds data either as an immutable Resilient Distributed Dataset (RDD) or as a Dataset/DataFrame (which is itself backed by RDDs under the hood).
So the representation you get depends on which API call you issued:
val texFileRDD = sc.textFile("README.md") // represents the data as an RDD
texFileRDD.getClass
res10: Class[_ <: org.apache.spark.rdd.RDD[String]] = class org.apache.spark.rdd.MapPartitionsRDD

val textFileDS = sqlcontext.read.textFile("README.md") // represents the data as a Dataset
textFileDS.getClass
res8: Class[_ <: org.apache.spark.sql.Dataset[String]] = class org.apache.spark.sql.Dataset
To find out the type of the value stored in a variable, call its getClass method.
More on this can be found at https://spark.apache.org/docs/latest/quick-start.html
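If you want to try this outside the shell, here is a minimal sketch of the same check as a standalone program. It assumes Spark 2.x or later (where SparkSession is available) and a README.md file in the working directory. The object name InspectTypes and the local[*] master are just my choices for illustration, and I am using spark.read here in place of the sqlcontext variable from the shell session.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this illustration.
object InspectTypes {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; in spark-shell `spark` and `sc` already exist.
    val spark = SparkSession.builder()
      .appName("inspect-types")
      .master("local[*]")
      .getOrCreate()

    // RDD API: the file is represented as an RDD[String].
    val texFileRDD = spark.sparkContext.textFile("README.md")

    // Dataset API: the same file is represented as a Dataset[String].
    val textFileDS = spark.read.textFile("README.md")

    // getClass reports the runtime class of each value.
    println(texFileRDD.getClass.getName)
    println(textFileDS.getClass.getName)

    // A Dataset exposes its underlying RDD through .rdd.
    println(textFileDS.rdd.getClass.getName)

    spark.stop()
  }
}
```

The last println is there only to show that a Dataset can hand back an RDD through .rdd; the exact RDD subclass it prints may differ between Spark versions.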