Converting JSON to Rdd

Explorer

I am getting a JSON response, and in my Spark SQL data source I need to read the data, infer the schema for the JSON, and convert it to an RDD[Row]. Is there any class to do that in Spark?

Thanks

1 ACCEPTED SOLUTION

Super Collaborator
// jsonRdd is a placeholder name: an RDD[String] where each element is one JSON object
val dataframe = sqlContext.read.json(jsonRdd)

See https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/DataFrameReader.html#json(org.apac...


10 REPLIES

Explorer

I don't want to read from files. I have JSON data in a variable, coming from an HTTP response in my code.

Super Guru

@Akash Mehta

So even the following won't work for you? If not, I think there is currently no other way, given that we have looked at all the other possible options.

// A DataFrame can be created for a JSON dataset represented by
// an RDD[String] storing one JSON object per string.
val anotherPeopleRDD = sc.parallelize(
  """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}""" :: Nil)
val anotherPeople = sqlContext.read.json(anotherPeopleRDD)

Super Guru

@Akash Mehta Can you do something like this?

val dataframe = sqlContext.read.format("json").load(your json here)

Explorer

But "your json here" takes a path, and I have the JSON from an HTTP response (converted to a string).

I need to read from that, infer the schema, and convert it to an RDD[Row].

Super Guru

load will infer the schema and convert the data to rows. The question is whether it will take an HTTP URL. Can you try?

Explorer

Yes, load will do that, but load requires an input path, and I have my JSON stored in a string variable.

Super Collaborator
// jsonRdd is a placeholder name: an RDD[String] where each element is one JSON object
val dataframe = sqlContext.read.json(jsonRdd)

See https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/DataFrameReader.html#json(org.apac...

Explorer

This will output a DataFrame, and I need an RDD[Row].
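
For reference, a minimal sketch of one way to go from a JSON string to an RDD[Row], assuming the Spark 1.6 API discussed in this thread. The DataFrame returned by read.json exposes its data as an RDD[Row] through its rdd member. The object name JsonStringToRowRdd, the helper toRowRdd, and the sample jsonString are hypothetical and only stand in for the HTTP response body; this is not taken from any of the replies above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}

object JsonStringToRowRdd {

  // Hypothetical helper: turns one JSON string into an RDD[Row].
  def toRowRdd(sqlContext: SQLContext, json: String): RDD[Row] = {
    // Wrap the string in a one-element RDD[String]; no file path is involved.
    val jsonRdd: RDD[String] = sqlContext.sparkContext.parallelize(Seq(json))

    // read.json(RDD[String]) infers the schema from the JSON content.
    val dataframe = sqlContext.read.json(jsonRdd)

    // DataFrame.rdd returns the same data as an RDD[Row].
    dataframe.rdd
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying this out on one machine.
    val conf = new SparkConf().setAppName("json-string-to-row-rdd").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Hypothetical stand-in for the string obtained from the HTTP response.
    val jsonString = """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}"""

    val rows: RDD[Row] = toRowRdd(sqlContext, jsonString)
    rows.collect().foreach(println)

    sc.stop()
  }
}
```

If the inferred schema is also needed, it can be read from the DataFrame (dataframe.schema) before calling rdd.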