Converting JSON to RDD

New Contributor

I am getting a JSON response, and in my Spark SQL data source I need to read the data, infer the schema for the JSON, and convert it to an RDD[Row]. Is there a class in Spark that does this?

Thanks

1 ACCEPTED SOLUTION

Expert Contributor
val dataframe = sqlContext.read.json(<an RDD[String] where each element is a JSON object>)

See https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/DataFrameReader.html#json(org.apac...

10 REPLIES

New Contributor

I don't want to read from files. I have JSON data in a variable, coming from an HTTP response in my code.

Super Guru

@Akash Mehta

So even the following won't work for you? If not, I think there is currently no other way, given that we have looked at all the other possible options.

// A DataFrame can be created for a JSON dataset represented by
// an RDD[String] storing one JSON object per string.
val anotherPeopleRDD = sc.parallelize(
  """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}""" :: Nil)
val anotherPeople = sqlContext.read.json(anotherPeopleRDD)
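
To check what schema was inferred from such a string, something like the following might help. It is only a sketch against the Spark 1.6 API linked in this thread; the object name `InferJsonSchema`, the helper `inferSchema`, the app name, and the local master are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types.StructType

object InferJsonSchema {
  // Hypothetical helper: returns the schema Spark infers for one JSON string.
  def inferSchema(sc: SparkContext, sqlContext: SQLContext, json: String): StructType = {
    // One JSON object per RDD element, as in the snippet above.
    val df = sqlContext.read.json(sc.parallelize(json :: Nil))
    df.schema
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("infer-json-schema").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    val json = """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}"""
    val schema = inferSchema(sc, sqlContext, json)

    // Print the inferred structure, including the nested address fields.
    schema.printTreeString()

    sc.stop()
  }
}
```

The nested `address` object should come back as a struct column, so no schema has to be written by hand.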

Super Guru

@Akash Mehta Can you do something like this?

dataframe = sqlContext.read.format("json").load(your json here)

New Contributor

But "your json here" takes a path, and I have the JSON from an HTTP response (converted to a string).

I need to read from that string, infer the schema, and convert it to an RDD[Row].

Super Guru

load will infer the schema and convert the data to rows. The question is whether it will accept an HTTP URL. Can you try?

New Contributor

Yes, load will do that, but load requires an input path, and I have my JSON stored in a string variable.

Expert Contributor
val dataframe = sqlContext.read.json(<an RDD[String] where each element is a JSON object>)

See https://spark.apache.org/docs/1.6.0/api/java/org/apache/spark/sql/DataFrameReader.html#json(org.apac...

New Contributor

This will output a DataFrame, and I need an RDD[Row].
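
For reference, a DataFrame exposes its contents as an RDD[Row] through its `rdd` member, so one possible route from a string variable to rows is sketched below. This assumes the Spark 1.6 API discussed in this thread; the object name `JsonStringToRows`, the helper `toRows`, the `responseBody` value, and the local master are hypothetical and only stand in for the real HTTP response handling.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}

object JsonStringToRows {
  // Hypothetical helper: parses one JSON string and returns its rows.
  def toRows(sc: SparkContext, sqlContext: SQLContext, json: String): RDD[Row] = {
    // Wrap the single string in a one-element RDD[String].
    val jsonRdd: RDD[String] = sc.parallelize(Seq(json))
    // read.json infers the schema from the data itself.
    val df = sqlContext.read.json(jsonRdd)
    // DataFrame.rdd exposes the same data as an RDD[Row].
    df.rdd
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions for a standalone run.
    val conf = new SparkConf().setAppName("json-string-to-rows").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Stand-in for the body of an HTTP response held in a variable.
    val responseBody = """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}"""
    val rows = toRows(sc, sqlContext, responseBody)
    rows.collect().foreach(println)

    sc.stop()
  }
}
```

If the inferred schema is also needed, it can be read from the intermediate DataFrame via `df.schema` before calling `rdd`.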

Expert Contributor