
I'm not able to create a DataFrame with the info from the link. I'm seeing the error NoViableAltException(26@[])

New Contributor

I'm not able to create DataFrames with the information provided in the link below. I'm using the latest version of the HDP2.3-Pig & Hive Rev6 zip. Is there another way to create DataFrames? I have checked different forums but have been unable to get it working, and I need your support to get past this issue.

http://spark.apache.org/docs/latest/sql-programming-guide.html#data-sources

1 ACCEPTED SOLUTION


@Vijay Kanth

I assume you are running Scala/Spark code? Also, just as an FYI, the link you shared is for the latest version of Spark (currently 2.0.1). HDP 2.3 is an older version of the Hortonworks platform, and it runs Spark 1.4.1. If you want to use the latest version of Spark, you can upgrade to HDP 2.5.

But you should not need to upgrade to create a DataFrame within Spark. Here's the Spark/Scala code that I used within HDP 2.3 to generate a sample DataFrame:

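// Build a small DataFrame from an in-memory Seq of tuples, then name the columns
val df = sqlContext.createDataFrame(Seq(
    ("1111", 10000, "M"),
    ("2222", 20000, "F"),
    ("3333", 30000, "M"),
    ("4444", 40000, "F")
)).toDF("id", "income", "gender")

// Print the rows as a table
df.show()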

Try this out and let me know if it works.
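
Since the question also asked about other ways to create a DataFrame, here is a minimal sketch of the case-class route. It assumes you are in spark-shell on Spark 1.3 or later, where `sc` and `sqlContext` are already defined. The `Person` class and the `peopleDF` name are made up for illustration and only mirror the columns used above.

```scala
// Run inside spark-shell, where `sc` and `sqlContext` are already defined.
import sqlContext.implicits._   // enables .toDF() on RDDs of case classes

// Hypothetical record type mirroring the columns in the sample above
case class Person(id: String, income: Int, gender: String)

// Column names and types are inferred from the case class fields
val peopleDF = sc.parallelize(Seq(
  Person("1111", 10000, "M"),
  Person("2222", 20000, "F")
)).toDF()

peopleDF.printSchema()
peopleDF.show()
```

With this approach the schema comes from the case class, so there is no need to pass column names to toDF.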


2 REPLIES

New Contributor

@Dan Zaratsian

Thanks for the above. It worked for me, and I was able to create the DataFrame. Is my approach to accessing the .json file wrong, or is there anything else I need to try?
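
On the .json follow-up, here is a minimal sketch of loading a JSON file into a DataFrame. It assumes spark-shell on Spark 1.4.x, the version mentioned above for HDP 2.3. The path is a placeholder rather than a file known to exist on the sandbox, and the reader expects each line of the file to hold one complete JSON object.

```scala
// Run inside spark-shell, where `sc` and `sqlContext` are already defined.
// Placeholder path: point it at a file your cluster can read. On a cluster,
// an unqualified path is usually resolved against the default filesystem (HDFS).
val jsonPath = "/tmp/people.json"

// Spark 1.4+ reader API; each line must be a self-contained JSON object
val people = sqlContext.read.json(jsonPath)

people.printSchema()
people.show()

// On Spark 1.3.x the equivalent call was sqlContext.jsonFile(jsonPath)
```

If the schema shows only a _corrupt_record column, the file is probably a single pretty-printed JSON document rather than one object per line.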