Created 12-01-2016 01:09 AM
val ebayds = sc.textFile("/user/spark/xbox.csv")
case class Auction(auctionid: String, bid: Float, bidtime: Float, bidder: String, bidderrate: Int, openbid: Float, price: Float)
val ebay = ebayds.map(a=>a.split(",")).map(p=>Auction(p(0),p(1).toFloat,p(2).toFloat,p(3),p(4).toInt,p(5).toFloat,p(6).toFloat)).toDF()
ebay.select("auctionid").distinct.count
I am getting this error:
For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
Created 12-04-2016 06:35 AM
The error seems to be a data type mismatch between the data set and the case class. Check each column's data type first.
Use the CSV API to read the CSV file and print the schema.
Eg:
val ebaydf = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load(path)
ebaydf.printSchema()
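If the spark-csv package is not at hand, the same column check can be roughly sketched in plain Scala. This is only an illustration: the two sample rows and the helper name badFields are made up, and the column indexes assume the seven-field layout of the Auction case class from the question.

```scala
import scala.util.Try

// Hypothetical sample rows in the seven-column layout of the Auction case class;
// the second row has an empty bidderrate field.
val sample = Seq(
  "8213034705,95,2.927373,jake7870,0,95,117.5",
  "8213034705,115,2.943484,davidbresler2,,95,117.5"
)

// Indexes of the numeric columns: bid, bidtime, bidderrate, openbid, price.
val numericCols = Seq(1, 2, 4, 5, 6)

// Return the indexes of numeric columns whose value is missing or does not parse.
def badFields(line: String): Seq[Int] = {
  val p = line.split(",", -1) // -1 keeps trailing empty fields
  numericCols.filter(i => i >= p.length || Try(p(i).trim.toDouble).isFailure)
}

// Report every row that has at least one unparsable numeric column.
sample.zipWithIndex.foreach { case (line, n) =>
  val bad = badFields(line)
  if (bad.nonEmpty) println(s"row $n: unparsable columns ${bad.mkString(", ")}")
}
```

On the real file, the same function could be applied to a handful of lines, for example the result of ebayds.take(20), to see which columns contain empty strings.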
Created 12-01-2016 11:21 AM
@jayaprakash gadi Why don't you implement a method in the Auction companion object to handle null or empty values?
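A minimal sketch of that idea, under some assumptions: the method name parse, the helper names, and the default value of 0 for empty fields are all illustrative choices, not something taken from the original post. The message For input string: "" is the form produced by integer parsing, so the empty value is probably in bidderrate (p(4).toInt), but the sketch guards every numeric field.

```scala
import scala.util.Try

case class Auction(auctionid: String, bid: Float, bidtime: Float, bidder: String,
                   bidderrate: Int, openbid: Float, price: Float)

object Auction {
  // Parse a numeric field, falling back to a default when it is empty or malformed.
  private def toFloatOr(s: String, default: Float): Float =
    Try(s.trim.toFloat).getOrElse(default)
  private def toIntOr(s: String, default: Int): Int =
    Try(s.trim.toInt).getOrElse(default)

  // Build an Auction from one CSV line; None if the line has fewer than seven fields.
  def parse(line: String): Option[Auction] = {
    val p = line.split(",", -1) // -1 keeps trailing empty fields
    if (p.length < 7) None
    else Some(Auction(p(0), toFloatOr(p(1), 0f), toFloatOr(p(2), 0f), p(3),
                      toIntOr(p(4), 0), toFloatOr(p(5), 0f), toFloatOr(p(6), 0f)))
  }
}
```

With this in place, the mapping in the question could become ebayds.flatMap(line => Auction.parse(line)).toDF(), which skips short lines instead of failing. In spark-shell the case class and its companion object need to be entered together with :paste. Whether 0 is an acceptable stand-in for a missing value depends on the analysis; an Option[Int] field would be the alternative.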