How can I replace or handle null values in a DataFrame?
Labels: Apache Spark
Created 12-01-2016 01:09 AM
val ebayds = sc.textFile("/user/spark/xbox.csv")
case class Auction(auctionid: String, bid: Float, bidtime: Float, bidder: String, bidderrate: Int, openbid: Float, price: Float)
val ebay = ebayds.map(a => a.split(",")).map(p => Auction(p(0), p(1).toFloat, p(2).toFloat, p(3), p(4).toInt, p(5).toFloat, p(6).toFloat)).toDF()
ebay.select("auctionid").distinct.count
I am getting this error:
java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
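(For reference, the exception comes from calling .toFloat on an empty string, which is what split(",") produces for a blank CSV field. A minimal illustration of the failure, using a made-up line for this example:

// Hypothetical line with a blank bid column, for illustration only
val line = "12345,,5.0,somebidder,3,0.99,1.03"
val p = line.split(",")

p(1)          // ""  -- the blank field survives the split as an empty string
p(1).toFloat  // throws java.lang.NumberFormatException: For input string: ""
)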
Created 12-04-2016 06:35 AM
The error seems to be a data type mismatch between the dataset and the case class. Check each column's data type first.
Use the CSV API to read the file and print the schema, e.g.:
val ebaydf = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load(path)
ebaydf.printSchema()
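Once the file is loaded that way, blank fields come in as nulls (assuming the file has a header row so the columns get real names), and the original question of replacing or handling them can be addressed with the DataFrame na functions. A rough sketch, reusing ebaydf from above; the column names and default values here are only assumed from the Auction case class:

// Drop any row that contains a null in any column
val noNulls = ebaydf.na.drop()

// Or keep the rows and substitute defaults per column
// (column names assumed to match the case class fields)
val filled = ebaydf.na.fill(Map(
  "bid"    -> 0.0,
  "price"  -> 0.0,
  "bidder" -> "unknown"
))

filled.select("auctionid").distinct.count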
Created 12-01-2016 11:21 AM
@jayaprakash gadi, why don't you implement a factory method on the Auction companion object to handle null values?
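For example, a rough sketch of that idea; the fromLine helper and the default values are illustrative, not from the original post:

case class Auction(auctionid: String, bid: Float, bidtime: Float, bidder: String, bidderrate: Int, openbid: Float, price: Float)

object Auction {
  // Fall back to a default when a CSV field is blank
  private def toFloatOrElse(s: String, default: Float = 0f): Float =
    if (s == null || s.trim.isEmpty) default else s.toFloat

  private def toIntOrElse(s: String, default: Int = 0): Int =
    if (s == null || s.trim.isEmpty) default else s.toInt

  // Build an Auction from one CSV line, tolerating empty numeric columns
  def fromLine(line: String): Auction = {
    val p = line.split(",", -1)   // -1 keeps trailing empty fields
    Auction(p(0), toFloatOrElse(p(1)), toFloatOrElse(p(2)), p(3),
            toIntOrElse(p(4)), toFloatOrElse(p(5)), toFloatOrElse(p(6)))
  }
}

// Usage against the RDD from the question:
// val ebay = ebayds.map(Auction.fromLine).toDF()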
