How to sum distances between data points in a dataframe using scala

I am getting the following error when I run the code below:

java.lang.ClassCastException: org.apache.spark.sql.Column cannot be cast to java.lang.Double
  at scala.runtime.BoxesRunTime.unboxToDouble(BoxesRunTime.java:114)
  at castingtoDouble(<console>:32)
  at distFrom(<console>:37)

val fullListCombination = joinSiteFacilityData.as("df1").
  join(joinSiteFacilityData.as("df2"), $"df1.siteId" < $"df2.siteId", "inner").
  select(
    $"df1.siteId".alias("siteId1"),
    $"df1.latitude".alias("latitude1"),
    $"df1.longitude".alias("longitude1"),
    $"df2.siteId".alias("siteId2"),
    $"df2.latitude".alias("latitude2"),
    $"df2.longitude".alias("longitude2")).
  withColumn("distance", lit(distFrom(
    toDouble(col("latitude1")), toDouble(col("longitude1")),
    toDouble(col("latitude2")), toDouble(col("longitude2")))))

def distFrom(lat1: Any, lng1: Any, lat2: Any, lng2: Any): Double = {
  val earthRadius = 3959 // Earth's mean radius in miles
  val dLat = Math.toRadians(castingtoDouble(lat2) - castingtoDouble(lat1))
  val dLng = Math.toRadians(castingtoDouble(lng2) - castingtoDouble(lng1))
  val a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
    Math.cos(Math.toRadians(castingtoDouble(lat1))) * Math.cos(Math.toRadians(castingtoDouble(lat2))) *
    Math.sin(dLng / 2) * Math.sin(dLng / 2)
  val c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
  val dist = (earthRadius * c).toFloat
  dist
}
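
A likely cause, offered as a sketch rather than a confirmed fix: distFrom is an ordinary Scala function, so inside withColumn it runs once on the driver with Column objects as its arguments, and castingtoDouble then fails when it tries to unbox a Column as a Double. Wrapping a Double-typed function in a UDF lets Spark call it per row with the actual values. The names haversineMiles, haversineUdf, sites and pairs, and the three sample rows, are made up for illustration; sites stands in for joinSiteFacilityData, and the casts assume the coordinate columns hold numeric values.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, sum, udf}

// In spark-shell a session already exists; getOrCreate simply returns it.
val spark = SparkSession.builder().master("local[*]").appName("distance-sketch").getOrCreate()
import spark.implicits._

// Haversine distance on plain Double values (no Column, no Any).
def haversineMiles(lat1: Double, lng1: Double, lat2: Double, lng2: Double): Double = {
  val earthRadiusMiles = 3959.0 // Earth's mean radius in miles
  val dLat = math.toRadians(lat2 - lat1)
  val dLng = math.toRadians(lng2 - lng1)
  val a = math.pow(math.sin(dLat / 2), 2) +
    math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
      math.pow(math.sin(dLng / 2), 2)
  2 * earthRadiusMiles * math.atan2(math.sqrt(a), math.sqrt(1 - a))
}

// The UDF is what Spark evaluates once per row.
val haversineUdf = udf(haversineMiles _)

// Hypothetical sample rows standing in for joinSiteFacilityData.
val sites = Seq(
  (1, 40.7128, -74.0060),
  (2, 34.0522, -118.2437),
  (3, 41.8781, -87.6298)
).toDF("siteId", "latitude", "longitude")

// Same self-join as in the question: each unordered pair of sites appears once.
val pairs = sites.as("df1").
  join(sites.as("df2"), $"df1.siteId" < $"df2.siteId", "inner").
  select(
    $"df1.siteId".alias("siteId1"),
    $"df1.latitude".alias("latitude1"),
    $"df1.longitude".alias("longitude1"),
    $"df2.siteId".alias("siteId2"),
    $"df2.latitude".alias("latitude2"),
    $"df2.longitude".alias("longitude2"))

// Pass Columns to the UDF, not to the Scala function itself.
val withDistance = pairs.withColumn(
  "distance",
  haversineUdf(
    col("latitude1").cast("double"), col("longitude1").cast("double"),
    col("latitude2").cast("double"), col("longitude2").cast("double")))

// Sum of all pairwise distances, as a one-row DataFrame.
val totalDistance = withDistance.agg(sum("distance").alias("totalDistance"))
totalDistance.show()
```

With the UDF in place, lit(...) is no longer needed around the distance expression, because the UDF call already returns a Column.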