Community Articles

Find and share helpful community-sourced technical articles

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Cloudera Community
- :
- Support
- :
- Community Articles
- :
- Geo Distance calculations in Hive and Java

Announcements

Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

- Article History
- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Email to a Friend
- Printer Friendly Page
- Report Inappropriate Content

- Article History
- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Email to a Friend
- Printer Friendly Page
- Report Inappropriate Content

Cloudera Employee

Created on 03-25-2016 07:01 PM - edited 08-17-2019 12:57 PM

March 23, 2016

Eran Orgad

Geographical distance is the distance measured along the surface of the earth.

The distances between points are defined by geographical coordinates (Geo Location cords) in terms of latitude and longitude.

Each pair represent a location as:

Lexington MA:

Latitude : 42.4428, Longitude: -71.2317

Mountain View CA:

Latitude: 37.405990600586, Longitude: -122.07851409912

Distance: 2676.09225228497 Miles

To be able to calculate the distance on Hortonworks HDP using Hive (and Tez) let’s generate some sample data that includes some pairs of Geolocation coordinates.

Data file called: distance1.csv

42.28,-71.87,42.28,-71.86

42.00,-71.87,42.28,-71.11

42.28,-71.87,42.28,-71.86

42.28,-71.87,42.28,-71.22

42.00,-71.87,42.28,-72.33

42.28,-71.87,42.28,-70.44

42.00,-71.87,42.28,-71.55

42.28,-71.87,41.28,-71.66

42.00,-71.87,43.28,-71.77

42.28,-71.87,44.28,-71.88

42.00,-71.87,45.28,-71.99

42.28,-71.87,46.11,-71.00

42.00,-71.87,47.22,-71.00

42.4428,-71.2317,37.405990600586,-122.07851409912

….

We can create a schema to be able to read the content in Hive

Here is the table I created:

hive> show create table distancecalc;

OK

CREATE EXTERNAL TABLE distancecalc(

src_lat double,

src_long double,

dest_lat double,

dest_long double)

ROW FORMAT DELIMITED

FIELDS TERMINATED BY ','

STORED AS INPUTFORMAT

'org.apache.hadoop.mapred.TextInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

LOCATION

'hdfs://sandbox.hortonworks.com:8020/tmp/dist2'

TBLPROPERTIES (

'COLUMN_STATS_ACCURATE'='false',

'numFiles'='1',

'numRows'='-1',

'rawDataSize'='-1',

'totalSize'='338',

'transient_lastDdlTime'='1458866694')

After placing my distance1.csv in the /tmp/dist2/ directory I can query the content in hive:

The following query will produce a calculation of distance with every given pair of geo location coordinates:

select src_lat, src_long, dest_lat, dest_long,

60*1.1515*(180*(acos(((sin(radians(src_lat))*sin(radians(dest_lat))) + (cos(radians(src_lat))*cos(radians(dest_lat))*cos(radians(src_long-dest_long))))))/PI()) as distancecalc

from distancecalc;

screen-shot-2016-03-25-at-122643-pm.png

Last entry is the distance between Lexington MA and Mountain View CA.

Java code to calculate the distance:

/*

* Class to calculate geo-location distance

* eorgad - Hortonworks.com 9/30/2014

*/

package geodistance;

import java.sql.SQLException;

public class calcGeoDistance {

public static double distance(double lat1, double lon1, double lat2, double lon2, String string) {

double theta = lon1 - lon2;

double dist = Math.sin(deg2rad(lat1)) * Math.sin(deg2rad(lat2)) + Math.cos(deg2rad(lat1)) * Math.cos(deg2rad(lat2)) * Math.cos(deg2rad(theta));

dist = Math.acos(dist);

dist = rad2deg(dist);

dist = dist * 60 * 1.1515;

if (string == "K") {

dist = dist * 1.609344;

} else if (string == "N") {

dist = dist * 0.8684;

}

return (dist);

}

/*###################################################################*/

/*# This function converts decimal degrees to radians #*/

/*###################################################################*/

privatestaticdouble deg2rad(double deg) {

return (deg * Math.PI / 180.0);

}

/*###################################################################*/

/*# This function converts radians to decimal degrees #*/

/*###################################################################*/

privatestaticdouble rad2deg(double rad) {

return (rad * 180 / Math.PI);

}

**public****static****void** main(String[] args) **throws** SQLException {

System.*out*.println(*distance*(42.4428,
-71.2317, 37.405990600586, -122.07851409912, "M") + " Miles\n");

// __lexington__ to mountain view

// 42.4428 -71.2317 37.405990600586 -122.07851409912

}

}

8,084 Views

Comments

Guru

Created on 04-01-2016 06:08 PM

- Mark as Read
- Mark as New
- Bookmark
- Permalink
- Email to a Friend
- Report Inappropriate Content

Coming from Hortonworks? Activate your account here

Version history

Top Kudoed Authors

User | Count |
---|---|

758 | |

379 | |

316 | |

309 | |

268 |