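In Elasticsearch, geographic locations are supported through the geo_point data type. A geo_point value must carry a latitude and a longitude, and ES rejects a new document whose coordinates are invalid. The type supports distance calculation, range queries, and similar operations. Under the hood, the index is implemented with Geohash.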
1. Create the index
Use PUT to create an index named cn_large_cities with a mapping type named city:
{"mappings":{"city":{"properties":{"city":{"type":"string"},"state":{"type":"string"},"location":{"type":"geo_point"}}}}}
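The geo_point type must be declared explicitly; ES cannot infer it from the data. A location can be written in three forms, as a string, an object, or an array:
# string: "lat,lon"
"location":"40.715,-74.011"
# object
"location":{"lat":40.715,"lon":-74.011}
# array: [lon, lat]
"location":[-74.011,40.715]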
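The array form puts longitude first, which is easy to get wrong. The following is a minimal sketch, not ES code, of how a client could normalize the three forms to a (lat, lon) pair before indexing. The function name parse_geo_point and the range check are assumptions made for this illustration.

```python
from typing import Sequence, Tuple, Union

GeoInput = Union[str, dict, Sequence[float]]


def parse_geo_point(value: GeoInput) -> Tuple[float, float]:
    """Return (lat, lon) for any of the three geo_point forms (hypothetical helper)."""
    if isinstance(value, str):
        # String form is "lat,lon"
        lat_text, lon_text = value.split(",")
        lat, lon = float(lat_text), float(lon_text)
    elif isinstance(value, dict):
        # Object form; values may be numbers or numeric strings
        lat, lon = float(value["lat"]), float(value["lon"])
    else:
        # Array form is [lon, lat], the reverse of the string form
        lon, lat = float(value[0]), float(value[1])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"invalid coordinates: lat={lat}, lon={lon}")
    return lat, lon


print(parse_geo_point("40.715,-74.011"))
print(parse_geo_point({"lat": 40.715, "lon": -74.011}))
print(parse_geo_point([-74.011, 40.715]))
```

All three calls describe the same point, so they print the same pair.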
POST the following 5 test documents:
{"city":"Beijing", "state":"BJ","location":{"lat":"39.91667", "lon":"116.41667"}}
{"city":"Shanghai", "state":"SH","location":{"lat":"34.50000", "lon":"121.43333"}}
{"city":"Xiamen", "state":"FJ","location":{"lat":"24.46667", "lon":"118.10000"}}
{"city":"Fuzhou", "state":"FJ","location":{"lat":"26.08333", "lon":"119.30000"}}
{"city":"Guangzhou", "state":"GD","location":{"lat":"23.16667", "lon":"113.23333"}}
List all documents:
curl -XGET "http://localhost:9200/cn_large_cities/city/_search?pretty=true"
All 5 documents are returned, each with a score of 1.
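2. Location filters
ES has 4 location-related filters:
- geo_distance: finds locations within a given distance of a center point
- geo_bounding_box: finds locations inside a rectangle
- geo_distance_range: finds locations whose distance from a center point lies between min and max
- geo_polygon: finds locations inside a polygon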
geo_distance
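This filter matches the points that lie inside a circle of a given radius around a center point. An example query:
{"query":{"filtered":{"filter":{"geo_distance":{"distance":"1km","location":{"lat":40.715,"lon":-73.988}}}}}}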
The following query finds the cities within 500 km of Xiamen:
{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"500km","location" :{"lat" :24.46667,"lon" :118.10000}}}}}}
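To make the semantics of the filter concrete, here is a rough client-side sketch that applies the same 500 km test to the five test cities. It uses the haversine formula with a mean Earth radius of 6371 km, which is an assumption of this sketch; ES may compute distances slightly differently, so the numbers need not agree to the last digit.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch

# (lat, lon) of the test documents above
CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cities_within(center, radius_km):
    """Names of the cities whose distance from center is at most radius_km."""
    return [name for name, (lat, lon) in CITIES.items()
            if haversine_km(center[0], center[1], lat, lon) <= radius_km]


print(cities_within(CITIES["Xiamen"], 500))
```

With this data only Xiamen itself and Fuzhou fall inside the circle; Guangzhou lies a little over 500 km away.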
geo_distance_range
{"query":{"filtered":{"filter":{"geo_distance_range":{"gte":"1km","lt":"2km","location":{"lat":40.715,"lon":-73.988}}}}}}
geo_bounding_box
{"query":{"filtered":{"filter":{"geo_bounding_box":{"location":{"top_left":{"lat":40.8,"lon":-74.0},"bottom_right":{"lat":40.715,"lon":-73.0}}}}}}
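The rectangle is defined by its top-left and bottom-right corners. A small sketch of the containment test follows; the helper name in_bounding_box is made up for this example, and the check assumes the box does not cross the 180° meridian.

```python
def in_bounding_box(lat, lon, top_left, bottom_right):
    """True if (lat, lon) lies inside the rectangle.

    Assumes the box does not cross the 180-degree meridian.
    """
    return (bottom_right["lat"] <= lat <= top_left["lat"]
            and top_left["lon"] <= lon <= bottom_right["lon"])


# Corners taken from the query above
TOP_LEFT = {"lat": 40.8, "lon": -74.0}
BOTTOM_RIGHT = {"lat": 40.715, "lon": -73.0}

print(in_bounding_box(40.75, -73.9, TOP_LEFT, BOTTOM_RIGHT))       # inside the box
print(in_bounding_box(24.46667, 118.1, TOP_LEFT, BOTTOM_RIGHT))    # Xiamen, outside
```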
3. Sorting by distance
Next, we sort the cities by their distance from Xiamen:
{"sort" :[
{"_geo_distance" :{"location" :{"lat" :24.46667,"lon" :118.10000}, "order" :"asc","unit" :"km"}}
],"query":{"filtered" :{"query" :{"match_all" :{}}}}}
The result lists Xiamen, Fuzhou, Guangzhou, and so on, which matches what we would expect:
{"took":8,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":5,"max_score":null,"hits":[
{"_index":"cn_large_cities","_type":"city","_id":"AVaiSGXXjL0tfmRppc_p","_score":null,"_source":{"city":"Xiamen","state":"FJ","location":{"lat":"24.46667","lon":"118.10000"}},"sort":[0]},
{"_index":"cn_large_cities","_type":"city","_id":"AVaiSSuNjL0tfmRppc_r","_score":null,"_source":{"city":"Fuzhou","state":"FJ","location":{"lat":"26.08333","lon":"119.30000"}},"sort":[216.61105485607183]},
{"_index":"cn_large_cities","_type":"city","_id":"AVaiSd02jL0tfmRppc_s","_score":null,"_source":{"city":"Guangzhou","state":"GD","location":{"lat":"23.16667","lon":"113.23333"}},"sort":[515.9964950041397]},
{"_index":"cn_large_cities","_type":"city","_id":"AVaiR7_5jL0tfmRppc_o","_score":null,"_source":{"city":"Shanghai","state":"SH","location":{"lat":"34.50000","lon":"121.43333"}},"sort":[1161.512141925948]},
{"_index":"cn_large_cities","_type":"city","_id":"AVaiRwLUjL0tfmRppc_n","_score":null,"_source":{"city":"Beijing","state":"BJ","location":{"lat":"39.91667","lon":"116.41667"}},"sort":[1725.4543712286697]}
]}}
The sort value in each hit is the distance in kilometers. Adding from and size limits the response to the single nearest city:
{"from":0,"size":1,"sort" :[
{"_geo_distance" :{"location" :{"lat" :24.46667,"lon" :118.10000}, "order" :"asc","unit" :"km"}}
],"query":{"filtered" :{"query" :{"match_all" :{}}}}}
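When the center point or the number of hits varies, the request body can be built programmatically. The sketch below assembles the same body as the query above from a Python dict; the function name nearest_query is an assumption of this example, and sending the request to ES is left out.

```python
import json


def nearest_query(lat, lon, size=1):
    """Build a search body that returns the `size` documents nearest to (lat, lon)."""
    return {
        "from": 0,
        "size": size,
        "sort": [
            {"_geo_distance": {
                "location": {"lat": lat, "lon": lon},
                "order": "asc",   # nearest first
                "unit": "km",     # sort values are reported in kilometers
            }}
        ],
        "query": {"filtered": {"query": {"match_all": {}}}},
    }


body = json.dumps(nearest_query(24.46667, 118.10000))
print(body)
```

Raising size to 3 would return the three nearest cities instead of one.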
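4. Location aggregations
ES provides 3 location aggregations:
- geo_distance: groups documents by their distance from a given center point
- geohash_grid: groups documents by Geohash cell
- geo_bounds: computes the bounding rectangle that encloses all matching locations
4.1 geo_distance aggregation
The following query groups the cities by their distance from Xiamen into two buckets, 0-500 km and 500-8000 km: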
{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"10000km","location" :{"lat" :24.46667,"lon" :118.10000}}}}},"aggs":{"per_ring":{"geo_distance":{"field":"location","unit":"km","origin":{"lat" :24.46667,"lon" :118.10000},"ranges":[
{"from":0, "to":500},
{"from":500, "to":8000}
]}}}}
The aggregation result:
"aggregations": {"per_ring":{"buckets":[
{"key":"*-500.0","from":0,"from_as_string":"0.0","to":500,"to_as_string":"500.0","doc_count":2},
{"key":"500.0-8000.0","from":500,"from_as_string":"500.0","to":8000,"to_as_string":"8000.0","doc_count":3}
]}}
As shown, 2 cities lie within 0-500 km of Xiamen and 3 lie within 500-8000 km.
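The bucket counts can be cross-checked offline. This is a rough sketch that assigns each test city to a ring by its haversine distance from Xiamen. It assumes a mean Earth radius of 6371 km and treats each range as including from and excluding to, so it illustrates the idea and is not a reimplementation of ES.

```python
import math

CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def haversine_km(p, q, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) pairs; radius is an assumed mean value."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dlambda = math.radians(q[1] - p[1])
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def ring_counts(origin, ranges):
    """Count cities per [from, to) distance ring around origin."""
    counts = {}
    for low, high in ranges:
        counts[f"{low}-{high}"] = sum(
            1 for point in CITIES.values()
            if low <= haversine_km(origin, point) < high
        )
    return counts


print(ring_counts(CITIES["Xiamen"], [(0, 500), (500, 8000)]))
```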
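4.2 geohash_grid aggregation
This aggregation groups geo_point values by the geohash cell they fall into. The precision property controls how finely the cells are divided: it is the number of subdivision levels, which is also the length of the geohash key, so a higher value means smaller cells.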
{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"10000km","location" :{"lat" :24.46667,"lon" :118.10000}}}}},"aggs":{"grid_agg":{"geohash_grid":{"field":"location","precision":2}}}}
The aggregation result:
"aggregations": {"grid_agg":{"buckets":[
{"key":"ws","doc_count":3},
{"key":"wx","doc_count":1},
{"key":"ww","doc_count":1}
]}}
As shown, three cities fall into the cell with geohash ws. Raising the precision to 5 gives the following result:
"aggregations": {"grid_agg":{"buckets":[
{"key":"wx4g1","doc_count":1},
{"key":"wwnk7","doc_count":1},
{"key":"wssu6","doc_count":1},
{"key":"ws7gp","doc_count":1},
{"key":"ws0eb","doc_count":1}
]}}
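To see where keys such as ws come from, here is a small sketch of the standard geohash encoding: the longitude and latitude ranges are halved in turn, one bit per step, and every 5 bits become one base-32 character. The sketch is independent of ES, and the function name geohash_encode is chosen for this example.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet

CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def geohash_encode(lat, lon, precision):
    """Encode (lat, lon) as a geohash string of length `precision`."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # 5 bits make one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


grid = Counter(geohash_encode(lat, lon, 2) for lat, lon in CITIES.values())
print(dict(grid))
```

At precision 2 the five cities collapse into the cells ws, wx, and ww, matching the bucket counts of the first aggregation.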
4.3 geo_bounds aggregation
This aggregation computes the smallest rectangle that covers the geo_point values of all documents matched by the query:
{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"10000km","location" :{"lat" :24.46667,"lon" :118.10000}}}}},"aggs":{"map-zoom":{"geo_bounds":{"field":"location"}}}}
The result:
"aggregations": {"map-zoom":{"bounds":{"top_left":{"lat":39.91666993126273,"lon":113.2333298586309},"bottom_right":{"lat":23.16666992381215,"lon":121.43332997336984}}}}
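In other words, the rectangle spanned by these two points covers every city within 10000 km of Xiamen. If we reduce the distance to 500 km, the rectangle covering the matching cities becomes: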
"aggregations": {"map-zoom":{"bounds":{"top_left":{"lat":26.083329990506172,"lon":118.0999999679625},"bottom_right":{"lat":24.46666999720037,"lon":119.29999999701977}}}}
5. References
An illustrated explanation of how MongoDB's geospatial index works: http://blog.nosqlfan.com/html/1811.html
The geo_point data type: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html