
ElasticSearch Location Search - Spring , Hadoop, Spark , BI , ML - CSDN Blog


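In ElasticSearch, geographic locations are supported through the geo_point data type. Location data must supply a latitude and a longitude; when they are invalid, ES rejects the new document. This data type supports distance calculations, range queries, and so on. Under the hood, the index is implemented with Geohash.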

1. Creating the index

Use PUT to create an index named cn_large_cities, with a mapping for the type city:

{"mappings":{"city":{"properties":{"city":{"type":"string"},"state":{"type":"string"},"location":{"type":"geo_point"}}}}}

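The geo_point type must be declared explicitly; ES cannot infer it from the data. In ES, location data can be represented in three forms: as a string, as an object, or as an array. They look like this: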

# "lat,lon"
"location":"40.715,-74.011"

# object
"location": {"lat":40.715,"lon":-74.011}

# [lon ,lat]
"location":[-74.011,40.715]
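
The array form is easy to get wrong because it puts the longitude first, while the string form puts the latitude first. As a minimal sketch (the helper name parse_geo_point is hypothetical and is not part of ES, and geohash strings are deliberately not handled), the three forms can be normalized to a (lat, lon) pair like this:

```python
def parse_geo_point(value):
    """Return (lat, lon) as floats from one of the three forms shown above.

    Hypothetical helper for illustration only; it is not an ES API.
    """
    if isinstance(value, dict):              # {"lat": ..., "lon": ...}
        lat, lon = float(value["lat"]), float(value["lon"])
    elif isinstance(value, str):             # "lat,lon"
        lat_text, lon_text = value.split(",")
        lat, lon = float(lat_text), float(lon_text)
    elif isinstance(value, (list, tuple)):   # [lon, lat], longitude first
        lon, lat = float(value[0]), float(value[1])
    else:
        raise TypeError("unsupported geo_point form: %r" % (value,))

    # Reject coordinates outside the valid latitude/longitude ranges.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("invalid coordinates: lat=%s, lon=%s" % (lat, lon))
    return lat, lon
```

All three representations from the listing above resolve to the same point, latitude 40.715 and longitude -74.011.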

POST the following 5 test documents:

{"city":"Beijing", "state":"BJ","location":{"lat":"39.91667", "lon":"116.41667"}}

{"city":"Shanghai", "state":"SH","location":{"lat":"34.50000", "lon":"121.43333"}}

{"city":"Xiamen", "state":"FJ","location":{"lat":"24.46667", "lon":"118.10000"}}

{"city":"Fuzhou", "state":"FJ","location":{"lat":"26.08333", "lon":"119.30000"}}

{"city":"Guangzhou", "state":"GD","location":{"lat":"23.16667", "lon":"113.23333"}}

View all documents:

curl -XGET "http://localhost:9200/cn_large_cities/city/_search?pretty=true"

All 5 documents are returned, each with a score of 1.

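2. Location filters

ES has 4 location-related filters for filtering on location data:

  • geo_distance: finds locations within a given distance of a center point
  • geo_bounding_box: finds locations inside a rectangular area
  • geo_distance_range: finds locations whose distance from a center point is between min and max
  • geo_polygon: finds locations inside a polygon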

geo_distance

This filter matches locations that lie inside a circle of the given radius around a center point.

Here is an example query:

{"query":{"filtered":{"filter":{"geo_distance":{"distance":"1km","location":{"lat":40.715,"lon":-73.988}}}}}}

The following query finds cities within 500 km of Xiamen:

{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"500km","location" :{"lat" :24.46667,"lon" :118.10000}}}}}}
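
To see which of the test cities such a filter should match, the distances can be checked outside ES. The sketch below uses the haversine great-circle formula with an assumed mean Earth radius of 6371.0088 km; ES may compute distances differently, so small deviations from its results are expected. The names haversine_km and cities_within are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius, not taken from ES

# The five test documents from section 1, as (lat, lon) pairs.
CITIES = {
    "Beijing": (39.91667, 116.41667),
    "Shanghai": (34.50000, 121.43333),
    "Xiamen": (24.46667, 118.10000),
    "Fuzhou": (26.08333, 119.30000),
    "Guangzhou": (23.16667, 113.23333),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cities_within(center_lat, center_lon, radius_km):
    """Names of the test cities within radius_km, nearest first."""
    hits = []
    for name, (lat, lon) in CITIES.items():
        distance = haversine_km(center_lat, center_lon, lat, lon)
        if distance <= radius_km:
            hits.append((distance, name))
    return [name for distance, name in sorted(hits)]


print(cities_within(24.46667, 118.10000, 500))
```

With the test data above, only Xiamen itself and Fuzhou fall inside the 500 km circle; Guangzhou lies just outside it.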

geo_distance_range

{"query":{"filtered":{"filter":{"geo_distance_range":{"gte":"1km","lt":"2km","location":{"lat":40.715,"lon":-73.988}}}}}}

geo_bounding_box

{"query":{"filtered":{"filter":{"geo_bounding_box":{"location":{"top_left":{"lat":40.8,"lon":-74.0},"bottom_right":{"lat":40.715,"lon":-73.0}}}}}}}
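
The bounding box test itself is just two range checks. A minimal sketch, assuming the box does not cross the 180° meridian (the function name in_bounding_box is illustrative, not an ES API):

```python
def in_bounding_box(lat, lon, top_left, bottom_right):
    """True if (lat, lon) lies inside the box given by its two corners.

    Assumes the box does not cross the 180-degree meridian.
    """
    lat_ok = bottom_right["lat"] <= lat <= top_left["lat"]
    lon_ok = top_left["lon"] <= lon <= bottom_right["lon"]
    return lat_ok and lon_ok


box_top_left = {"lat": 40.8, "lon": -74.0}
box_bottom_right = {"lat": 40.715, "lon": -73.0}

print(in_bounding_box(40.73, -73.9, box_top_left, box_bottom_right))
print(in_bounding_box(40.715, -74.011, box_top_left, box_bottom_right))
```

The first point is inside the box from the query above; the second is outside because its longitude lies west of the top_left corner.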

3. Sorting by distance

Next, we sort the cities by their distance from Xiamen:

{"sort" :[
      {"_geo_distance" :{"location" :{"lat" :24.46667,"lon" :118.10000}, "order" :"asc","unit" :"km"}}
  ],"query":{"filtered" :{"query" :{"match_all" :{}}}}}

The results are as follows, in order: Xiamen, Fuzhou, Guangzhou, and so on, which matches common sense:

{"took":8,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":5,"max_score":null,"hits":[
      {"_index":"us_large_cities","_type":"city","_id":"AVaiSGXXjL0tfmRppc_p","_score":null,"_source":{"city":"Xiamen","state":"FJ","location":{"lat":"24.46667","lon":"118.10000"}},"sort":[0]},
      {"_index":"us_large_cities","_type":"city","_id":"AVaiSSuNjL0tfmRppc_r","_score":null,"_source":{"city":"Fuzhou","state":"FJ","location":{"lat":"26.08333","lon":"119.30000"}},"sort":[216.61105485607183]},
      {"_index":"us_large_cities","_type":"city","_id":"AVaiSd02jL0tfmRppc_s","_score":null,"_source":{"city":"Guangzhou","state":"GD","location":{"lat":"23.16667","lon":"113.23333"}},"sort":[515.9964950041397]},
      {"_index":"us_large_cities","_type":"city","_id":"AVaiR7_5jL0tfmRppc_o","_score":null,"_source":{"city":"Shanghai","state":"SH","location":{"lat":"34.50000","lon":"121.43333"}},"sort":[1161.512141925948]},
      {"_index":"us_large_cities","_type":"city","_id":"AVaiRwLUjL0tfmRppc_n","_score":null,"_source":{"city":"Beijing","state":"BJ","location":{"lat":"39.91667","lon":"116.41667"}},"sort":[1725.4543712286697]}
    ]}}

The sort field in each hit is the distance in kilometers. Adding a size limit returns only the nearest city:

{"from":0,"size":1,"sort" :[
      {"_geo_distance" :{"location" :{"lat" :24.46667,"lon" :118.10000}, "order" :"asc","unit" :"km"}}
  ],"query":{"filtered" :{"query" :{"match_all" :{}}}}}

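4. Geo aggregations

ES provides 3 geo aggregations:

  • geo_distance: aggregates by distance from a given center point
  • geohash_grid: aggregates by Geohash cell
  • geo_bounds: aggregates by area, returning the bounding box that covers the points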

4.1 geo_distance aggregation

The following query aggregates by distance from Xiamen and returns buckets for 0-500 km and 500-8000 km:

{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"10000km","location" :{"lat" :24.46667,"lon" :118.10000}}}}},"aggs":{"per_ring":{"geo_distance":{"field":"location","unit":"km","origin":{"lat" :24.46667,"lon" :118.10000},"ranges":[
                    {"from":0, "to":500},
                    {"from":500, "to":8000}
                ]}}}}

The aggregation result is as follows:

"aggregations": {"per_ring":{"buckets":[
        {"key":"*-500.0","from":0,"from_as_string":"0.0","to":500,"to_as_string":"500.0","doc_count":2},
        {"key":"500.0-8000.0","from":500,"from_as_string":"500.0","to":8000,"to_as_string":"8000.0","doc_count":3}
      ]}}

As you can see, 2 cities are within 0-500 km of Xiamen and 3 cities are within 500-8000 km.
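
These counts can be reproduced from the distances that the sort query in section 3 returned. The sketch below assumes each range includes its from value and excludes its to value; the function name bucket_counts is illustrative.

```python
def bucket_counts(distances, ranges):
    """Count how many distances fall into each [low, high) range."""
    counts = []
    for low, high in ranges:
        counts.append(sum(1 for d in distances if low <= d < high))
    return counts


# Distances in km, copied from the "sort" values in section 3.
distances_km = [
    0.0,                  # Xiamen
    216.61105485607183,   # Fuzhou
    515.9964950041397,    # Guangzhou
    1161.512141925948,    # Shanghai
    1725.4543712286697,   # Beijing
]

print(bucket_counts(distances_km, [(0, 500), (500, 8000)]))
```

The result agrees with the doc_count values of the two buckets above.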

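4.2 geohash_grid aggregation

This aggregation groups geo_point values by the cell that their geohash falls into. The granularity of the cells is controlled by the precision property, which is the length of the geohash key: each additional character subdivides a cell once more.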

{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"10000km","location" :{"lat" :24.46667,"lon" :118.10000}}}}},"aggs":{"grid_agg":{"geohash_grid":{"field":"location","precision":2}}}}

The aggregation result is as follows:

"aggregations": {"grid_agg":{"buckets":[
        {"key":"ws","doc_count":3},
        {"key":"wx","doc_count":1},
        {"key":"ww","doc_count":1}
      ]}}

As you can see, 3 cities fall into the geohash cell ws. Raising the precision to 5 gives the following result:

"aggregations": {"grid_agg":{"buckets":[
        {"key":"wx4g1","doc_count":1},
        {"key":"wwnk7","doc_count":1},
        {"key":"wssu6","doc_count":1},
        {"key":"ws7gp","doc_count":1},
        {"key":"ws0eb","doc_count":1}
      ]}}
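
To illustrate where these keys come from, here is a minimal sketch of the standard Geohash encoding: the longitude and latitude ranges are halved alternately, starting with longitude, and every 5 bits become one base-32 character. This is not ES's own code, and the function name geohash_encode is illustrative.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision):
    """Encode a point as a geohash string of the given length."""
    lat_low, lat_high = -90.0, 90.0
    lon_low, lon_high = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_low + lon_high) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_low = mid
            else:
                bits = bits << 1
                lon_high = mid
        else:
            mid = (lat_low + lat_high) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_low = mid
            else:
                bits = bits << 1
                lat_high = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # 5 bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash_encode(39.91667, 116.41667, 2))  # Beijing, prints "wx"
print(geohash_encode(39.91667, 116.41667, 5))  # Beijing, prints "wx4g1"
```

A lower-precision key is always a prefix of the higher-precision key for the same point, which is why the precision 2 buckets above split into finer cells at precision 5.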

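4.3 geo_bounds aggregation

This aggregation computes the smallest area that covers all geo_point values in the query results. What it returns is the smallest rectangle that covers all of the locations: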

{"query":{"filtered":{"filter":{"geo_distance" :{"distance" :"10000km","location" :{"lat" :24.46667,"lon" :118.10000}}}}},"aggs":{"map-zoom":{"geo_bounds":{"field":"location"}}}}

The result is as follows:

"aggregations": {"map-zoom":{"bounds":{"top_left":{"lat":39.91666993126273,"lon":113.2333298586309},"bottom_right":{"lat":23.16666992381215,"lon":121.43332997336984}}}}

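In other words, the rectangle defined by these two points contains every city within 10000 km of Xiamen. If we reduce the distance to 500 km, the rectangle covering the matching cities becomes: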

"aggregations": {"map-zoom":{"bounds":{"top_left":{"lat":26.083329990506172,"lon":118.0999999679625},"bottom_right":{"lat":24.46666999720037,"lon":119.29999999701977}}}}

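The two corners are simply the extremes of the matching coordinates. A minimal sketch, assuming the points do not straddle the 180° meridian (the function name geo_bounds here is an illustrative helper, not the ES implementation):

```python
def geo_bounds(points):
    """Smallest lat/lon rectangle covering all (lat, lon) points.

    Assumes the points do not straddle the 180-degree meridian.
    """
    lats = [lat for lat, lon in points]
    lons = [lon for lat, lon in points]
    return {
        "top_left": {"lat": max(lats), "lon": min(lons)},
        "bottom_right": {"lat": min(lats), "lon": max(lons)},
    }


all_cities = [
    (39.91667, 116.41667),  # Beijing
    (34.50000, 121.43333),  # Shanghai
    (24.46667, 118.10000),  # Xiamen
    (26.08333, 119.30000),  # Fuzhou
    (23.16667, 113.23333),  # Guangzhou
]

print(geo_bounds(all_cities))
```

For the five test cities this gives the same corners as the first result above, up to the small rounding differences visible in the ES output.
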
5. References

An illustrated guide to how MongoDB geospatial indexes are implemented (图解 MongoDB 地理位置索引的实现原理): http://blog.nosqlfan.com/html/1811.html
Geopoint data type: https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html

