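Contents

  • 01. Data preparation
  • 02. How do you query all documents in Elasticsearch?
  • 03. How do you specify the number of search results in Elasticsearch?
  • 04. What pagination methods does Elasticsearch offer?
  • 05. How do you paginate with from + size in Elasticsearch?
  • 06. How do you paginate with search_after in Elasticsearch?
  • 07. How do you paginate with scroll in Elasticsearch?
  • 08. What is deep paging in Elasticsearch?
  • 09. What is the maximum limit for paginated queries in Elasticsearch?
  • 10. How do you lift the pagination limit in Elasticsearch?
  • 11. What is the maximum total hit count Elasticsearch reports?
  • 12. How do you lift the limit on the total hit count in Elasticsearch?
  • 13. How can paginated queries be optimized in Elasticsearch?
  • 14. Spring Boot + ES: from + size pagination
  • 15. Spring Boot + ES: search_after pagination
  • 16. Spring Boot + ES: scroll pagination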

01. Data preparation

Index the following 12 documents into the my_index index:

PUT /my_index/_doc/1
{"title": "文雅酒店", "content": "青岛", "price": 556}

PUT /my_index/_doc/2
{"title": "金都嘉怡假日酒店", "content": "北京", "price": 337}

PUT /my_index/_doc/3
{"title": "金都欣欣酒店", "content": "天津", "price": 200}

PUT /my_index/_doc/4
{"title": "金都酒店", "content": "上海", "price": 300}

PUT /my_index/_doc/5
{"title": "自如酒店", "content": "南京", "price": 400}

PUT /my_index/_doc/6
{"title": "如家酒店", "content": "杭州", "price": 500}

PUT /my_index/_doc/7
{"title": "非常酒店", "content": "合肥", "price": 600}

PUT /my_index/_doc/8
{"title": "金都酒店", "content": "淮北", "price": 700}

PUT /my_index/_doc/9
{"title": "金都酒店", "content": "淮南", "price": 900}

PUT /my_index/_doc/10
{"title": "丽舍酒店", "content": "阜阳", "price": 1000}

PUT /my_index/_doc/11
{"title": "文轩酒店", "content": "蚌埠", "price": 1020}

PUT /my_index/_doc/12
{"title": "大理酒店", "content": "长沙", "price": 1100}

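02. How do you query all documents in Elasticsearch?

A search request with no body matches every document:

GET /my_index/_search

The response shows that the index holds 12 documents (hits.total.value = 12), yet the hits array contains only 10 of them. How do we see the rest?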

{"took" : 688,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"title" : "金都嘉怡假日酒店","content" : "北京","price" : 337}},{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"title" : "金都欣欣酒店","content" : "天津","price" : 200}},{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 1.0,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 1.0,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 1.0,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 1.0,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 1.0,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 1.0,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 1.0,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000}}]}
}

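03. How do you specify the number of search results in Elasticsearch?

Elasticsearch accepts the from and size parameters:

from: the number of initial results to skip; defaults to 0
size: the number of results to return; defaults to 10

Because from and size default to 0 and 10, a request that sets neither returns the first 10 hits. That is why hits.total.value is 12 while the hits array holds only 10 documents.

To return more results, set size explicitly:

GET /my_index/_search
{
  "size": 15
}

The index holds 12 documents, so size=15 returns all of them: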

{"took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"title" : "金都嘉怡假日酒店","content" : "北京","price" : 337}},{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"title" : "金都欣欣酒店","content" : "天津","price" : 200}},{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 1.0,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 1.0,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 1.0,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 1.0,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 1.0,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 1.0,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 1.0,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000}},{"_index" : "my_index","_type" : "_doc","_id" : "11","_score" : 1.0,"_source" : {"title" : "文轩酒店","content" : "蚌埠","price" : 1020}},{"_index" : "my_index","_type" : "_doc","_id" : "12","_score" : 1.0,"_source" : {"title" : "大理酒店","content" : "长沙","price" : 1100}}]}
}

04. What pagination methods does Elasticsearch offer?

The from and size parameters.
The scroll API.
The search_after parameter.

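05. How do you paginate with from + size in Elasticsearch?

In Elasticsearch, the from and size parameters implement paginated search: from sets which hit to start at, and size sets how many hits to return. For example:

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "from": 0,
  "size": 3
}

Here from: 0 starts at the first hit and size: 3 returns three hits. All 12 documents match, and the response contains the first 3: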

{"took" : 19,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 0.075949445,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.075949445,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 0.075949445,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 0.075949445,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}}]}
}

In the request above, from sets the offset of the first hit to return and size sets how many hits to return. With from=0 and size=10 the response contains hits 1 to 10; with from=10 and size=10 it contains hits 11 to 20.
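User interfaces usually work with a 1-based page number rather than a raw offset. The following is a minimal sketch of the conversion, using only the JDK; the class name PageMath and the method fromOffset are hypothetical helpers introduced for illustration and are not part of any Elasticsearch API.

```java
// Minimal sketch: convert a 1-based page number into the "from" offset.
// PageMath and fromOffset are hypothetical names, not Elasticsearch APIs.
final class PageMath {
    private PageMath() {}

    /** Offset of the first hit on the given 1-based page. */
    static int fromOffset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(page - 1, pageSize);
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 10)); // offset for hits 1 to 10
        System.out.println(fromOffset(2, 10)); // offset for hits 11 to 20
    }
}
```

The result is passed as from, with pageSize passed as size.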

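06. How do you paginate with search_after in Elasticsearch?

The search_after parameter pages through large result sets without the cost of a large from offset. Instead of skipping hits, each request passes the sort values of the last hit from the previous page, and Elasticsearch returns the hits that sort after it. The request is stateless: no search context is kept on the server between pages, and each page holds only size hits, so the full result set never has to be loaded into memory at once.

Because search_after reads from a given position onward, it cannot jump to an arbitrary page; pages must be read one after another. The sort must also produce a unique value for every document (for example by sorting on a unique field or adding one as a tiebreaker), otherwise documents that share a sort value can be skipped or repeated across pages.

POST /my_index/_search
{
  "size": 3,
  "query": {"match": {"title": "酒店"}},
  "sort": [{"price": "asc"}],
  "track_total_hits": true
}

This request searches my_index for documents whose title matches 酒店, returns 3 hits per page, and sorts them by price in ascending order. Each hit in the response carries a sort array, which the next request uses as its cursor. Setting track_total_hits to true asks for an exact total hit count.

The total hit count hits.total.value is 12, and 3 hits are returned: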

{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : null,"_source" : {"title" : "金都欣欣酒店","content" : "天津","price" : 200},"sort" : [200]},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : null,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300},"sort" : [300]},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : null,"_source" : {"title" : "金都嘉怡假日酒店","content" : "北京","price" : 337},"sort" : [337]}]}
}

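Next, pass the sort value of the last hit (337) as search_after to fetch the following page:

POST /my_index/_search
{
  "size": 3,
  "query": {"match": {"title": "酒店"}},
  "sort": [{"price": "asc"}],
  "search_after": [337]
}

The response contains the next three hits:
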
{"took" : 4,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : null,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400},"sort" : [400]},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : null,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500},"sort" : [500]},{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : null,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556},"sort" : [556]}]}
}
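
The cursor logic behind search_after can be illustrated without a cluster. The sketch below is an in-memory analogy, not how Elasticsearch is implemented: it sorts the 12 sample prices, uses the document id as an assumed tiebreaker, and returns the entries that sort strictly after the last entry of the previous page. The names SearchAfterSketch, Doc, and page are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory analogy of search_after paging; all names here are hypothetical.
final class SearchAfterSketch {
    /** Stand-in for a hit: only the fields the sort needs. */
    static final class Doc {
        final int id;
        final int price;
        Doc(int id, int price) { this.id = id; this.price = price; }
    }

    // Sort by price, then by id as a tiebreaker so every sort key is unique.
    static final Comparator<Doc> ORDER =
            Comparator.comparingInt((Doc d) -> d.price).thenComparingInt(d -> d.id);

    /** Up to size docs that sort strictly after the cursor; a null cursor means the first page. */
    static List<Doc> page(List<Doc> docs, Doc after, int size) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(ORDER);
        List<Doc> out = new ArrayList<>();
        for (Doc d : sorted) {
            if (after != null && ORDER.compare(d, after) <= 0) {
                continue; // at or before the cursor: already delivered
            }
            out.add(d);
            if (out.size() == size) {
                break;
            }
        }
        return out;
    }

    /** The 12 sample documents from the data preparation step (id, price). */
    static List<Doc> sampleDocs() {
        int[] prices = {556, 337, 200, 300, 400, 500, 600, 700, 900, 1000, 1020, 1100};
        List<Doc> docs = new ArrayList<>();
        for (int i = 0; i < prices.length; i++) {
            docs.add(new Doc(i + 1, prices[i]));
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        Doc cursor = null;
        List<Doc> hits;
        while (!(hits = page(docs, cursor, 3)).isEmpty()) {
            StringBuilder line = new StringBuilder();
            for (Doc d : hits) {
                line.append(d.price).append(' ');
            }
            System.out.println(line.toString().trim());
            cursor = hits.get(hits.size() - 1); // last hit becomes the next cursor
        }
    }
}
```

The first two pages produced by this sketch carry the prices 200, 300, 337 and 400, 500, 556, the same order as the responses above.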

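07. How do you paginate with scroll in Elasticsearch?

The Scroll API retrieves large result sets in batches. The first request creates a search context that is kept alive on the server between requests, and each subsequent request returns the next batch of hits, so a large data set can be processed step by step without loading it all into memory at once.

The first request stores a snapshot of the index state together with a cursor (scroll_id) that records where the previous batch ended; the next request continues from that position. Scroll performs well for sequential reads and is typically used for bulk export or reindexing. The scroll_id expires, though: if it times out between two requests, the second request fails with an error saying the scroll_id cannot be found.

To enable scrolling, set the scroll parameter on the search request to the keep-alive time you want. The keep-alive is refreshed on every scroll request, so it only needs to be long enough to process the current batch, not the entire result set. Choosing this value matters because keeping the search context open consumes resources, and they should be released as soon as they are no longer needed. The timeout lets Elasticsearch free them automatically once the scroll has been idle for that long.

① Run the initial query to obtain a scroll_id. The scroll parameter sets how long the search context stays alive, here 1 minute; size sets the batch size, here 7 hits per request.

POST /my_index/_search?scroll=1m
{
  "size": 7,
  "query": {"match": {"title": "酒店"}}
}

The response contains a _scroll_id to use in subsequent requests, for example: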

{"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==","took" : 0,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 0.06382885,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.06382885,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 0.06382885,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 0.06382885,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 0.06382885,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}}]}
}

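② Use the scroll_id to fetch the next batch:

POST /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ=="
}

The response contains the next batch and a _scroll_id. The ID may or may not change between requests, so always pass the most recently returned one: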

{"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==","took" : 4,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 0.06382885,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 0.06382885,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000,"uploadTime" : 1678073241}},{"_index" : "my_index","_type" : "_doc","_id" : "11","_score" : 0.06382885,"_source" : {"title" : "文轩酒店","content" : "蚌埠","price" : 1020}},{"_index" : "my_index","_type" : "_doc","_id" : "12","_score" : 0.06382885,"_source" : {"title" : "大理酒店","content" : "长沙","price" : 1100}},{"_index" : "my_index","_type" : "_doc","_id" : "3","_score" : 0.05390298,"_source" : {"title" : "金都欣欣酒店","content" : "天津","price" : 200}},{"_index" : "my_index","_type" : "_doc","_id" : "2","_score" : 0.046648744,"_source" : {"title" : "金都嘉怡假日酒店","content" : "北京","price" : 337}}]}
}

③ Repeat step ② until all data has been retrieved:

POST /_search/scroll
{
  "scroll": "1m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ=="
}

An empty hits array signals that nothing is left:

{"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==","took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 0.06382885,"hits" : [ ]}
}

④ Once all data has been retrieved, call the clear_scroll API to release the scroll_id.

DELETE /_search/scroll
{"scroll_id": ["DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==","DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ=="]
}

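Note that a scroll keeps a search context open on the cluster and therefore consumes resources, so keep the keep-alive short and clear scrolls you no longer need. Scroll is also unsuitable for real-time queries, because it sees only the data that existed when the scroll started.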

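08. What is deep paging in Elasticsearch?

Deep paging means a request has to skip a large number of hits before reaching the ones it wants. It typically arises when a result set is very large, for example millions of matching documents, and the request asks for a page far from the start. The mechanism resembles LIMIT with an offset in MySQL: to return hit number 10,001, the first 10,000 hits must be fetched and then discarded.

Deep paging can cause performance problems in Elasticsearch. For every request, each shard has to collect and sort its top from + size hits and send their IDs and sort values to the coordinating node, which merges the candidates from all shards and throws away everything before from. CPU and memory use therefore grow with the depth of the page, not with the number of hits returned.

To avoid this, use the Scroll API or search_after. Both continue from the position of the previous page instead of recomputing and discarding everything before it.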
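
The cost of a deep from + size page can be estimated with simple arithmetic. The following sketch only counts candidate entries; the class DeepPagingCost, its methods, and the 5-shard figure are assumptions made for illustration, not values taken from the index used in this article.

```java
// Rough cost model for from + size paging; names and the shard count are assumptions.
final class DeepPagingCost {
    private DeepPagingCost() {}

    /** Hits each shard must collect and sort locally for one request. */
    static long perShard(long from, long size) {
        return Math.addExact(from, size);
    }

    /** Candidate entries the coordinating node merges before discarding the first "from". */
    static long coordinator(int shards, long from, long size) {
        return Math.multiplyExact((long) shards, perShard(from, size));
    }

    public static void main(String[] args) {
        int shards = 5; // assumed shard count, for illustration only
        System.out.println(perShard(0, 10));                // first page, per shard
        System.out.println(coordinator(shards, 0, 10));     // first page, merged
        System.out.println(perShard(10_000, 10));           // deep page, per shard
        System.out.println(coordinator(shards, 10_000, 10)); // deep page, merged
    }
}
```

Under these assumptions the first page needs 10 candidates per shard and 50 at the coordinating node, while a page at offset 10,000 needs 10,010 per shard and 50,050 at the coordinating node, all to return the same 10 hits.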

09. What is the maximum limit for paginated queries in Elasticsearch?

Deep paging occurs when the page is very deep or the requested window is very large. By default from + size may not exceed 10,000; a request beyond that is rejected with an error.

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "from": 0,
  "size": 10001
}

The request fails with: Result window is too large, from + size must be less than or equal to: [10000] but was [10001]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.

In other words, from + size pagination reaches at most the first 10,000 hits by default.
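
A client can check the window before sending a request and fall back to search_after or scroll when the check fails. This is a minimal sketch of that check; ResultWindow and fits are hypothetical names, and the constant mirrors the default of 10,000 quoted in the error message above.

```java
// Client-side guard for the result window; names are hypothetical.
final class ResultWindow {
    /** Default value of index.max_result_window, as quoted in the error message. */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    private ResultWindow() {}

    /** True when from + size stays within the given window. */
    static boolean fits(int from, int size, int maxResultWindow) {
        // widen to long so from + size cannot overflow int
        return (long) from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        System.out.println(fits(0, 10_001, DEFAULT_MAX_RESULT_WINDOW)); // the rejected request above
        System.out.println(fits(9_990, 10, DEFAULT_MAX_RESULT_WINDOW)); // last page inside the window
    }
}
```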

10. How do you lift the pagination limit in Elasticsearch?

The index.max_result_window setting caps from + size for searches against an index; its default is 10,000. Raising it lets from + size reach deeper, but the memory and time a search needs grow with the window, so a larger value can hurt performance.

Option 1: update the setting on an existing index, for example from Kibana:

PUT /my_index/_settings
{
  "index.max_result_window": 200000
}

Option 2: set it when creating the index:

PUT /my_index
{
  "settings": {"index": {"max_result_window": 200000}}
}

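11. What is the maximum total hit count Elasticsearch reports?

In Elasticsearch, hits.total.value in the search response gives the total number of matching documents, but by default the count is tracked accurately only up to 10,000. For small data sets that is enough. Once more than 10,000 documents match, hits.total.value stops growing and stays at 10,000, and hits.total.relation changes from eq to gte to mark the value as a lower bound, so the count is no longer exact.

For example, with 30,000 documents in the index, a query that matches all of them still reports a hits.total.value of 10,000:

{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 10000,"relation" : "gte"},"max_score" : null,"hits" : [ ... ]}
}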

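12. How do you lift the limit on the total hit count in Elasticsearch?

The track_total_hits parameter controls how the total hit count is computed. By default the count is accurate up to 10,000 hits; to get an exact count beyond that, set track_total_hits to true.

track_total_hits accepts three kinds of value:

true: count all hits exactly.
false: do not count total hits.
an integer n: count hits exactly up to n; beyond that, report n as a lower bound.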
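
How an integer threshold shapes hits.total can be modeled in a few lines. The sketch below is a simplified model, not Elasticsearch code: TotalHitsSketch and report are hypothetical names, and the boundary case where the match count equals the threshold exactly is not modeled precisely.

```java
// Simplified model of hits.total under a track_total_hits threshold; names are hypothetical.
final class TotalHitsSketch {
    final long value;
    final String relation;

    private TotalHitsSketch(long value, String relation) {
        this.value = value;
        this.relation = relation;
    }

    /** Reports the threshold as a lower bound once the real match count exceeds it. */
    static TotalHitsSketch report(long actualMatches, long threshold) {
        if (actualMatches > threshold) {
            return new TotalHitsSketch(threshold, "gte"); // lower bound only
        }
        return new TotalHitsSketch(actualMatches, "eq"); // exact count
    }

    @Override
    public String toString() {
        return "{value=" + value + ", relation=" + relation + "}";
    }

    public static void main(String[] args) {
        System.out.println(report(12, 5));           // 12 matches, threshold 5
        System.out.println(report(12, 15));          // 12 matches, threshold 15
        System.out.println(report(30_000, 10_000));  // default threshold, 30,000 matches
    }
}
```

With 12 matching documents, a threshold of 5 yields value 5 with relation gte, and a threshold of 15 yields the exact value 12 with relation eq.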

① Exact total hit count:

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "track_total_hits": true
}

hits.total.value is 12, and hits.hits contains 10 documents (from=0, size=10):

{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 0.06382885,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.06382885,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 0.06382885,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 0.06382885,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 0.06382885,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}},{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 0.06382885,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000,"uploadTime" : 1678073241}},{"_index" : "my_index","_type" : "_doc","_id" : "11","_score" : 0.06382885,"_source" : {"title" : "文轩酒店","content" : "蚌埠","price" : 1020}},{"_index" : "my_index","_type" : "_doc","_id" : "12","_score" : 0.06382885,"_source" : {"title" : "大理酒店","content" : "长沙","price" : 1100}}]}
}

② No total hit count:

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "track_total_hits": false
}

The response omits hits.total entirely, and hits.hits contains 10 documents (from=0, size=10):

{"took" : 8,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"max_score" : 0.06382885,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.06382885,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 0.06382885,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 0.06382885,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 0.06382885,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}},{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 0.06382885,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000,"uploadTime" : 1678073241}},{"_index" : "my_index","_type" : "_doc","_id" : "11","_score" : 0.06382885,"_source" : {"title" : "文轩酒店","content" : "蚌埠","price" : 1020}},{"_index" : "my_index","_type" : "_doc","_id" : "12","_score" : 0.06382885,"_source" : {"title" : "大理酒店","content" : "长沙","price" : 1100}}]}
}

③ Count hits accurately only up to 5:

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "track_total_hits": 5
}

hits.total.value is 5 with relation gte, meaning at least 5 documents match; hits.hits contains 10 documents (from=0, size=10):

{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 5,"relation" : "gte"},"max_score" : 0.06382885,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.06382885,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 0.06382885,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 0.06382885,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 0.06382885,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}},{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 0.06382885,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000,"uploadTime" : 1678073241}},{"_index" : "my_index","_type" : "_doc","_id" : "11","_score" : 0.06382885,"_source" : {"title" : "文轩酒店","content" : "蚌埠","price" : 1020}},{"_index" : "my_index","_type" : "_doc","_id" : "12","_score" : 0.06382885,"_source" : {"title" : "大理酒店","content" : "长沙","price" : 1100}}]}
}

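④ Count hits accurately up to 15:

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "track_total_hits": 15
}

Only 12 documents match, which is below the threshold, so hits.total.value is the exact count 12 with relation eq; hits.hits contains 10 documents (from=0, size=10):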

{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 12,"relation" : "eq"},"max_score" : 0.06382885,"hits" : [{"_index" : "my_index","_type" : "_doc","_id" : "1","_score" : 0.06382885,"_source" : {"title" : "文雅酒店","content" : "青岛","price" : 556}},{"_index" : "my_index","_type" : "_doc","_id" : "4","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "上海","price" : 300}},{"_index" : "my_index","_type" : "_doc","_id" : "5","_score" : 0.06382885,"_source" : {"title" : "自如酒店","content" : "南京","price" : 400}},{"_index" : "my_index","_type" : "_doc","_id" : "6","_score" : 0.06382885,"_source" : {"title" : "如家酒店","content" : "杭州","price" : 500}},{"_index" : "my_index","_type" : "_doc","_id" : "7","_score" : 0.06382885,"_source" : {"title" : "非常酒店","content" : "合肥","price" : 600}},{"_index" : "my_index","_type" : "_doc","_id" : "9","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮南","price" : 900}},{"_index" : "my_index","_type" : "_doc","_id" : "8","_score" : 0.06382885,"_source" : {"title" : "金都酒店","content" : "淮北","price" : 700}},{"_index" : "my_index","_type" : "_doc","_id" : "10","_score" : 0.06382885,"_source" : {"title" : "丽舍酒店","content" : "阜阳","price" : 1000,"uploadTime" : 1678073241}},{"_index" : "my_index","_type" : "_doc","_id" : "11","_score" : 0.06382885,"_source" : {"title" : "文轩酒店","content" : "蚌埠","price" : 1020}},{"_index" : "my_index","_type" : "_doc","_id" : "12","_score" : 0.06382885,"_source" : {"title" : "大理酒店","content" : "长沙","price" : 1100}}]}
}

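13. How can paginated queries be optimized in Elasticsearch?

Return only the fields you need, for example with _source filtering.
Return only the documents you need: narrow the query and keep size small.
Use scroll or search_after instead of deep from + size pages.
Tune the index layout, such as the number of shards and replicas, to improve query performance.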

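14. Spring Boot + ES: from + size pagination

The REST request:

GET /my_index/_search
{
  "query": {"match": {"title": "酒店"}},
  "from": 0,
  "size": 3
}

The equivalent with RestHighLevelClient:

import java.io.IOException;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void searchUser() throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Match query on the title field
        MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("title", "酒店");
        searchSourceBuilder.query(matchQueryBuilder);

        // Pagination: convert the 1-based page number into a from offset
        int page = 1;      // page 1
        int pageSize = 3;  // 3 hits per page
        searchSourceBuilder.from((page - 1) * pageSize);
        searchSourceBuilder.size(pageSize);

        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Search results
        SearchHits searchHits = searchResponse.getHits();
        SearchHit[] hits = searchHits.getHits();
        for (SearchHit hit : hits) {
            // hits.hits._source: the original JSON of the matching document
            String sourceAsString = hit.getSourceAsString();
        }
        System.out.println(searchResponse);
    }
}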

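15. Spring Boot + ES: search_after pagination

The REST requests for the first page and the following page:

POST /my_index/_search
{
  "size": 3,
  "query": {"match": {"title": "酒店"}},
  "sort": [{"price": "asc"}],
  "track_total_hits": true
}

POST /my_index/_search
{
  "size": 3,
  "query": {"match": {"title": "酒店"}},
  "sort": [{"price": "asc"}],
  "search_after": [337]
}

The equivalent with RestHighLevelClient, looping until a page comes back empty:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void searchUser() throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Match query on the title field
        MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("title", "酒店");
        searchSourceBuilder.query(matchQueryBuilder);
        // Exact total hit count: track_total_hits
        searchSourceBuilder.trackTotalHits(true);
        // 3 hits per page
        searchSourceBuilder.size(3);
        // Sort field; search_after needs a sort whose values are unique per document
        searchSourceBuilder.sort(SortBuilders.fieldSort("price").order(SortOrder.ASC));

        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        List<Map<String, Object>> result = new ArrayList<>();
        while (searchResponse.getHits().getHits() != null && searchResponse.getHits().getHits().length > 0) {
            SearchHit[] hits = searchResponse.getHits().getHits();
            for (SearchHit hit : hits) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                result.add(sourceAsMap);
            }
            // Sort values of the last hit: the next request starts after them
            Object[] lastNum = hits[hits.length - 1].getSortValues();
            searchSourceBuilder.searchAfter(lastNum);
            searchRequest.source(searchSourceBuilder);
            // Fetch the next page
            searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        }
        System.out.println(result);
    }
}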

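16. Spring Boot + ES: scroll pagination

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.ClearScrollRequest;
import org.elasticsearch.action.search.ClearScrollResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
// Package as in earlier 7.x clients; newer 7.x releases ship it as org.elasticsearch.core.TimeValue
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void search() throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        // Match query on the title field
        MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("title", "酒店");
        searchSourceBuilder.query(matchQueryBuilder);
        // Exact total hit count: track_total_hits
        searchSourceBuilder.trackTotalHits(true);
        // 7 hits per batch
        searchSourceBuilder.size(7);
        // Sort field
        searchSourceBuilder.sort(SortBuilders.fieldSort("price").order(SortOrder.ASC));

        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        // Keep-alive time of the scroll cursor
        searchRequest.scroll(TimeValue.timeValueMinutes(1L));
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // Get the scrollId and collect the first batch
        String scrollId = searchResponse.getScrollId();
        SearchHit[] searchHits = searchResponse.getHits().getHits();
        List<Map<String, Object>> result = new ArrayList<>();
        for (SearchHit hit : searchHits) {
            result.add(hit.getSourceAsMap());
        }

        while (true) {
            // Fetch the next batch with the scrollId
            SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
            // Refresh the keep-alive time of the cursor
            scrollRequest.scroll(TimeValue.timeValueMinutes(1L));
            SearchResponse scrollResp = restHighLevelClient.scroll(scrollRequest, RequestOptions.DEFAULT);
            // The scroll ID can change between requests: always use the latest one
            scrollId = scrollResp.getScrollId();
            SearchHit[] hits = scrollResp.getHits().getHits();
            if (hits != null && hits.length > 0) {
                for (SearchHit hit : hits) {
                    result.add(hit.getSourceAsMap());
                }
            } else {
                break;
            }
        }
        System.out.println(result);

        // Scrolling is finished: clear the scroll context to release server resources
        ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
        clearScrollRequest.addScrollId(scrollId);
        ClearScrollResponse clearScrollResponse = restHighLevelClient.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
        boolean succeeded = clearScrollResponse.isSucceeded();
        System.out.println(succeeded);
        // The client is a shared Spring-managed bean, so it is not closed here
    }
}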
