尝鲜

GET /forum/article/_search

{

"query": {

"match_phrase": {

"title": {

"query": "java spark",

"slop":  1

}

}

}

}

结果:

{

"took": 1,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 0,

"max_score": null,

"hits": []

}

}

slop(移动)的含义是什么?

query string,搜索文本,中的几个term,要经过几次移动才能与一个document匹配,这个移动的次数,就是slop

slop实际移动举例

实际举例,一个query string经过几次移动之后可以匹配到一个document,然后设置slop

hello world, java is very good, spark is also very good.

java spark,match phrase,搜不到

如果我们指定了slop,那么就允许java spark进行移动,来尝试与doc进行匹配

java           is               very           good         spark         is

java     spark

java        -->        spark                      移动一位

java            -->                        spark             移动两位

java             -->                          spark   移动三位

这里的slop,就是3,因为java spark这个短语,spark移动了3次,就可以跟一个doc匹配上了

slop的含义,不仅仅是说一个query string terms移动几次,跟一个doc匹配上。而是说,一个query string terms,最多可以移动几次去尝试跟一个doc匹配上

slop,设置的是3,那么就ok

GET /forum/article/_search

{

"query": {

"match_phrase": {

"title": {

"query": "spark data",

"slop":  3

}

}

}

}

就可以把刚才那个doc匹配上,那个doc会作为结果返回

但是如果slop设置的是2,那么java spark,spark最多只能移动2次,此时跟doc是匹配不上的,那个doc是不会作为结果返回的

做实验,验证slop的含义

实验一

GET /forum/article/_search

{

"query": {

"match_phrase": {

"content": {

"query": "spark data",

"slop": 3

}

}

}

}

结果:

{

"took": 1,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 0,

"max_score": null,

"hits": []

}

}

实验二

GET /forum/article/_search

{

"query": {

"match_phrase": {

"content": {

"query": "spark data",

"slop": 2

}

}

}

}

结果

{

"took": 1,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 0,

"max_score": null,

"hits": []

}

}

实验三

GET /forum/article/_search

{

"query": {

"match_phrase": {

"content": {

"query": "spark data",

"slop": 3

}

}

}

}

结果:

{

"took": 1,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 1,

"max_score": 0.21824157,

"hits": [

{

"_index": "forum",

"_type": "article",

"_id": "5",

"_score": 0.21824157,

"_source": {

"articleID": "DHJK-B-1395-#Ky5",

"userID": 3,

"hidden": false,

"postDate": "2017-03-01",

"tag": [

"elasticsearch"

],

"tag_cnt": 1,

"view_cnt": 10,

"title": "this is spark blog",

"content": "spark is best big data solution based on scala ,an programming language similar to java spark",

"sub_title": "haha, hello world",

"author_first_name": "Tonny",

"author_last_name": "Peter Smith"

}

}

]

}

}

Spark  is  best  big  data  solution based on scala ,an programming language similar to java spark

spark data

--> data  移动一位

-->  data  移动两位

spark               -->   data  移动三位

实验四增强

GET /forum/article/_search

{

"query": {

"match_phrase": {

"content": {

"query": "data spark",

"slop": 5

}

}

}

}

结果:

{

"took": 1,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 1,

"max_score": 0.154366,

"hits": [

{

"_index": "forum",

"_type": "article",

"_id": "5",

"_score": 0.154366,

"_source": {

"articleID": "DHJK-B-1395-#Ky5",

"userID": 3,

"hidden": false,

"postDate": "2017-03-01",

"tag": [

"elasticsearch"

],

"tag_cnt": 1,

"view_cnt": 10,

"title": "this is spark blog",

"content": "spark is best big data solution based on scala ,an programming language similar to java spark",

"sub_title": "haha, hello world",

"author_first_name": "Tonny",

"author_last_name": "Peter Smith"

}

}

]

}

}

spark             is                          best        big                data

data          spark

-->               data/spark   移动一位

spark          àdata     移动两位

spark             -->                      data     移动三位

spark                                         -->               data    移动四位

spark                                                              -->               data    移动五位

slop搜索下,关键词离的越近,relevance score就会越高,做实验说明。。。

{

"took": 4,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 3,

"max_score": 1.3728157,

"hits": [

{

"_index": "forum",

"_type": "article",

"_id": "2",

"_score": 1.3728157,

"_source": {

"articleID": "KDKE-B-9947-#kL5",

"userID": 1,

"hidden": false,

"postDate": "2017-01-02",

"tag": [

"java"

],

"tag_cnt": 1,

"view_cnt": 50,

"title": "this is java blog",

"content": "i think java is the best programming language",

"sub_title": "learned a lot of course",

"author_first_name": "Smith",

"author_last_name": "Williams",

"new_author_last_name": "Williams",

"new_author_first_name": "Smith"

}

},

{

"_index": "forum",

"_type": "article",

"_id": "5",

"_score": 0.5753642,

"_source": {

"articleID": "DHJK-B-1395-#Ky5",

"userID": 3,

"hidden": false,

"postDate": "2017-03-01",

"tag": [

"elasticsearch"

],

"tag_cnt": 1,

"view_cnt": 10,

"title": "this is spark blog",

"content": "spark is best big data solution based on scala ,an programming language similar to java spark",

"sub_title": "haha, hello world",

"author_first_name": "Tonny",

"author_last_name": "Peter Smith",

"new_author_last_name": "Peter Smith",

"new_author_first_name": "Tonny"

}

},

{

"_index": "forum",

"_type": "article",

"_id": "1",

"_score": 0.28582606,

"_source": {

"articleID": "XHDK-A-1293-#fJ3",

"userID": 1,

"hidden": false,

"postDate": "2017-01-01",

"tag": [

"java",

"hadoop"

],

"tag_cnt": 2,

"view_cnt": 30,

"title": "this is java and elasticsearch blog",

"content": "i like to write best elasticsearch article",

"sub_title": "learning more courses",

"author_first_name": "Peter",

"author_last_name": "Smith",

"new_author_last_name": "Smith",

"new_author_first_name": "Peter"

}

}

]

}

}

实验

GET /forum/article/_search

{

"query": {

"match_phrase": {

"content": {

"query": "java best",

"slop": 15

}

}

}

}

结果

{

"took": 3,

"timed_out": false,

"_shards": {

"total": 5,

"successful": 5,

"failed": 0

},

"hits": {

"total": 2,

"max_score": 0.65380025,

"hits": [

{

"_index": "forum",

"_type": "article",

"_id": "2",

"_score": 0.65380025,

"_source": {

"articleID": "KDKE-B-9947-#kL5",

"userID": 1,

"hidden": false,

"postDate": "2017-01-02",

"tag": [

"java"

],

"tag_cnt": 1,

"view_cnt": 50,

"title": "this is java blog",

"content": "i think java is the best programming language",

"sub_title": "learned a lot of course",

"author_first_name": "Smith",

"author_last_name": "Williams",

"new_author_last_name": "Williams",

"new_author_first_name": "Smith"

}

},

{

"_index": "forum",

"_type": "article",

"_id": "5",

"_score": 0.07111243,

"_source": {

"articleID": "DHJK-B-1395-#Ky5",

"userID": 3,

"hidden": false,

"postDate": "2017-03-01",

"tag": [

"elasticsearch"

],

"tag_cnt": 1,

"view_cnt": 10,

"title": "this is spark blog",

"content": "spark is best big data solution based on scala ,an programming language similar to java spark",

"sub_title": "haha, hello world",

"author_first_name": "Tonny",

"author_last_name": "Peter Smith",

"new_author_last_name": "Peter Smith",

"new_author_first_name": "Tonny"

}

}

]

}

}

其实,加了slop的phrase match,就是proximity match,近似匹配

1、java spark,短语,doc,phrase match

2、java spark,可以有一定的距离,但是靠的越近,越先搜索出来,proximity match

移动搜索的短语,以达到文档的内容

进阶-第18__深度探秘搜索技术_基于slop参数实现近似匹配以及原理剖析和相关实验相关推荐

  1. 白话Elasticsearch18-深度探秘搜索技术之基于slop参数实现近似匹配以及原理剖析

    文章目录 概述 官网 slop 含义 例子 示例一 示例二 示例三 概述 继续跟中华石杉老师学习ES,第18篇 课程地址: https://www.roncoo.com/view/55 接上篇博客 白 ...

  2. 22_深度探秘搜索技术_手动控制全文检索(match)结果的精准度、基于boost的细粒度搜索条件实现权重控制...

    本文章收录于[Elasticsearch 系列],将详细的讲解 Elasticsearch 整个大体系,包括但不限于ELK讲解.ES调优.海量数据处理等 本博客以例子为主线,来说明在elasticse ...

  3. 白话Elasticsearch11-深度探秘搜索技术之基于tie_breaker参数优化dis_max搜索效果

    文章目录 概述 官方文档 例子 tie_breaker 概述 继续跟中华石杉老师学习ES,第十一篇 课程地址: https://www.roncoo.com/view/55 官方文档 https:// ...

  4. 白话Elasticsearch20-深度探秘搜索技术之使用rescoring机制优化近似匹配搜索的性能

    文章目录 概述 官网 match和phrase match(proximity match)区别 优化proximity match的性能 概述 继续跟中华石杉老师学习ES,第19篇 课程地址: ht ...

  5. 23_深度探秘搜索技术_best fields策略的dis_max、tie_breaker参数以及multi_match语法

    目录 一.引入dis_max 实现best fields 的必要性 1.使用bulk批量添加测试数据 2.搜索title或content中包含java或solution的帖子 3.结果分析 二.bes ...

  6. Elasticsearch深度探秘搜索技术如何手动控制全文检索结果的精准度

    为帖子数据增加标题字段 #插入数据 POST /post/_doc/_bulk { "update": { "_id": "1"} } { ...

  7. Elasticsearch深度探秘搜索技术基于multi_match语法实现dis_max+tie_breaker

    直接上代码 GET /post/_search {"query": {"multi_match": {"query": "java ...

  8. 白话Elasticsearch14-深度探秘搜索技术之基于multi_match 使用most_fields策略进行cross-fields search弊端

    文章目录 概述 官网 示例 概述 继续跟中华石杉老师学习ES,第十四篇 课程地址: https://www.roncoo.com/view/55 官网 https://www.elastic.co/g ...

  9. 白话Elasticsearch13-深度探秘搜索技术之基于multi_match+most fields策略进行multi-field搜索

    文章目录 概述 官网 示例 构造模拟数据 普通查询 使用 multi_match + most fileds查询 best fields VS most fields 概述 继续跟中华石杉老师学习ES ...

  10. 白话Elasticsearch12-深度探秘搜索技术之基于multi_match + best fields语法实现dis_max+tie_breaker

    文章目录 概述 官网 示例 概述 继续跟中华石杉老师学习ES,第十二篇 课程地址: https://www.roncoo.com/view/55 官网 https://www.elastic.co/g ...

最新文章

  1. Spark on k8s: 通过hostPath设置SPARK_LOCAL_DIRS加速Shuffle
  2. 终端界面如何改成彩色的
  3. LeetCode 1679. K 和数对的最大数目(哈希)
  4. 使用std::function 把类成员函数指针转换为普通函数指针
  5. java php rsa加密解密算法_PHP rsa加密解密算法原理解析
  6. selenium处理动态加载数据
  7. CentOS 6系统FreeSwitch和RTMP服务 安装及演示(四)
  8. 2021-2027中国游戏开发工具市场现状及未来发展趋势
  9. python123九宫格输入法_python制作朋友圈九宫格图片
  10. php 抓取百度快照时间,php获取网站百度快照日期的方法
  11. 2008年IT行业10大热门职业调查结果出炉
  12. 主流搜索引擎分析[特点、功能、市场份额、应用领域]
  13. scilab中文简介
  14. BUUCTF·[MRCTF2020]天干地支+甲子·WP
  15. .NET调用百度天气api经验
  16. Android打印小票速度太慢,解决打印PDF打印机打印速度慢的问题(适用所有打印机)...
  17. JavaScript-设计模式(四) 原型模式
  18. 全金属狂怒云上计算机密码,【攻略向】游戏中所有装备解锁地点
  19. java mail 邮箱发送_Java Mail 发送邮件
  20. Basler 工业相机 Python开发采集数据、保存照片

热门文章

  1. NS2:添加一个新的流量发生器(poisson分布)
  2. Rust FFI 编程 - nix crate
  3. 如何屏蔽移动垃圾短信10658464
  4. Docker使用教程超详细
  5. 仓库盘点好方法,使用安卓盘点机PDA扫描商品条码进行超市盘点
  6. C++ limits头文件的用法(numeric_limits)
  7. 输入国家名按字典顺序进行排序
  8. 自制APP连接远程服务器
  9. mysql按升序创建索引_MySQL 降序索引 (Descending Indexes)
  10. 40行代码的Python爬虫案例:虎牙-王者荣耀主播的人气排行