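Contents

Term-level queries in Elasticsearch
- Term-level queries in query or filter context
  - exists: existence query
  - fuzzy: fuzzy query
  - ids: lookup by document IDs
  - prefix: prefix query
  - range: range query
  - regexp: regular expression query
  - term: exact match
  - terms: exact match on multiple values
    - terms lookup
  - terms_set: exact match on a minimum number of values
  - wildcard: wildcard match
- Common parameters
  - rewrite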

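Term-level queries in Elasticsearch

Term-level queries in query or filter context

exists query: returns documents that contain an indexed value for a field, that is, the field exists.
fuzzy query: returns documents that contain terms similar to the search term. Elasticsearch measures similarity with the Levenshtein edit distance (the minimum number of single-character edits); for example, cot and can are each one edit away from cat.
ids query: returns documents based on their document IDs.
prefix query: returns documents that contain a specific prefix in a provided field.
range query: returns documents that contain terms within a provided range.
regexp query: returns documents that contain terms matching a regular expression.
term query: returns documents that contain an exact term in a provided field.
terms query: returns documents that contain one or more exact terms in a provided field.
terms_set query: returns documents that contain a minimum number of exact terms in a provided field; the minimum is defined by a field or a script.
type query: returns documents of the specified type (deprecated in 7.x).
wildcard query: returns documents that contain terms matching a wildcard pattern.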

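exists: existence query

exists parameters:

field: name of the field to check. The field counts as existing only if it has an indexed value; a JSON value of null or [] does not count. The following do count as existing: 1. empty strings such as "", 2. arrays that contain null together with another value, such as [null, "foo"], 3. a custom null_value defined in the field mapping.

Field exists:

    "exists": { "field": "user" }

Field does not exist:

    "bool": { "must_not": { "exists": { "field": "user" } } }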
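The rules for when a field counts as existing can be sketched in a few lines of Python. This is only a model of the behaviour described above, not how Elasticsearch implements it; the function name has_indexed_value and its null_value argument are made up for this illustration.

```python
def has_indexed_value(value, null_value=None):
    """Approximate whether an exists query would match a field's JSON value.

    Hypothetical helper for illustration; null_value stands in for the
    null_value setting of the field mapping.
    """
    values = value if isinstance(value, list) else [value]
    # An explicit null is replaced by the mapping's null_value when one is configured.
    values = [null_value if v is None else v for v in values]
    # Whatever is still None is never indexed; anything else (even "") is.
    return any(v is not None for v in values)


print(has_indexed_value(None))            # null
print(has_indexed_value([]))              # empty array
print(has_indexed_value(""))              # empty string
print(has_indexed_value([None, "foo"]))   # null plus another value
print(has_indexed_value(None, "NULL"))    # null with a null_value configured
```

The first two calls model a missing field, the last three a field that exists.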

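fuzzy: fuzzy query

Example:

    "query": {
      "fuzzy": {
        "user": {
          # With an edit distance of 1 this matches user values such as kk, kii, k and ik.
          # Under the default fuzziness (AUTO), a two-character value must match exactly.
          "value": "ki"
        }
      }
    }

An edit distance is the number of one-character changes needed to turn one term into another. These changes can be:

box -> fox: changing a character
black -> lack: removing a character
sic -> sick: inserting a character
act -> cat: transposing two adjacent characters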
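To make the four kinds of edit concrete, here is a small Python sketch that computes an edit distance in which an adjacent swap counts as one edit (the optimal string alignment variant). It is a textbook dynamic-programming version written for this article; Elasticsearch does not necessarily compute the distance this way, and the name edit_distance is only illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Edit distance allowing insert, delete, substitute and adjacent transpose."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove a character
                d[i][j - 1] + 1,         # insert a character
                d[i - 1][j - 1] + cost,  # change (or keep) a character
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                # swap two adjacent characters
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for left, right in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(left, right, edit_distance(left, right))
```

Each of the four pairs from the list above comes out at distance 1.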

Top-level parameter for fuzzy:

field: the field to search. Required.

Parameters for field:

value: the term to search for in field. Required, string.
fuzziness: maximum edit distance allowed for a match. Valid values are 0, 1, 2 and AUTO. Optional, string.

AUTO:[low],[high]: derives the edit distance from the length of the term, so terms of different lengths get different allowances.
AUTO defaults to AUTO:3,6.
With AUTO:3,6:
0-2 characters (for example ab): must match exactly
3-5 characters (for example abc): at most 1 edit
6 or more characters (for example abcdef): at most 2 edits

max_expansions: maximum number of variations created. Default 50. Optional, integer. Candidate terms are enumerated in alphabetical order, and expansion stops when no candidates remain or the count reaches max_expansions. It works together with prefix_length: when prefix_length is 0, raising this value is not recommended because of the performance cost. Note that in some scenarios the limit applies per shard, so 50 is only enforced within a single shard.
prefix_length: number of leading characters left unchanged when creating variations; only the characters to their right are varied. Default 0. Optional, integer.
transpositions: whether swapping two adjacent characters (ab -> ba) counts as a single edit. Default true. Optional, boolean.
rewrite: method used to rewrite the query into a cheaper equivalent. Optional, string. The valid values are the same for every query that accepts rewrite and are listed in the rewrite section at the end of this article.

Example:

    GET /_search
    {
      "query": {
        "fuzzy": {
          "user": {
            "value": "ki",
            "fuzziness": "AUTO",
            # generate at most 50 variations
            "max_expansions": 50,
            # if set to 1, the leading k must match as-is
            "prefix_length": 0,
            "transpositions": true,
            "rewrite": "constant_score"
          }
        }
      }
    }
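The length thresholds behind AUTO:[low],[high] are easy to get wrong, so here is a minimal Python sketch of the mapping from term length to allowed edit distance. The function name auto_fuzziness is invented for this illustration; only the thresholds come from the parameter description above.

```python
def auto_fuzziness(term: str, low: int = 3, high: int = 6) -> int:
    """Allowed edit distance for a term under AUTO:low,high (defaults AUTO:3,6)."""
    length = len(term)
    if length < low:
        return 0  # short terms must match exactly
    if length < high:
        return 1  # medium terms tolerate one edit
    return 2      # long terms tolerate two edits


for term in ["ab", "abc", "abcde", "abcdef"]:
    print(term, auto_fuzziness(term))
```

With the defaults, the value ki from the example above gets an allowance of 0, so only an exact match is returned unless fuzziness is set explicitly.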

ids: lookup by document IDs

The ids query matches on the stored _id field. Its only parameter is values, an array of document IDs.

Example:

    GET /_search
    {
      "query": {
        "ids": {
          "values": ["1", "4", "100"]
        }
      }
    }

prefix: prefix query

Matches terms that begin with the given prefix.

    GET /_search
    {
      "query": {
        "prefix": {
          "user": {
            "value": "ki",
            "rewrite": "constant_score"
          }
        }
      }
    }

Top-level parameter for prefix:

field: the field to search.

Parameters for field:

value: the prefix text.
rewrite: query rewrite method; takes the same values as for fuzzy.

A shorthand form is also supported:

    GET /_search
    {
      "query": {
        "prefix": { "user": "ki" }
      }
    }

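range: range query

The range query matches documents whose value for the searched field falls within a given range. The underlying Lucene query depends on the field type: TermRangeQuery for string fields, and a numeric range query for number and date fields (NumericRangeQuery in older Lucene releases, point-based range queries in current ones).

Example:

    # age between 10 and 20
    GET _search
    {
      "query": {
        "range": {
          "age": {
            "gte": 10,
            "lte": 20,
            "boost": 2.0
          }
        }
      }
    }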

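Top-level parameter for range: field

Parameters for field:

gt: greater than (>)
gte: greater than or equal to (>=)
lt: less than (<)
lte: less than or equal to (<=)
format: date format used to parse date values in the query. Defaults to the format configured in the field mapping; a value given here overrides it.

Pattern letters: y year, M month, w week, d day, H hour, m minute, s second.
Built-in formats include:
epoch_millis: milliseconds since the epoch, limited to the range Long.MIN_VALUE to Long.MAX_VALUE
epoch_second: seconds since the epoch, limited to Long.MIN_VALUE and Long.MAX_VALUE divided by 1000 (1 second = 1000 milliseconds)
date / strict_date: yyyy-MM-dd
Full list: https://www.elastic.co/guide/en/elasticsearch/reference/7.1/mapping-date-format.html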

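relation: for range-type fields, controls how the range stored in the document is compared with the range given in the query.

Valid values:
INTERSECTS (default): the document's range overlaps the query range.
CONTAINS: the document's range entirely contains the query range.
WITHIN: the document's range lies entirely within the query range.

For example, suppose a document stores {"gte": 13, "lte": 15} in a range field named influence. The following clause matches it, because [13, 15] lies within [12, 17]:

    "range": {
      "influence": {
        "gte": 12,
        "lte": 17,
        "relation": "within"
      }
    }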

time_zone: converts date values in the query to UTC. Accepts an ISO 8601 UTC offset or an IANA time zone ID.

Valid values:
ISO 8601 UTC offsets: +01:00 or -08:00 (one hour ahead of UTC, or eight hours behind)
IANA time zone IDs: America/Los_Angeles
Note: the time_zone parameter does not affect the date math value of now; now is always the current system time in UTC.

boost: raises or lowers the relevance score of the range query. Default 1.0.
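The three relation values are easiest to tell apart on concrete intervals. The Python sketch below models them for closed numeric ranges only; range_matches is a made-up helper for this article, and real range fields also support open bounds and dates, which are left out here.

```python
def range_matches(doc, query, relation="intersects"):
    """Compare a document's closed range with a query's closed range.

    doc and query are (low, high) tuples; relation is one of
    "intersects", "contains" or "within".
    """
    doc_lo, doc_hi = doc
    q_lo, q_hi = query
    if relation == "intersects":
        return doc_lo <= q_hi and q_lo <= doc_hi  # any overlap at all
    if relation == "contains":
        return doc_lo <= q_lo and q_hi <= doc_hi  # document covers the query
    if relation == "within":
        return q_lo <= doc_lo and doc_hi <= q_hi  # query covers the document
    raise ValueError(f"unknown relation: {relation}")


doc_range = (13, 15)    # the range stored in the document
query_range = (12, 17)  # the range given in the query
for relation in ("intersects", "contains", "within"):
    print(relation, range_matches(doc_range, query_range, relation))
```

For the stored range [13, 15] and the query range [12, 17], intersects and within hold while contains does not.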

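Date math:

Date math supports three operations: addition, subtraction and rounding. An expression starts from an anchor, which is either now or a date string followed by ||.

+1h: add one hour
-1d: subtract one day
/d: round down to the nearest day

Supported units: y (years), M (months), w (weeks), d (days), h (hours), H (hours), m (minutes), s (seconds)

With now as the anchor: now-1h/d is the current time minus one hour, rounded down to the day.

With a date string as the anchor, the string is followed by ||: 2014-11-18||/M

Examples, assuming now is 2001-01-01 12:00:00:

now-1h/d: 2001-01-01 00:00:00
2001.02.01||+1M/d: 2001-03-01 00:00:00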
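The two results above can be reproduced with a small evaluator. The Python sketch below handles only a subset of date math: it always rounds down, does not round to weeks, and assumes the anchor string uses the yyyy.MM.dd form from the example. The names eval_date_math, _add_months and _round_down are invented for this illustration and are not part of any Elasticsearch client.

```python
import calendar
import re
from datetime import datetime, timedelta

# Units with a fixed length map directly onto timedelta keyword arguments.
_FIXED = {"w": "weeks", "d": "days", "h": "hours", "H": "hours",
          "m": "minutes", "s": "seconds"}


def _add_months(dt, months):
    """Shift dt by whole months, clamping the day to the target month's length."""
    year, month0 = divmod(dt.year * 12 + dt.month - 1 + months, 12)
    day = min(dt.day, calendar.monthrange(year, month0 + 1)[1])
    return dt.replace(year=year, month=month0 + 1, day=day)


def _round_down(dt, unit):
    """Truncate dt to the start of the given unit."""
    if unit == "y":
        return dt.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "M":
        return dt.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "d":
        return dt.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit in ("h", "H"):
        return dt.replace(minute=0, second=0, microsecond=0)
    if unit == "m":
        return dt.replace(second=0, microsecond=0)
    return dt.replace(microsecond=0)


def eval_date_math(expr, now):
    """Evaluate a date math expression such as now-1h/d, left to right."""
    if expr.startswith("now"):
        dt, ops = now, expr[3:]
    else:
        anchor, ops = expr.split("||", 1)
        dt = datetime.strptime(anchor, "%Y.%m.%d")  # assumed anchor format
    for sign, amount, unit, round_unit in re.findall(
            r"([+-])(\d+)([yMwdhHms])|/([yMdhHms])", ops):
        if round_unit:
            dt = _round_down(dt, round_unit)
            continue
        n = int(amount) if sign == "+" else -int(amount)
        if unit == "y":
            dt = _add_months(dt, 12 * n)
        elif unit == "M":
            dt = _add_months(dt, n)
        else:
            dt += timedelta(**{_FIXED[unit]: n})
    return dt


now = datetime(2001, 1, 1, 12, 0, 0)
print(eval_date_math("now-1h/d", now))
print(eval_date_math("2001.02.01||+1M/d", now))
```

Operations are applied in the order they appear, which is why now-1h/d first subtracts the hour and only then rounds down to midnight.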

Date math and rounding: how 2014-11-18||/M is rounded depends on the parameter it is used with

gt: rounds up to 2014-11-30T23:59:59.999, which excludes the whole month
gte: rounds down to 2014-11-01, which includes the whole month
lt: rounds down to 2014-11-01, which excludes the whole month
lte: rounds up to 2014-11-30T23:59:59.999, which includes the whole month

Examples:

    GET _search
    {
      "query": {
        "range": {
          "timestamp": {
            # now minus one day, rounded down to the day
            "gte": "now-1d/d",
            # now rounded down to the day
            "lt": "now/d"
          }
        }
      }
    }

    GET _search
    {
      "query": {
        "range": {
          "timestamp": {
            "time_zone": "+01:00",
            # converted to 2014-12-31T23:00:00 UTC because of time_zone
            "gte": "2015-01-01 00:00:00",
            # now is not affected by time_zone
            "lte": "now"
          }
        }
      }
    }

regexp: regular expression query

Example:

    GET /_search
    {
      "query": {
        "regexp": {
          "user": {
            # terms that start with k and end with y
            "value": "k.*y",
            "flags": "ALL",
            "max_determinized_states": 10000,
            "rewrite": "constant_score"
          }
        }
      }
    }

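Top-level parameter for regexp: field

Parameters for field:

value: the regular expression to match. Its length is limited to 1000 characters by default, controlled by index.max_regex_length. Avoid wildcard patterns such as .* or .*?+ without a prefix or suffix.
flags: enables optional operators of the regular expression syntax.

Valid values:
ALL (default): enables all optional operators.
COMPLEMENT: enables the ~ operator, which negates the shortest following pattern.
    a~bc   # matches 'adc' and 'aec' but not 'abc'
INTERVAL: enables the <> operator for numeric ranges.
    foo<1-100>      # matches 'foo1', 'foo2' ... 'foo99', 'foo100'
    foo<01-100>     # matches 'foo01', 'foo02' ... 'foo99', 'foo100'
INTERSECTION: enables the & operator; a term must match both the left and the right pattern.
    aaa.+&.+bbb  # matches 'aaabbb'
ANYSTRING: enables the @ operator, which matches any entire string.
    @&~(abc.+)  # matches everything except terms beginning with 'abc'

max_determinized_states: maximum number of automaton states required for the query. Default 10000.
rewrite: query rewrite method.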

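term: exact match

Matches an exact value, such as a price, a product ID or a username.

Note: avoid using the term query on text fields. Text fields are analyzed at index time, so the indexed terms often differ from the original string.

Example:

    GET /_search
    {
      "query": {
        "term": {
          "user": {
            "value": "Kimchy",
            "boost": 1.0
          }
        }
      }
    }

Top-level parameter for term: field

Parameters for field:

value: the term to match. The match is exact, including whitespace.
boost: weight applied to the relevance score. Default 1.0.

A shorthand form is also supported:

    GET my_index/_search?pretty
    {
      "query": {
        "term": {
          # shorthand form, without boost
          "full_text": "Quick Brown Foxes!"
        }
      }
    }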

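terms: exact match on multiple values

Works like term, except that the field takes an array of exact values; boost is given next to the field.

Example:

    GET /_search
    {
      "query": {
        "terms": {
          "user": ["kimchy", "elasticsearch"],
          "boost": 1.0
        }
      }
    }

Note: highlighting of terms queries is not guaranteed. Whether highlights are returned depends on the highlighter type and on the number of terms in the query.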

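terms lookup

Terms lookup fetches the values of a field from an existing document and then uses those values as the search terms for the queried field. Because the values are read from the stored document, the _source of that document must contain the field.

terms lookup parameters:

index: the index to fetch the document from
id: the _id of the document to fetch
path: the name of the field to fetch values from. If the field holds an array of inner objects, use dot notation to reach the values inside them.
routing: custom routing value of the document to fetch. It works together with the _routing metadata field and is required if the document was indexed with a custom routing value; otherwise routing defaults to the document's _id.

    GET my_index/_search?pretty
    {
      "query": {
        "terms": {
          "color": {
            # index that holds the source document
            "index": "my_index",
            # _id of the source document
            "id": "2",
            # fetch the values of its color field
            "path": "color"
          }
        }
      }
    }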

terms_set: exact match on a minimum number of values

The terms_set query is similar to the terms query, except that it also defines how many of the given terms a document must contain. That minimum is taken from a field of the document or computed by a script.

Example:

    PUT /job-candidates/_doc/1?refresh
    {
      "name": "Jane Smith",
      "programming_languages": ["c++", "java"],
      "required_matches": 2
    }

    GET /job-candidates/_search
    {
      "query": {
        "terms_set": {
          "programming_languages": {
            "terms": ["c++", "java", "php"],
            "minimum_should_match_field": "required_matches"
          }
        }
      }
    }

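Top-level parameter for terms_set: field

Parameters for field:

terms: array of terms to look for. How many of them must match is defined by one of the two parameters below.
minimum_should_match_field: name of a document field that holds the required number of matching terms
minimum_should_match_script: script that returns the required number of matching terms

Example:

    GET /job-candidates/_search
    {
      "query": {
        "terms_set": {
          "programming_languages": {
            "terms": ["c++", "java", "php"],
            "minimum_should_match_script": {
              # the smaller of the number of query terms and the required_matches field
              "source": "Math.min(params.num_terms, doc['required_matches'].value)"
            },
            "boost": 1.0
          }
        }
      }
    }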
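The matching rule behind both variants can be modelled in Python. This is a sketch of the semantics only, applied to the Jane Smith document from the example; terms_set_matches is a hypothetical helper, and the second call mirrors what the Math.min script above would compute.

```python
def terms_set_matches(doc_terms, query_terms, minimum_should_match):
    """True if at least minimum_should_match of the query terms occur in the document."""
    matched = len(set(doc_terms) & set(query_terms))
    return matched >= minimum_should_match


doc = {
    "name": "Jane Smith",
    "programming_languages": ["c++", "java"],
    "required_matches": 2,
}
query_terms = ["c++", "java", "php"]

# minimum_should_match_field: the minimum is read from the document itself.
by_field = terms_set_matches(
    doc["programming_languages"], query_terms, doc["required_matches"])

# minimum_should_match_script: min(number of query terms, required_matches).
by_script = terms_set_matches(
    doc["programming_languages"], query_terms,
    min(len(query_terms), doc["required_matches"]))

print(by_field, by_script)
```

Two of the three query terms occur in the document and the required minimum is 2, so the document matches in both cases.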

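wildcard: wildcard match

A wildcard operator is a placeholder that matches one or more characters; for example, * matches zero or more characters.

Example:

    GET /_search
    {
      "query": {
        "wildcard": {
          "user": {
            # starts with ki and ends with y: matches kiy, kity, kimchy
            "value": "ki*y",
            "boost": 1.0,
            "rewrite": "constant_score"
          }
        }
      }
    }

A shorthand form is also supported:

    GET /_search
    {
      "query": {
        "wildcard": { "user": "ki*y" }
      }
    }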

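Top-level parameter for wildcard: field

Parameters for field:

value: the wildcard pattern. ? matches any single character and * matches zero or more characters. Avoid starting a pattern with * or ?, because it makes the query considerably more expensive.
boost: weight applied to the relevance score.
rewrite: query rewrite method.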
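The meaning of ? and * can be checked by translating a pattern into an ordinary regular expression. The Python sketch below does that for whole terms; it ignores escaping of literal * and ?, and the names wildcard_to_regex and wildcard_matches are made up for this article rather than taken from any Elasticsearch API.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.DOTALL)


def wildcard_matches(pattern, term):
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


for term in ["kiy", "kity", "kimchy", "kim"]:
    print(term, wildcard_matches("ki*y", term))
```

The pattern has to cover the whole term, so kim is rejected by ki*y even though it starts with ki.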

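Common parameters

rewrite

Elasticsearch uses Lucene internally to run the fuzzy, prefix, query_string, regexp and wildcard queries. Lucene cannot execute these queries as written, so it first rewrites them into a simpler form, such as a bool query or a bit set.

rewrite: selects the method used for that rewrite, replacing the expensive original query with an equivalent that is cheaper to execute.

Valid values:
constant_score (default): uses constant_score_boolean when few terms match; otherwise collects the matching documents in a bit set.
constant_score_boolean: assigns each document a relevance score equal to the boost parameter. The original query becomes a bool query containing a should clause and a term query for each matching term. The resulting bool query can exceed the clause limit set by indices.query.bool.max_clause_count, in which case Elasticsearch returns an error.
scoring_boolean: calculates a relevance score for each matching document, which costs more CPU than constant_score_boolean. Otherwise it behaves like constant_score_boolean.
top_terms_blended_freqs_N: keeps only the top N scoring terms and scores documents as if all matching terms had the same frequency.
top_terms_boost_N: keeps only the top N terms and assigns each matching document a score equal to the boost parameter.
top_terms_N: keeps only the top N scoring terms and calculates a relevance score for each matching document.
For most uses, constant_score, constant_score_boolean or top_terms_boost_N is recommended.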
