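Elasticsearch (33): Document Data Modeling in Practice, File Search, Nested Relationships, Parent-Child/Grandparent-Grandchild Relationship Data...
Preface
"Elasticsearch (2): Elasticsearch Core Concepts" briefly mentioned how the document data model differs from the relational database data model. This article walks through several commonly used data models in detail.
File search data modeling: modeling data that has a multi-level hierarchy, such as a file system
1. Building the file system data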
PUT /fs
{"settings": {"analysis": {"analyzer": {"paths": { "tokenizer": "path_hierarchy"}}}}
}
The path_hierarchy tokenizer explained
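To make the tokenizer's behaviour concrete, here is a minimal Python sketch that emulates what path_hierarchy emits for an absolute path with the default / delimiter. The function name path_hierarchy_tokens is made up for this illustration and is not part of Elasticsearch; the real tokenizer has more options and edge cases than this sketch covers.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Emit one token per ancestor prefix of an absolute path, shortest first."""
    tokens = []
    current = ""
    for part in path.split(delimiter):
        if not part:
            # Skip the empty piece produced by the leading delimiter
            continue
        current += delimiter + part
        tokens.append(current)
    return tokens


tokens = path_hierarchy_tokens("/workspace/projects/helloworld")
print(tokens)
# ['/workspace', '/workspace/projects', '/workspace/projects/helloworld']
```

Because /workspace is one of the emitted tokens, a term query for /workspace on the path.tree field can match every file below that directory, while the keyword field path only matches the exact full path.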
PUT /fs/_mapping/file
{"properties": {"name": { "type": "keyword"},"path": { "type": "keyword","fields": {"tree": { "type": "text","analyzer": "paths"}}}}
}
PUT /fs/file/1
{"name": "README.txt", "path": "/workspace/projects/helloworld", "contents": "这是我的第一个elasticsearch程序"
}
Search requirement 1: find all files in the /workspace/projects/helloworld directory whose contents contain elasticsearch
GET /fs/file/_search
{"query": {"bool": {"must": [{"match": {"contents": "elasticsearch"}},{"constant_score": {"filter": {"term": {"path": "/workspace/projects/helloworld"}}}}]}}
}
{"took": 2,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 1,"max_score": 1.284885,"hits": [{"_index": "fs","_type": "file","_id": "1","_score": 1.284885,"_source": {"name": "README.txt","path": "/workspace/projects/helloworld","contents": "这是我的第一个elasticsearch程序"}}]}
}
Search requirement 2: find all files under the /workspace directory whose contents contain elasticsearch
GET /fs/file/_search
{"query": {"bool": {"must": [{"match": {"contents": "elasticsearch"}},{"constant_score": {"filter": {"term": {"path.tree": "/workspace"}}}}]}}
}
Nested relationships
PUT /website/blogs/6
{"title": "花无缺发表的一篇帖子","content": "我是花无缺,大家要不要考虑一下投资房产和买股票的事情啊。。。","tags": [ "投资", "理财" ],"comments": [ {"name": "小鱼儿","comment": "什么股票啊?推荐一下呗","age": 28,"stars": 4,"date": "2016-09-01"},{"name": "黄药师","comment": "我喜欢投资房产,风,险大收益也大","age": 31,"stars": 5,"date": "2016-10-22"}]
}
Search for blog posts that have a comment whose name is 黄药师 and whose age is 28:
GET /website/blogs/_search
{"query": {"bool": {"must": [{ "match": { "comments.name": "黄药师" }},{ "match": { "comments.age": 28 }} ]}}
}
{"took": 102,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 1,"max_score": 1.8022683,"hits": [{"_index": "website","_type": "blogs","_id": "6","_score": 1.8022683,"_source": {"title": "花无缺发表的一篇帖子","content": "我是花无缺,大家要不要考虑一下投资房产和买股票的事情啊。。。","tags": ["投资","理财"],"comments": [{"name": "小鱼儿","comment": "什么股票啊?推荐一下呗","age": 28,"stars": 4,"date": "2016-09-01"},{"name": "黄药师","comment": "我喜欢投资房产,风,险大收益也大","age": 31,"stars": 5,"date": "2016-10-22"}]}}]}
}
The post is returned even though no single comment is by 黄药师 with age 28. The reason is that an array of objects is flattened at index time into multi-valued fields like the following, which loses the association between the name and the age of each comment:
{"title": [ "花无缺", "发表", "一篇", "帖子" ],"content": [ "我", "是", "花无缺", "大家", "要不要", "考虑", "一下", "投资", "房产", "买", "股票", "事情" ],"tags": [ "投资", "理财" ],"comments.name": [ "小鱼儿", "黄药师" ],"comments.comment": [ "什么", "股票", "推荐", "我", "喜欢", "投资", "房产", "风险", "收益", "大" ],"comments.age": [ 28, 31 ],"comments.stars": [ 4, 5 ],"comments.date": [ 2016-09-01, 2016-10-22 ]
}
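The false match can be reproduced with a small Python sketch of the flattening step. It is a simplification under stated assumptions: only the name and age fields are kept, values are compared exactly instead of being analyzed, and the helper name flatten is invented for this illustration.

```python
comments = [
    {"name": "小鱼儿", "age": 28},
    {"name": "黄药师", "age": 31},
]


def flatten(objects, prefix):
    """Merge an array of objects into one multi-valued field per key."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{prefix}.{key}", []).append(value)
    return flat


flat = flatten(comments, "comments")

# Each condition is checked against the merged value list of its own field
flat_match = "黄药师" in flat["comments.name"] and 28 in flat["comments.age"]
print(flat)
print(flat_match)  # True, although no single comment satisfies both conditions
```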
To keep the fields of each comment together, map comments as the nested type:
PUT /website
{"mappings": {"blogs": {"properties": {"comments": {"type": "nested", "properties": {"name": { "type": "text" },"comment": { "type": "text" },"age": { "type": "short" },"stars": { "type": "short" },"date": { "type": "date" }}}}}}
}
With this mapping, each nested comment is indexed as its own hidden document, separate from the root document:
{ "comments.name": [ "小鱼儿" ],"comments.comment": [ "什么", "股票", "推荐" ],"comments.age": [ 28 ],"comments.stars": [ 4 ],"comments.date": [ 2016-09-01 ]
}
{ "comments.name": [ "黄药师" ],"comments.comment": [ "我", "喜欢", "投资", "房产", "风险", "收益", "大" ],"comments.age": [ 31 ],"comments.stars": [ 5 ],"comments.date": [ 2016-10-22 ]
}
{ "title": [ "花无缺", "发表", "一篇", "帖子" ],"content": [ "我", "是", "花无缺", "大家", "要不要", "考虑", "一下", "投资", "房产", "买", "股票", "事情" ],"tags": [ "投资", "理财" ]
}
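Search again, this time with a nested query, so that both conditions must hold within the same comment:
GET /website/blogs/_search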
{"query": {"bool": {"must": [{"match": {"title": "花无缺"}},{"nested": {"path": "comments","score_mode": "max","query": {"bool": {"must": [{"match": {"comments.name": "黄药师"}},{"match": {"comments.age": 28}}]}}}}]}}
}
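The per-object matching that a nested query performs can be sketched in Python as follows. This is only an illustration: it compares values exactly instead of running analyzed match queries, and the helper name nested_match is an assumption of this sketch.

```python
comments = [
    {"name": "小鱼儿", "age": 28},
    {"name": "黄药师", "age": 31},
]


def nested_match(objects, conditions):
    """True only if at least one single object satisfies every condition."""
    return any(
        all(obj.get(field) == value for field, value in conditions.items())
        for obj in objects
    )


no_hit = nested_match(comments, {"name": "黄药师", "age": 28})
hit = nested_match(comments, {"name": "黄药师", "age": 31})
print(no_hit, hit)  # False True
```

With the conditions evaluated per comment, the combination 黄药师 and 28 no longer matches the sample post, while 黄药师 and 31 does.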
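Aggregation requirement: bucket the comments by month of their date, then compute the average stars of the comments in each month
GET /website/blogs/_search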
{"size": 0, "aggs": {"comments_path": {"nested": {"path": "comments"}, "aggs": {"group_by_comments_date": {"date_histogram": {"field": "comments.date","interval": "month","format": "yyyy-MM"},"aggs": {"avg_stars": {"avg": {"field": "comments.stars"}}}}}}}
}
{"took": 52,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 2,"max_score": 0,"hits": []},"aggregations": {"comments_path": {"doc_count": 4,"group_by_comments_date": {"buckets": [{"key_as_string": "2016-08","key": 1470009600000,"doc_count": 1,"avg_stars": {"value": 3}},{"key_as_string": "2016-09","key": 1472688000000,"doc_count": 2,"avg_stars": {"value": 4.5}},{"key_as_string": "2016-10","key": 1475280000000,"doc_count": 1,"avg_stars": {"value": 5}}]}}}
}
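When drilling down in an aggregation over a nested object, you can use a reverse_nested aggregation (named reverse_path in the request below) to step back to the root document and continue drilling down on its other fields.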
GET /website/blogs/_search
{"size": 0,"aggs": {"comments_path": {"nested": {"path": "comments"},"aggs": {"group_by_comments_age": {"histogram": {"field": "comments.age","interval": 10},"aggs": {"reverse_path": {"reverse_nested": {}, "aggs": {"group_by_tags": {"terms": {"field": "tags.keyword"}}}}}}}}}
}
{"took": 5,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 2,"max_score": 0,"hits": []},"aggregations": {"comments_path": {"doc_count": 4,"group_by_comments_age": {"buckets": [{"key": 20,"doc_count": 1,"reverse_path": {"doc_count": 1,"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "投资","doc_count": 1},{"key": "理财","doc_count": 1}]}}},{"key": 30,"doc_count": 3,"reverse_path": {"doc_count": 2,"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "大侠","doc_count": 1},{"key": "投资","doc_count": 1},{"key": "理财","doc_count": 1},{"key": "练功","doc_count": 1}]}}}]}}}
}
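As a rough Python sketch of what the nested plus reverse_nested combination computes: comments are bucketed by age first, then each bucket steps back to the distinct parent posts and counts their tags. It uses only post 6 from this article (the second post behind the response above is not shown in the source, so it is left out), and the function name tags_by_comment_age is invented here.

```python
from collections import defaultdict

posts = [
    {
        "id": "6",
        "tags": ["投资", "理财"],
        "comments": [{"name": "小鱼儿", "age": 28}, {"name": "黄药师", "age": 31}],
    },
]


def tags_by_comment_age(posts, interval=10):
    # nested step: every comment is bucketed on its own by age
    parents_per_bucket = defaultdict(set)
    for post in posts:
        for comment in post["comments"]:
            bucket = comment["age"] // interval * interval
            # reverse_nested step: go from the comment back to its parent post
            parents_per_bucket[bucket].add(post["id"])

    tags_of = {post["id"]: post["tags"] for post in posts}
    result = {}
    for bucket, parent_ids in sorted(parents_per_bucket.items()):
        counts = defaultdict(int)
        for parent_id in parent_ids:  # each parent post counts once per bucket
            for tag in tags_of[parent_id]:
                counts[tag] += 1
        result[bucket] = dict(counts)
    return result


result = tags_by_comment_age(posts)
print(result)
# {20: {'投资': 1, '理财': 1}, 30: {'投资': 1, '理财': 1}}
```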
Parent-child relationships
PUT /company
{"mappings": {"rd_center": {},"employee": {"_parent": {"type": "rd_center" }}}
}
POST /company/rd_center/_bulk
{ "index": { "_id": "1" }}
{ "name": "北京研发总部", "city": "北京", "country": "中国" }
{ "index": { "_id": "2" }}
{ "name": "上海研发中心", "city": "上海", "country": "中国" }
{ "index": { "_id": "3" }}
{ "name": "硅谷人工智能实验室", "city": "硅谷", "country": "美国" }
PUT /company/employee/1?parent=1
{"name": "张三","birthday": "1970-10-24","hobby": "爬山"
}
POST /company/employee/_bulk
{ "index": { "_id": 2, "parent": "1" }}
{ "name": "李四", "birthday": "1982-05-16", "hobby": "游泳" }
{ "index": { "_id": 3, "parent": "2" }}
{ "name": "王二", "birthday": "1979-04-01", "hobby": "爬山" }
{ "index": { "_id": 4, "parent": "3" }}
{ "name": "赵五", "birthday": "1987-05-11", "hobby": "骑马" }
1. Search for R&D centers that have employees born in 1980 or later
GET /company/rd_center/_search
{"query": {"has_child": {"type": "employee","query": {"range": {"birthday": {"gte": "1980-01-01"}}}}}
}
{"took": 33,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 2,"max_score": 1,"hits": [{"_index": "company","_type": "rd_center","_id": "1","_score": 1,"_source": {"name": "北京研发总部","city": "北京","country": "中国"}},{"_index": "company","_type": "rd_center","_id": "3","_score": 1,"_source": {"name": "硅谷人工智能实验室","city": "硅谷","country": "美国"}}]}
}
2. Search for R&D centers that have an employee named 张三
GET /company/rd_center/_search
{"query": {"has_child": {"type": "employee","query": {"match": {"name": "张三"}}}}
}
{"took": 2,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 1,"max_score": 1,"hits": [{"_index": "company","_type": "rd_center","_id": "1","_score": 1,"_source": {"name": "北京研发总部","city": "北京","country": "中国"}}]}
}
3. Search for R&D centers that have at least 2 employees
GET /company/rd_center/_search
{"query": {"has_child": {"type": "employee","min_children": 2, "query": {"match_all": {}}}}
}
{"took": 5,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 1,"max_score": 1,"hits": [{"_index": "company","_type": "rd_center","_id": "1","_score": 1,"_source": {"name": "北京研发总部","city": "北京","country": "中国"}}]}
}
4. Search for employees who work in China
GET /company/employee/_search
{"query": {"has_parent": {"parent_type": "rd_center","query": {"term": {"country.keyword": "中国"}}}}
}
{"took": 5,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 3,"max_score": 1,"hits": [{"_index": "company","_type": "employee","_id": "3","_score": 1,"_routing": "2","_parent": "2","_source": {"name": "王二","birthday": "1979-04-01","hobby": "爬山"}},{"_index": "company","_type": "employee","_id": "1","_score": 1,"_routing": "1","_parent": "1","_source": {"name": "张三","birthday": "1970-10-24","hobby": "爬山"}},{"_index": "company","_type": "employee","_id": "2","_score": 1,"_routing": "1","_parent": "1","_source": {"name": "李四","birthday": "1982-05-16","hobby": "游泳"}}]}
}
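The semantics of has_child, min_children and has_parent used in the searches above can be checked against the sample data with a small in-memory Python sketch. The data structures and the helper names has_child and has_parent are assumptions of this sketch, and it ignores scoring and text analysis.

```python
rd_centers = {
    "1": {"name": "北京研发总部", "country": "中国"},
    "2": {"name": "上海研发中心", "country": "中国"},
    "3": {"name": "硅谷人工智能实验室", "country": "美国"},
}
employees = [
    {"name": "张三", "birthday": "1970-10-24", "parent": "1"},
    {"name": "李四", "birthday": "1982-05-16", "parent": "1"},
    {"name": "王二", "birthday": "1979-04-01", "parent": "2"},
    {"name": "赵五", "birthday": "1987-05-11", "parent": "3"},
]


def has_child(predicate, min_children=1):
    """IDs of parents that have at least min_children matching children."""
    counts = {}
    for emp in employees:
        if predicate(emp):
            counts[emp["parent"]] = counts.get(emp["parent"], 0) + 1
    return sorted(pid for pid, n in counts.items() if n >= min_children)


def has_parent(predicate):
    """Names of children whose parent satisfies the predicate."""
    return [e["name"] for e in employees if predicate(rd_centers[e["parent"]])]


# ISO dates compare correctly as plain strings
born_1980_or_later = has_child(lambda e: e["birthday"] >= "1980-01-01")
at_least_two = has_child(lambda e: True, min_children=2)
in_china = has_parent(lambda c: c["country"] == "中国")
print(born_1980_or_later)  # ['1', '3']
print(at_least_two)        # ['1']
print(in_china)            # ['张三', '李四', '王二']
```

The three results line up with the hits in the responses shown above: R&D centers 1 and 3, R&D center 1 alone, and the three employees of the two Chinese centers.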
5. Count how many employees in each country like each hobby
GET /company/rd_center/_search
{"size": 0,"aggs": {"group_by_country": {"terms": {"field": "country.keyword"},"aggs": {"group_by_child_employee": {"children": {"type": "employee"},"aggs": {"group_by_hobby": {"terms": {"field": "hobby.keyword"}}}}}}}
}
{"took": 15,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 3,"max_score": 0,"hits": []},"aggregations": {"group_by_country": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "中国","doc_count": 2,"group_by_child_employee": {"doc_count": 3,"group_by_hobby": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "爬山","doc_count": 2},{"key": "游泳","doc_count": 1}]}}},{"key": "美国","doc_count": 1,"group_by_child_employee": {"doc_count": 1,"group_by_hobby": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "骑马","doc_count": 1}]}}}]}}
}
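The children aggregation above effectively joins each R&D center bucket to its employees before counting hobbies. A minimal Python sketch of the same grouping over the sample data, with hypothetical in-memory structures standing in for the index:

```python
from collections import Counter, defaultdict

# rd_center ID -> country, taken from the bulk request earlier in this article
rd_centers = {"1": "中国", "2": "中国", "3": "美国"}
# (parent rd_center ID, hobby) for each employee
employees = [("1", "爬山"), ("1", "游泳"), ("2", "爬山"), ("3", "骑马")]

hobbies_by_country = defaultdict(Counter)
for parent_id, hobby in employees:
    # Step from the child to its parent to find the country bucket
    hobbies_by_country[rd_centers[parent_id]][hobby] += 1

result = {country: dict(counts) for country, counts in hobbies_by_country.items()}
print(result)
# {'中国': {'爬山': 2, '游泳': 1}, '美国': {'骑马': 1}}
```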
Parent-child relationships: modeling and searching three-level grandparent-grandchild data
PUT /company
{"mappings": {"country": {},"rd_center": {"_parent": {"type": "country" }},"employee": {"_parent": {"type": "rd_center" }}}
}
POST /company/country/_bulk
{ "index": { "_id": "1" }}
{ "name": "中国" }
{ "index": { "_id": "2" }}
{ "name": "美国" }
POST /company/rd_center/_bulk
{ "index": { "_id": "1", "parent": "1" }}
{ "name": "北京研发总部" }
{ "index": { "_id": "2", "parent": "1" }}
{ "name": "上海研发中心" }
{ "index": { "_id": "3", "parent": "2" }}
{ "name": "硅谷人工智能实验室" }
PUT /company/employee/1?parent=1&routing=1
{"name": "张三","dob": "1970-10-24","hobby": "爬山"
}
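The extra routing=1 parameter in the request above deserves a note. In this parent-child model a child must live on the same shard as its parent, and by default the parent ID is used as the routing value. For a grandchild that default would be the rd_center ID, while the rd_center itself was routed by its country ID, so the employee has to be routed by the ID of its grandparent. The Python sketch below illustrates the idea; the parents table and both helper names are hypothetical, and zlib.crc32 is only a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # the responses in this article report 5 shards

# Hypothetical in-memory view of the relationships indexed above:
# (type, id) -> (parent type, parent id)
parents = {
    ("rd_center", "1"): ("country", "1"),
    ("rd_center", "2"): ("country", "1"),
    ("rd_center", "3"): ("country", "2"),
    ("employee", "1"): ("rd_center", "1"),
}


def routing_for(doc):
    """Walk up to the top-level ancestor; its ID serves as the routing value."""
    while doc in parents:
        doc = parents[doc]
    return doc[1]


def shard_for(routing):
    # Stand-in hash for illustration only, not the real Elasticsearch hash
    return zlib.crc32(routing.encode("utf-8")) % NUM_PRIMARY_SHARDS


employee_routing = routing_for(("employee", "1"))
print(employee_routing)  # 1, the ID of the country 中国
same_shard = shard_for(employee_routing) == shard_for(routing_for(("country", "1")))
print(same_shard)  # True
```

An employee under rd_center 3 would need routing=2 by the same reasoning, because that center belongs to country 2.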
Search for countries that have employees whose hobby is 爬山:
GET /company/country/_search
{"query": {"has_child": {"type": "rd_center","query": {"has_child": {"type": "employee","query": {"match": {"hobby": "爬山"}}}}}}
}
{"took": 10,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 1,"max_score": 1,"hits": [{"_index": "company","_type": "country","_id": "1","_score": 1,"_source": {"name": "中国"}}]}
}
Reposted from: https://www.cnblogs.com/wuzhiwei549/p/9113457.html