Building a Real-Time Classical Chinese Poetry Search with SpringBoot + Es7.6.1 + Jsoup + Vue + Docker

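Contents

Service installation
- Download and install Elasticsearch
- Install the elasticsearch-head plugin
- Create an index
- Install Kibana
- Install the IK analyzer
Basic Elasticsearch operations
- Operation reference
- Common operations
  - Default field types
  - Explicit field types (defining the index mapping)
- Queries
  - Basic query
  - Query by condition
  - Return selected fields
  - Sorting
  - Pagination
  - Multi-condition queries
  - Range queries
  - Highlighting
- Update
- Delete
Integrating ES with SpringBoot
- Add the Maven dependency
- Create the Elasticsearch configuration class
- Testing the API
  - Create the test class
  - Create an index
  - Check whether an index exists
  - Delete an index
  - Create a document
  - Bulk-create documents
  - Check whether a document exists
  - Get a document
  - Update a document
  - Delete a document
  - Search
Scraping web pages with Jsoup and writing the data to ES
- Add the Maven dependency
- Create the HTML parsing utility class
- Analyze the page
- Write the scraped data to ES
Building the search front end with Vue.js
- Include the JS files
- Write the page
- Write the back end
Final result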

Service installation

Download and install Elasticsearch

Install elasticsearch:7.6.1 with Docker:

docker pull elasticsearch:7.6.1

mkdir -p /Users/szcl/mydata/elasticsearch/config
mkdir -p /Users/szcl/mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >> /Users/szcl/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /Users/szcl/mydata/elasticsearch/
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e ES_JAVA_OPTS="-Xms64m -Xmx128m" \
  -v /Users/szcl/mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
  -v /Users/szcl/mydata/elasticsearch/data:/usr/share/elasticsearch/data \
  -v /Users/szcl/mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
  -d elasticsearch:7.6.1

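Check whether the container started successfully:

docker ps -a

If it did not start, inspect the logs with the following command (use your own container ID):

docker logs -f b016c22606e1

Then open port 9200 of the server in a browser.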

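Install the elasticsearch-head plugin

docker pull mobz/elasticsearch-head:5
docker run -d -p 9100:9100 docker.io/mobz/elasticsearch-head:5

Once it is running, open port 9100 of the server in a browser.

On a fresh install, head may be refused by Elasticsearch because of cross-origin (CORS) restrictions. The configuration has to be changed, and there are two ways to do it.

Option 1: edit the Elasticsearch configuration file mounted from the host

cd /mydata/elasticsearch/config
vim elasticsearch.yml

Add the following to the configuration:

http.cors.enabled: true
http.cors.allow-origin: "*"

Restart the container:

docker restart b016c22606e1

Option 2: edit the configuration inside the container

docker exec -it b016c22606e1 /bin/bash
cd ./config
vim elasticsearch.yml

Add the following to the configuration:

http.cors.enabled: true
http.cors.allow-origin: "*"

Restart the container:

docker restart b016c22606e1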

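Create an index

When creating an index in the head UI, clicking OK does nothing, so check the browser console.

The request returns a 406 error; open it to see the details.

The response shows that x-www-form-urlencoded is not supported.

Solution:

Enter the head container:
```
docker exec -it 62c5c56241ae /bin/bash
```
Go into the _site folder and edit vendor.js:
```
vim vendor.js
```
The container may not have vim installed. In that case, either:
- Copy the file from the container to the host and edit it there

  Reference: https://blog.csdn.net/zhaoyajie1011/article/details/98610002
- Or install vim inside the container:
```
apt-get update
apt-get install vim
```

Changes to make:

contentType: "application/x-www-form-urlencoded"
change to:
contentType: "application/json;charset=UTF-8"

var inspectData = s.contentType === "application/x-www-form-urlencoded"
change to:
var inspectData = s.contentType === "application/json;charset=UTF-8"

Restart the container.

The index can now be created successfully. However, head is mainly a tool for displaying data and is not well suited to complex queries, so for querying it is better to install the more capable Kibana.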

Install Kibana

Install with Docker:
```
docker pull kibana:7.6.1
```

Start the container (replace IP with the address of your Elasticsearch host):

docker run --name kibana -e ELASTICSEARCH_HOSTS=http://IP:9200 -p 5601:5601 -d kibana:7.6.1

Edit the configuration

Here I copy the file from the container to the host and edit it there:

docker cp 970f63f0babb:/usr/share/kibana/config/kibana.yml /mydata/kibana/config/

Edit it directly on the host:

vim kibana.yml

Change the following settings:

server.name: kibana
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: [ "http://IP:9200" ]
i18n.locale: "zh-CN"
xpack.monitoring.ui.container.elasticsearch.enabled: true

Copy the edited configuration back into the container:

docker cp /mydata/kibana/config/kibana.yml 970f63f0babb:/usr/share/kibana/config/

Restart the container:
```
docker restart 970f63f0babb
```
Open port 5601 in a browser.

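Install the IK analyzer

Enter the Elasticsearch container:
```
docker exec -it 98d725e6291e /bin/bash
```

Install the plugin:

elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.1/elasticsearch-analysis-ik-7.6.1.zip

Restart all containers.
Test the analyzer:
- Open the Kibana console at http://localhost:5601/
- Find Dev Tools in the sidebar
- Test ik_max_word (finest-grained segmentation)
```
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "中国共产党"
}
```
- Test ik_smart (coarsest-grained segmentation)
```
POST _analyze
{
  "analyzer": "ik_smart",
  "text": "中国共产党"
}
```
Custom dictionary
- Suppose I want to segment "我爱赵亚杰". Both ik_smart and ik_max_word split the name into single characters.
- This is where a custom dictionary is needed. Enter the container and locate the IK analyzer configuration:
```
docker exec -it 98d725e6291e /bin/bash
cd config/analysis-ik/
vi IKAnalyzer.cfg.xml
```
  Register your own dictionary in <entry key="ext_dict"></entry>:
```
<entry key="ext_dict">my.dic</entry>
```
  Save, then create the my.dic dictionary file:
```
vi my.dic
```
  Enter the three characters 赵亚杰 in my.dic and save.
- Restart the Elasticsearch container
- Test the analyzer again to confirm that the name is now kept as one token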

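Basic Elasticsearch operations

Operation reference

Operation	Method	URL
Create a document (with a given ID)	PUT	localhost:9200/index_name/type_name/doc_id
Create a document (with a random ID)	POST	localhost:9200/index_name/type_name
Update a document	POST	localhost:9200/index_name/type_name/doc_id/_update
Delete a document	DELETE	localhost:9200/index_name/type_name/doc_id
Get a document (by ID)	GET	localhost:9200/index_name/type_name/doc_id
Query all data	POST	localhost:9200/index_name/type_name/_search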

Common operations

Check the cluster health:

GET _cat/health

1599732945 10:15:45 elasticsearch yellow 1 1 5 5 0 0 2 0 - 71.4%

List the indices through _cat:

GET _cat/indices

yellow open poem                     tWco8rUWQCS1YuMtkrCl4A 1 1  1 0  5.1kb  5.1kb
green  open .kibana_task_manager_1   1SxsVdvgSZOOQ3X9wKXJzQ 1 0  2 1 16.2kb 16.2kb
yellow open poem2                    xWMF79GYTaKco1Ljo2SmrA 1 1  0 0   283b   283b
green  open .apm-agent-configuration UGRU7tD0Tj-bOnmo-nfZrw 1 0  0 0   283b   283b
green  open .kibana_1                r65DwNYWSha1v7AW5v62QQ 1 0 20 6   48kb   48kb

…

The _cat API exposes many other kinds of information as well.

Create an index

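Default field types

PUT /poem/poem/1
{
  "title": "相思",
  "author": "王维",
  "content": "红豆生南国，春来发几枝。愿君多采撷，此物最相思。"
}

Run the request.

Use the elasticsearch-head plugin to view the index.

The document contents can be viewed in the data browser.

Explicit field types (defining the index mapping)

PUT /poem2
{
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "date": {"type": "date"},
      "content": {"type": "text"}
    }
  }
}

Check the result with the head plugin.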

Queries

Basic query

GET /poem/_doc/1
or
GET /poem/poem/1

Here poem is the index, _doc is the default type (types are deprecated in 7.x and removed in Elasticsearch 8.x), and 1 is the ID of the document to fetch.

{
  "_index" : "poem",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "title" : "相思",
    "author" : "王维",
    "content" : "红豆生南国，春来发几枝。愿君多采撷，此物最相思。"
  }
}

Query by condition

Documents whose content contains "一":

GET /poem/_search?q=content:一

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {"total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0},
  "hits" : {
    "total" : {"value" : 2, "relation" : "eq"},
    "max_score" : 0.74386525,
    "hits" : [
      {
        "_index" : "poem",
        "_type" : "poem",
        "_id" : "2",
        "_score" : 0.74386525,
        "_source" : {
          "title" : "登鹳雀楼",
          "author" : "王之涣",
          "content" : "白日依山尽，黄河入海流。欲穷千里目，更上一层楼。"
        }
      },
      {
        "_index" : "poem",
        "_type" : "poem",
        "_id" : "3",
        "_score" : 0.6489038,
        "_source" : {
          "title" : "九月九日忆山东兄弟",
          "author" : "王维",
          "content" : "独在异乡为异客，每逢佳节倍思亲。遥知兄弟登高处，遍插茱萸少一人。"
        }
      }
    ]
  }
}

Whether this behaves like a partial (fuzzy) match depends on the field type chosen when the index was defined: a text field is analyzed into tokens, whereas a keyword field is not analyzed and only matches as a whole value.
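To make the text versus keyword distinction concrete, here is a toy Java model. It is only a sketch and not how Elasticsearch is implemented: `tokenize` is a hypothetical stand-in for an analyzer that emits one token per character and drops punctuation, which is roughly the behavior the query results above suggest for Chinese text under the default analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TextVsKeyword {
    // Toy analyzer (assumption): one token per letter or digit, punctuation dropped.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                tokens.add(String.valueOf(c));
            }
        }
        return tokens;
    }

    // "text" field: the value is tokenized, so any shared token counts as a match.
    static boolean matchesText(String fieldValue, String query) {
        return !Collections.disjoint(tokenize(fieldValue), tokenize(query));
    }

    // "keyword" field: the value is stored as one term, so only the exact value matches.
    static boolean matchesKeyword(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    public static void main(String[] args) {
        String content = "欲穷千里目，更上一层楼。";
        System.out.println(matchesText(content, "一"));      // true
        System.out.println(matchesKeyword(content, "一"));   // false
        System.out.println(matchesKeyword(content, content)); // true
    }
}
```

In this model a single character finds the poem only when the field is treated as text; treated as a keyword, the same field would need the entire string as the query.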

Return selected fields

GET /poem/poem/_search
{
  "query": {
    "match": {"content": "一"}
  },
  "_source": ["title", "content"]
}

A match query runs the query text through the analyzer before searching.

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {"total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0},
  "hits" : {
    "total" : {"value" : 2, "relation" : "eq"},
    "max_score" : 0.74386525,
    "hits" : [
      {
        "_index" : "poem",
        "_type" : "poem",
        "_id" : "2",
        "_score" : 0.74386525,
        "_source" : {
          "title" : "登鹳雀楼",
          "content" : "白日依山尽，黄河入海流。欲穷千里目，更上一层楼。"
        }
      },
      {
        "_index" : "poem",
        "_type" : "poem",
        "_id" : "3",
        "_score" : 0.6489038,
        "_source" : {
          "title" : "九月九日忆山东兄弟",
          "content" : "独在异乡为异客，每逢佳节倍思亲。遥知兄弟登高处，遍插茱萸少一人。"
        }
      }
    ]
  }
}

Sorting

GET /poem/poem/_search
{
  "query": {
    "match": {"content": "一"}
  },
  "_source": ["title", "content", "date"],
  "sort": [
    {"date": {"order": "asc"}}
  ]
}

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {"total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0},
  "hits" : {
    "total" : {"value" : 2, "relation" : "eq"},
    "max_score" : null,
    "hits" : [
      {
        "_index" : "poem",
        "_type" : "poem",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "date" : "2020-09-10",
          "title" : "登鹳雀楼",
          "content" : "白日依山尽，黄河入海流。欲穷千里目，更上一层楼。"
        },
        "sort" : [1599696000000]
      },
      {
        "_index" : "poem",
        "_type" : "poem",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "date" : "2020-09-11",
          "title" : "九月九日忆山东兄弟",
          "content" : "独在异乡为异客，每逢佳节倍思亲。遥知兄弟登高处，遍插茱萸少一人。"
        },
        "sort" : [1599782400000]
      }
    ]
  }
}
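The numbers in each hit's sort array are the date values expressed as milliseconds since the Unix epoch (UTC). The short Java sketch below reproduces them from the date strings; the class and method names are made up for illustration.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

class SortValueDemo {
    // Converts an ISO date (yyyy-MM-dd) to epoch milliseconds at midnight UTC.
    static long toEpochMillis(String isoDate) {
        return LocalDate.parse(isoDate)
                .atStartOfDay(ZoneOffset.UTC)
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(toEpochMillis("2020-09-10")); // 1599696000000
        System.out.println(toEpochMillis("2020-09-11")); // 1599782400000
    }
}
```

Because the sort runs on these numeric values, ascending order puts the 2020-09-10 document first.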

Pagination

GET /poem/poem/_search
{
  "query": {
    "match": {"content": "一"}
  },
  "_source": ["title", "content", "date"],
  "sort": [
    {"date": {"order": "asc"}}
  ],
  "from": 0,
  "size": 1
}

from: the zero-based offset of the first hit to return.

size: the number of hits to return.
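Since from is an offset rather than a page number, a page-based API has to convert between the two. A minimal sketch, assuming 1-based page numbers (the `Pagination` class and `fromOffset` name are hypothetical):

```java
class Pagination {
    // Converts a 1-based page number into the zero-based "from" offset.
    static int fromOffset(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        return (pageNo - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 10)); // 0: first page starts at the first hit
        System.out.println(fromOffset(3, 10)); // 20: third page skips two full pages
    }
}
```

Passing the page number straight through as from would skip one hit per page number instead of one page.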

Multi-condition queries

GET /poem/poem/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"author": "王维"}},
        {"match": {"date": "2020-09-11"}}
      ]
    }
  }
}
or
GET /poem/poem/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"author": "王维"}},
        {"match": {"date": "2020-09-12"}}
      ]
    }
  }
}

must is the equivalent of AND in MySQL

must_not is the equivalent of NOT in MySQL

should is the equivalent of OR in MySQL
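The AND / NOT / OR analogy can be sketched as plain Java predicates. This is a simplified model that ignores scoring and the filter clause: every must clause has to match, no must_not clause may match, and at least one should clause has to match when there are no must clauses. All names here are illustrative, not part of any Elasticsearch API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class BoolQueryModel {
    // Hypothetical helper: true when the document's field equals the given value.
    static Predicate<Map<String, String>> field(String name, String value) {
        return doc -> value.equals(doc.get(name));
    }

    static boolean matches(Map<String, String> doc,
                           List<Predicate<Map<String, String>>> must,
                           List<Predicate<Map<String, String>>> should,
                           List<Predicate<Map<String, String>>> mustNot) {
        boolean allMust = must.stream().allMatch(p -> p.test(doc));          // AND
        boolean noneExcluded = mustNot.stream().noneMatch(p -> p.test(doc)); // NOT
        // OR: required only when should is the sole positive clause type
        boolean shouldOk = should.isEmpty() || !must.isEmpty()
                || should.stream().anyMatch(p -> p.test(doc));
        return allMust && noneExcluded && shouldOk;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("author", "王维");
        doc.put("date", "2020-09-11");
        List<Predicate<Map<String, String>>> none = Collections.emptyList();

        // must: author AND date both match
        System.out.println(matches(doc,
                Arrays.asList(field("author", "王维"), field("date", "2020-09-11")), none, none)); // true
        // should: the author matches even though the date does not
        System.out.println(matches(doc, none,
                Arrays.asList(field("author", "王维"), field("date", "2020-09-12")), none)); // true
        // must_not: the author matches, so the document is excluded
        System.out.println(matches(doc, none, none,
                Arrays.asList(field("author", "王维")))); // false
    }
}
```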

A match query can also take several terms at once, separated by spaces:

GET /poem/poem/_search
{
  "query": {
    "match": {"content": "三 一"}
  }
}

Range queries

GET /poem/poem/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"author": "王维"}}
      ],
      "filter": {
        "range": {
          "index": {"gte": 1, "lt": 3}
        }
      }
    }
  }
}

gt: greater than; gte: greater than or equal to; lt: less than; lte: less than or equal to

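Highlighting

GET /poem/poem/_search
{
  "query": {
    "match": {"content": "一"}
  },
  "highlight": {
    "pre_tags": "<span style='color: red'>",
    "post_tags": "</span>",
    "fields": {
      "content": {}
    }
  }
}

Highlighting is enabled with the highlight keyword; pre_tags and post_tags are the markup wrapped around each matched term.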
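As a rough picture of what the highlighted fragment looks like, the Java sketch below wraps every occurrence of a term in the given tags. It is a plain string replacement used for illustration only; the real highlighter works on analyzed tokens and returns fragments, and the `HighlightModel` name is hypothetical.

```java
class HighlightModel {
    // Wraps each literal occurrence of term in preTag/postTag.
    static String highlight(String content, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return content; // nothing to highlight
        }
        return content.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        String fragment = highlight("欲穷千里目，更上一层楼。", "一",
                "<span style='color: red'>", "</span>");
        System.out.println(fragment);
        // 欲穷千里目，更上<span style='color: red'>一</span>层楼。
    }
}
```

Because the result contains HTML, a page that displays it has to render it as markup rather than as plain text.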

Update

POST /poem/_doc/1/_update
{
  "doc": {
    "date": "2020-09-10"
  }
}

{
  "_index" : "poem",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "title" : "相思",
    "author" : "王维",
    "content" : "红豆生南国，春来发几枝。愿君多采撷，此物最相思。",
    "date" : "2020-09-10"
  }
}

The _version value is incremented on every update.

Delete

DELETE /poem2/_doc/1 (deletes a specific document)
or
DELETE /poem2 (deletes the whole index)

{
  "_index" : "poem2",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "not_found",
  "_shards" : {"total" : 2, "successful" : 1, "failed" : 0},
  "_seq_no" : 3,
  "_primary_term" : 1
}

{
  "acknowledged" : true
}

List all indices with GET _cat/indices:

yellow open poem                     tWco8rUWQCS1YuMtkrCl4A 1 1  1 0 12.2kb 12.2kb
green  open .kibana_task_manager_1   1SxsVdvgSZOOQ3X9wKXJzQ 1 0  2 1 16.2kb 16.2kb
green  open .apm-agent-configuration UGRU7tD0Tj-bOnmo-nfZrw 1 0  0 0   283b   283b
green  open .kibana_1                r65DwNYWSha1v7AW5v62QQ 1 0 23 3 73.7kb 73.7kb

poem2 is gone from the list.

Integrating ES with SpringBoot

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.6/java-rest-high-document-index.html

Add the Maven dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

If the version is not pinned, the Elasticsearch client version that gets pulled in may differ from the server version actually in use, so set it explicitly:

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>

Create the Elasticsearch configuration class

package com.youngj.es.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * Elasticsearch client configuration
 * @author YoungJ
 */
@Configuration
public class ElasticSearchClientConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        // Connects to a single local node; change the host and port to match your environment
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
    }
}

Testing the API

Create the test class

package com.youngj.es.api;

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;

@SpringBootTest
class EsApiApplicationTests {

    // Index used by all of the test methods below
    private static final String INDEX = "youngj_poem";

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    @Test
    void contextLoads() {
    }
}

Create an index

@Test
void testCreateIndex() throws IOException {
    CreateIndexRequest request = new CreateIndexRequest(INDEX);
    CreateIndexResponse indexResponse = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
    System.out.println(indexResponse);
}

Check whether an index exists

/**
 * Check whether the index exists
 * @throws IOException
 */
@Test
void getIndex() throws IOException {
    // Needs: org.elasticsearch.client.indices.GetIndexRequest
    GetIndexRequest request = new GetIndexRequest(INDEX);
    boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}

Delete an index

/**
 * Delete the index
 * @throws IOException
 */
@Test
void delIndex() throws IOException {
    // Needs: org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest,
    //        org.elasticsearch.action.support.master.AcknowledgedResponse
    DeleteIndexRequest request = new DeleteIndexRequest(INDEX);
    AcknowledgedResponse response = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println(response.isAcknowledged());
}

Create a document

/**
 * Create a document
 * @throws IOException
 */
@Test
void addDoc() throws IOException {
    IndexRequest request = new IndexRequest(INDEX);
    request.id("1");
    request.timeout(TimeValue.timeValueSeconds(1));
    // Serialize the Poem object to JSON and use it as the document source
    request.source(JSON.toJSONString(new Poem("行宫", "元稹", "寥落古行宫，宫花寂寞红。白头宫女在，闲坐说玄宗。")), XContentType.JSON);
    IndexResponse indexResponse = restHighLevelClient.index(request, RequestOptions.DEFAULT);
    System.out.println(indexResponse);
    System.out.println(indexResponse.status());
}

IndexResponse[index=youngj_poem,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]
CREATED

Bulk-create documents

/**
 * Bulk-create documents
 * @throws IOException
 */
@Test
void addBatchDoc() throws IOException {
    BulkRequest request = new BulkRequest(INDEX);
    request.timeout(TimeValue.timeValueSeconds(10));
    List<Poem> list = new ArrayList<>();
    list.add(new Poem("行宫", "元稹", "寥落古行宫，宫花寂寞红。白头宫女在，闲坐说玄宗。"));
    list.add(new Poem("新嫁娘词", "王建", "三日入厨下，洗手作羹汤。未谙姑食性，先遣小姑尝。"));
    list.add(new Poem("相思", "王维", "红豆生南国，春来发几枝。愿君多采撷，此物最相思。"));
    list.add(new Poem("杂诗三首·其二", "王维", "君自故乡来，应知故乡事。来日绮窗前，寒梅著花未？"));
    list.add(new Poem("鹿柴", "王维", "空山不见人，但闻人语响。返景入深林，复照青苔上。"));
    list.add(new Poem("芙蓉楼送辛渐", "王昌龄", "寒雨连江夜入吴，平明送客楚山孤。洛阳亲友如相问，一片冰心在玉壶。"));
    list.add(new Poem("江雪", "柳宗元", "千山鸟飞绝，万径人踪灭。孤舟蓑笠翁，独钓寒江雪。"));
    // IDs start at 2 because ID 1 was used by addDoc above
    for (int i = 0; i < list.size(); i++) {
        request.add(new IndexRequest(INDEX)
                .id((i + 2) + "")
                .source(JSON.toJSONString(list.get(i)), XContentType.JSON));
    }
    BulkResponse bulk = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
    System.out.println(bulk.status());
    System.out.println(bulk.hasFailures());
}

Check whether a document exists

/**
 * Check whether the document exists
 * @throws IOException
 */
@Test
void chkDocExist() throws IOException {
    GetRequest request = new GetRequest(INDEX);
    request.id("1");
    boolean exists = restHighLevelClient.exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}

Get a document

/**
 * Get the document
 * @throws IOException
 */
@Test
void getDoc() throws IOException {
    GetRequest request = new GetRequest(INDEX);
    request.id("1");
    GetResponse documentFields = restHighLevelClient.get(request, RequestOptions.DEFAULT);
    System.out.println(JSON.toJSONString(documentFields.getSource()));
}

Result:

{"author":"元稹","title":"行宫","content":"寥落古行宫，宫花寂寞红。白头宫女在，闲坐说玄宗。"}

Update a document

/**
 * Update the document
 * @throws IOException
 */
@Test
void updateDoc() throws IOException {
    UpdateRequest request = new UpdateRequest(INDEX, "1");
    request.timeout(TimeValue.timeValueSeconds(1));
    Poem poem = new Poem("登鹳雀楼", "王之涣", "白日依山尽，黄河入海流。欲穷千里目，更上一层楼。");
    // The fields of the partial document are merged into the existing one
    request.doc(JSON.toJSONString(poem), XContentType.JSON);
    UpdateResponse updateResponse = restHighLevelClient.update(request, RequestOptions.DEFAULT);
    System.out.println(JSON.toJSONString(updateResponse.status()));
    System.out.println(updateResponse.getGetResult());
}

Delete a document

/**
 * Delete the document
 * @throws IOException
 */
@Test
void delDoc() throws IOException {
    DeleteRequest request = new DeleteRequest(INDEX, "2");
    DeleteResponse deleteResponse = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
    System.out.println(deleteResponse.status());
}

Search

/**
 * Search
 * @throws IOException
 */
@Test
void search() throws IOException {
    SearchRequest request = new SearchRequest(INDEX);
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Match documents whose content field contains "三"
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("content", "三");
    SearchSourceBuilder query = sourceBuilder.query(matchQueryBuilder);
    request.source(query);
    SearchResponse search = restHighLevelClient.search(request, RequestOptions.DEFAULT);
    System.out.println(search.status());
    System.out.println(JSON.toJSONString(search));
}

QueryBuilders is the factory used to build the query conditions.

Scraping web pages with Jsoup and writing the data to ES

Add the Maven dependency

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>

Create the HTML parsing utility class

public class HtmlParseUtil {}

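Analyze the page

Inspect the page elements, locate the main body of the Tang poems, and find the corresponding HTML tags.

We find that the element with class="sons" contains 7 divs, which correspond to five-character quatrains (五言绝句), seven-character quatrains (七言绝句), and so on.

Opening the first div, the five-character quatrains, shows that each span holds the title of a poem, wrapped in an a tag that links to the page with the poem itself.

Approach:

Loop over all divs with class="typecont", take the link from the a tag under each span, and request that link to get the poem body.

// Imports needed: org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, org.jsoup.select.Elements, java.net.URL
public void parseHtml() throws Exception {
    // The outermost URL (the index page)
    String wrapUrl = "https://www.gushiwen.org/gushi/tangshi.aspx";
    // Jsoup.parse turns the HTML into a Document object whose methods resemble the DOM API in JavaScript
    Document document = Jsoup.parse(new URL(wrapUrl), 50000);
    Elements elements = document.getElementsByClass("typecont");
    for (int i = 0; i < elements.size(); i++) {
        Element element = elements.get(i);
        Elements spans = element.getElementsByTag("span");
        for (int j = 0; j < spans.size(); j++) {
            Element span = spans.get(j);
            String src = span.getElementsByTag("a").eq(0).attr("href");
            String title = span.getElementsByTag("a").eq(0).text();
            System.out.println("title: " + title + ", src: " + src);
        }
    }
}

public static void main(String[] args) throws Exception {
    new HtmlParseUtil().parseHtml();
}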

With the links in hand, we open one and analyze the HTML of the poem page.

Inspecting the elements shows that, inside class="cont":

the content of the h1 is the title,

the second a tag inside the p tag holds the author,

and the element whose id is "contson" followed by the ID taken from the URL holds the poem text.

public void parseHtml() throws Exception {
    // The outermost URL (the index page)
    String wrapUrl = "https://www.gushiwen.org/gushi/tangshi.aspx";
    // Jsoup.parse turns the HTML into a Document object whose methods resemble the DOM API in JavaScript
    Document document = Jsoup.parse(new URL(wrapUrl), 50000);
    Elements elements = document.getElementsByClass("typecont");
    for (int i = 0; i < elements.size(); i++) {
        Element element = elements.get(i);
        Elements spans = element.getElementsByTag("span");
        for (int j = 0; j < spans.size(); j++) {
            Element span = spans.get(j);
            String src = span.getElementsByTag("a").eq(0).attr("href");
            // Request each URL to get the poem page
            Document sonDoc = Jsoup.parse(new URL(src), 50000);
            // Extract the ID from the URL; it is needed below to locate the poem text
            String id = src.substring(src.indexOf("_") + 1, src.indexOf(".aspx"));
            Element body = sonDoc.getElementById("sonsyuanwen");
            Element cont = body.getElementsByClass("cont").get(0);
            String title = cont.getElementsByTag("h1").eq(0).text();
            String author = cont.getElementsByTag("p").get(0).getElementsByTag("a").eq(1).text();
            String content = cont.getElementById("contson" + id).text();
            System.out.println("title: " + title + ", author: " + author + ", content: " + content);
        }
    }
}
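The ID extraction is the step most likely to break if a link does not have the expected shape, so it can help to isolate it. The sketch below pulls out the same substring logic and adds a guard; the `PoemIdExtractor` class and the sample URL are hypothetical and only illustrate the `..._<id>.aspx` pattern assumed by the code above.

```java
class PoemIdExtractor {
    // Returns the part of the URL between the first '_' and ".aspx".
    static String extractId(String url) {
        int start = url.indexOf('_');
        int end = url.indexOf(".aspx");
        if (start < 0 || end < 0 || end <= start) {
            throw new IllegalArgumentException("Unexpected poem URL: " + url);
        }
        return url.substring(start + 1, end);
    }

    public static void main(String[] args) {
        // Hypothetical URL, used only to show the pattern
        String id = extractId("https://example.org/shiwenv_abc123.aspx");
        System.out.println(id);             // abc123
        System.out.println("contson" + id); // contsonabc123, the element id to look up
    }
}
```

Failing fast on an unexpected link gives a clearer error than the StringIndexOutOfBoundsException that a bare substring call would raise.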

Write the scraped data to ES

public List<Poem> parseHtml() throws Exception {
    String wrapUrl = "https://www.gushiwen.org/gushi/tangshi.aspx";
    Document document = Jsoup.parse(new URL(wrapUrl), 50000);
    Elements elements = document.getElementsByClass("typecont");
    // Collect the parsed poems instead of printing them
    List<Poem> poems = new ArrayList<>();
    for (int i = 0; i < elements.size(); i++) {
        Element element = elements.get(i);
        Elements spans = element.getElementsByTag("span");
        for (int j = 0; j < spans.size(); j++) {
            Element span = spans.get(j);
            String src = span.getElementsByTag("a").eq(0).attr("href");
            Document sonDoc = Jsoup.parse(new URL(src), 50000);
            String id = src.substring(src.indexOf("_") + 1, src.indexOf(".aspx"));
            Element body = sonDoc.getElementById("sonsyuanwen");
            Element cont = body.getElementsByClass("cont").get(0);
            String title = cont.getElementsByTag("h1").eq(0).text();
            String author = cont.getElementsByTag("p").get(0).getElementsByTag("a").eq(1).text();
            String content = cont.getElementById("contson" + id).text();
            poems.add(new Poem(title, author, content));
        }
    }
    return poems;
}

Use the ES bulk API to write the data to ES:

@Test
void insertHtmlParser() throws Exception {
    BulkRequest request = new BulkRequest(INDEX);
    request.timeout(TimeValue.timeValueSeconds(100));
    List<Poem> poems = new HtmlParseUtil().parseHtml();
    // No ID is set, so Elasticsearch generates one for each document
    for (Poem poem : poems) {
        request.add(new IndexRequest(INDEX).source(JSON.toJSONString(poem), XContentType.JSON));
    }
    BulkResponse bulk = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
    System.out.println(bulk.hasFailures());
}

Building the search front end with Vue.js

Include the JS files

axios.min.js (HTTP requests)
vue.min.js

Write the page

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="utf-8"/>
    <title>Classical Poetry Search</title>
    <link rel="stylesheet" th:href="@{/static/css/style.css}"/>
</head>
<body class="pg">
<div class="page" id="app">
    <div id="mallPage" class=" mallist tmall- page-not-market ">
        <div id="header" class=" header-list-app">
            <div class="headerLayout">
                <div class="headerCon ">
                    <div class="header-extra">
                        <!-- Search -->
                        <div id="mallSearch" class="mall-search">
                            <form name="searchTop" class="mallSearch-form clearfix">
                                <fieldset>
                                    <legend>Search</legend>
                                    <div class="mallSearch-input clearfix">
                                        <div class="s-combobox" id="s-combobox-685">
                                            <div class="s-combobox-input-wrap">
                                                <input v-model="keyword" type="text" autocomplete="off" value="dd" id="mq"
                                                       class="s-combobox-input" aria-haspopup="true">
                                            </div>
                                        </div>
                                        <button @click.prevent="searchKey" type="submit" id="searchbtn">Search</button>
                                    </div>
                                </fieldset>
                            </form>
                        </div>
                    </div>
                </div>
            </div>
        </div>
        <div id="content">
            <div class="main">
                <div class="view">
                    <div class="product" v-for="result in results">
                        <div class="product-iWrap">
                            <div style="text-align: center">{{result.title}}</div>
                            <div style="text-align: center">{{result.author}}</div>
                            <!-- v-html renders the highlight markup returned by the back end -->
                            <div style="text-align: left" v-html="result.content"></div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>
<script th:src="@{/static/js/vue.min.js}"></script>
<script th:src="@{/static/js/axios.min.js}"></script>
<script>
    new Vue({
        el: '#app',
        data: {
            keyword: '',
            results: []
        },
        methods: {
            searchKey() {
                var keyword = this.keyword;
                console.log(keyword);
                // Request page 1 with up to 100 results
                axios.get("/search/" + keyword + "/1/100").then(res => {
                    console.log(res.data);
                    this.results = res.data;
                });
            }
        }
    });
</script>
</body>
</html>

Write the back end

IndexController

package com.youngj.es.controller;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;

/**
 * Serves the search page.
 *
 * @author YoungJ
 * @date 2020-09-12 15:15
 */
@Controller
public class IndexController {

    @GetMapping({"/", "/index"})
    public String index() {
        return "index";
    }
}

SearchController

package com.youngj.es.controller;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/**
 * Search endpoint used by the Vue front end.
 *
 * @author YoungJ
 * @date 2020-09-12 15:46
 */
@RestController
public class SearchController {

    private static final String INDEX = "poem";

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    @GetMapping("/search/{keyword}/{pageNo}/{pageSize}")
    public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                            @PathVariable("pageNo") int pageNo,
                                            @PathVariable("pageSize") int pageSize) throws Exception {
        SearchRequest searchRequest = new SearchRequest(INDEX);
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

        // Wrap matched terms in the content field with a red span
        HighlightBuilder highlightBuilder = new HighlightBuilder()
                .requireFieldMatch(false)
                .field("content")
                .preTags("<span style='color: red'>")
                .postTags("</span>");
        sourceBuilder.highlighter(highlightBuilder);

        // Pagination: "from" is a zero-based offset, so convert the 1-based page number
        sourceBuilder.from((pageNo - 1) * pageSize);
        sourceBuilder.size(pageSize);

        TermQueryBuilder termQueryBuilder = new TermQueryBuilder("content", keyword);
        sourceBuilder.query(termQueryBuilder);
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        searchRequest.source(sourceBuilder);

        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        List<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit hit : searchResponse.getHits().getHits()) {
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            HighlightField content = highlightFields.get("content");
            if (content != null) {
                // Replace the original content with the highlighted fragments joined together
                StringBuilder newCon = new StringBuilder();
                for (Text text : content.getFragments()) {
                    newCon.append(text);
                }
                sourceAsMap.put("content", newCon.toString());
            }
            list.add(sourceAsMap);
        }
        return list;
    }
}

Final result

Searching for poems that contain "白".

Searching for poems that contain "夜".
