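Contents

  • Service installation
    • Download and install Elasticsearch
    • Install the elasticsearch-head plugin
    • Create an index
    • Install Kibana
    • Install the IK analyzer
  • Basic Elasticsearch operations
    • Operation overview
    • Common operations
      • Default field types
      • Explicit field types (defining the index mapping)
    • Queries
      • Basic query
      • Query by condition
      • Return selected fields
      • Sorting
      • Pagination
      • Multi-condition queries
      • Range queries
      • Highlighting
    • Update
    • Delete
  • Integrating ES with Spring Boot
    • Add the Maven dependency
    • Create the Elasticsearch configuration class
    • Testing the APIs
      • Create the test class
      • Create an index
      • Check whether an index exists
      • Delete an index
      • Create a document
      • Bulk-create documents
      • Check whether a document exists
      • Get a document
      • Update a document
      • Delete a document
      • Search
  • Scraping web data with Jsoup and writing it to ES
    • Add the Maven dependency
    • Create an HTML parsing utility class
    • Analyze the page
    • Write the scraped data to ES
  • Building the search front end with Vue.js
    • Include the JS files
    • The page
    • The back end
  • Final result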

Service installation

Download and install Elasticsearch

Install elasticsearch:7.6.1 with Docker

docker pull elasticsearch:7.6.1

mkdir -p /Users/szcl/mydata/elasticsearch/config
mkdir -p /Users/szcl/mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >> /Users/szcl/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /Users/szcl/mydata/elasticsearch/
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e ES_JAVA_OPTS="-Xms64m -Xmx128m" \
  -v /Users/szcl/mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
  -v /Users/szcl/mydata/elasticsearch/data:/usr/share/elasticsearch/data \
  -v /Users/szcl/mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
  -d elasticsearch:7.6.1

Check whether the container started:

docker ps -a

If it did not start, inspect the logs (substitute your own container ID):

docker logs -f b016c22606e1

Then visit port 9200 on the server in a browser:

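Install the elasticsearch-head plugin

docker pull mobz/elasticsearch-head:5
docker run -d -p 9100:9100 docker.io/mobz/elasticsearch-head:5

Once it is running, open it in a browser:

On a fresh install, head's requests to Elasticsearch may be rejected as cross-origin. Fix this by changing the Elasticsearch configuration in one of two ways:

  • Edit the Elasticsearch config file mounted on the host

    cd /mydata/elasticsearch/config
    vim elasticsearch.yml

    Add the following to the config

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    Restart the container

    docker restart b016c22606e1

  • Enter the container and edit the config there

    docker exec -it b016c22606e1 /bin/bash
    cd ./config
    vim elasticsearch.yml

    Add the following to the config

    http.cors.enabled: true
    http.cors.allow-origin: "*"

    Restart the container

    docker restart b016c22606e1
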

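Create an index

Clicking OK in head does nothing, so check the browser console.

The request returns a 406 error; open it to see the details.

It turns out that the x-www-form-urlencoded content type is not supported.

Fix:

  • Enter the head container

    docker exec -it 62c5c56241ae /bin/bash

  • Go to the _site folder

  • Edit vendor.js

    vim vendor.js

    • Either copy the file from the container to the host and edit it there

      Reference: https://blog.csdn.net/zhaoyajie1011/article/details/98610002

    • Or install vim inside the container

      apt-get update
      apt-get install vim

  • Make the following changes

    contentType: "application/x-www-form-urlencoded"
    change to:
    contentType: "application/json;charset=UTF-8"

    var inspectData = s.contentType === "application/x-www-form-urlencoded"
    change to:
    var inspectData = s.contentType === "application/json;charset=UTF-8"

  • Restart the container

The index is now created successfully. However, head is mainly a tool for displaying data and is not suited to complex queries, so for querying it is better to install the more capable Kibana.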

Install Kibana

  • Pull the image with Docker

    docker pull kibana:7.6.1

  • Start the container

    docker run --name kibana -e ELASTICSEARCH_HOSTS=http://IP:9200 -p 5601:5601 -d kibana:7.6.1

  • Change the configuration

    Here I copy the file from the container to the host and edit it there

    docker cp 970f63f0babb:/usr/share/kibana/config/kibana.yml /mydata/kibana/config/

    Edit it directly on the host

    vim kibana.yml

    Set the following:

    server.name: kibana
    server.port: 5601
    server.host: "0.0.0.0"
    elasticsearch.hosts: [ "http://IP:9200" ]
    i18n.locale: "zh-CN"
    xpack.monitoring.ui.container.elasticsearch.enabled: true

    Copy the modified config back into the container

    docker cp /mydata/kibana/config/kibana.yml 970f63f0babb:/usr/share/kibana/config/

  • Restart the container

    docker restart 970f63f0babb

  • Open port 5601 in a browser

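Install the IK analyzer

  • Enter the elasticsearch container

    docker exec -it 98d725e6291e /bin/bash

  • Install the plugin

    elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.1/elasticsearch-analysis-ik-7.6.1.zip

  • Restart all containers

  • Test the analyzer

    • Open the Kibana console at http://localhost:5601/

    • Find Dev Tools in the sidebar

    • Test ik_max_word (finest-grained segmentation)

      POST _analyze
      {
        "analyzer": "ik_max_word",
        "text": "中国共产党"
      }

    • Test ik_smart (coarsest segmentation)

      POST _analyze
      {
        "analyzer": "ik_smart",
        "text": "中国共产党"
      }

  • Custom dictionary

    • For example, when analyzing "我爱赵亚杰", both ik_smart and ik_max_word split the name 赵亚杰 into single characters

    • This is where a custom dictionary is needed. Enter the container and open the IK analyzer configuration

      docker exec -it 98d725e6291e /bin/bash
      cd config/analysis-ik/
      vi IKAnalyzer.cfg.xml

      Set your own dictionary file in <entry key="ext_dict"></entry>

      <entry key="ext_dict">my.dic</entry>

      Save, then create the my.dic dictionary

      vi my.dic

      Enter 赵亚杰 in my.dic and save

    • Restart the elasticsearch container

    • Test the custom dictionary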

Basic Elasticsearch operations

Operation overview

Operation                        Method  URL
Create document (given ID)       PUT     localhost:9200/{index}/{type}/{id}
Create document (generated ID)   POST    localhost:9200/{index}/{type}
Update document                  POST    localhost:9200/{index}/{type}/{id}/_update
Delete document                  DELETE  localhost:9200/{index}/{type}/{id}
Get document (by ID)             GET     localhost:9200/{index}/{type}/{id}
Query all data                   POST    localhost:9200/{index}/{type}/_search
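
The table above uses the typed URL form (index/type/id). Types are deprecated in Elasticsearch 7.x, where the typeless forms built on `_doc` and `_update` are preferred. As a rough sketch, the helper below builds the typeless paths for the same operations. The class `EsPaths` and its method names are hypothetical and are not part of any Elasticsearch library.

```java
// Hypothetical helper (not part of any Elasticsearch library): builds the
// typeless REST paths that Elasticsearch 7.x prefers over index/type/id URLs.
final class EsPaths {
    private EsPaths() {}

    // Used with PUT (create), GET (read) and DELETE: /{index}/_doc/{id}
    static String doc(String index, String id) {
        return "/" + index + "/_doc/" + id;
    }

    // Used with POST, Elasticsearch generates the ID: /{index}/_doc
    static String docAutoId(String index) {
        return "/" + index + "/_doc";
    }

    // Used with POST for partial updates: /{index}/_update/{id}
    static String update(String index, String id) {
        return "/" + index + "/_update/" + id;
    }

    // Used with GET or POST: /{index}/_search
    static String search(String index) {
        return "/" + index + "/_search";
    }

    public static void main(String[] args) {
        System.out.println("PUT    " + doc("poem", "1"));
        System.out.println("POST   " + docAutoId("poem"));
        System.out.println("POST   " + update("poem", "1"));
        System.out.println("DELETE " + doc("poem", "1"));
        System.out.println("GET    " + search("poem"));
    }
}
```

The index and ID are concatenated as-is, so this sketch assumes they contain no characters that need URL encoding.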

Common operations

  • Check cluster health

    GET _cat/health

    1599732945 10:15:45 elasticsearch yellow 1 1 5 5 0 0 2 0 - 71.4%

  • List the indices via _cat

    GET _cat/indices

    yellow open poem                     tWco8rUWQCS1YuMtkrCl4A 1 1  1 0  5.1kb  5.1kb
    green  open .kibana_task_manager_1   1SxsVdvgSZOOQ3X9wKXJzQ 1 0  2 1 16.2kb 16.2kb
    yellow open poem2                    xWMF79GYTaKco1Ljo2SmrA 1 1  0 0   283b   283b
    green  open .apm-agent-configuration UGRU7tD0Tj-bOnmo-nfZrw 1 0  0 0   283b   283b
    green  open .kibana_1                r65DwNYWSha1v7AW5v62QQ 1 0 20 6   48kb   48kb

    The _cat API exposes a lot of other information as well.

Create an index

Default field types

PUT /poem/poem/1
{
  "title": "相思",
  "author": "王维",
  "content": "红豆生南国,春来发几枝。愿君多采撷,此物最相思。"
}

Run it.

View the index in the elasticsearch-head plugin.

Use the data browser to view the document content.

Explicit field types (defining the index mapping)

PUT /poem2
{
  "mappings": {
    "properties": {
      "title": {"type": "text"},
      "date": {"type": "date"},
      "content": {"type": "text"}
    }
  }
}

View it in the head plugin.

Queries

Basic query

GET /poem/_doc/1
or
GET /poem/poem/1

This queries the index poem. _doc is the default type (types are removed in Elasticsearch 8.x), and 1 is the ID of the document to fetch.

{
  "_index": "poem",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "title": "相思",
    "author": "王维",
    "content": "红豆生南国,春来发几枝。愿君多采撷,此物最相思。"
  }
}

Query by condition

Documents whose content contains "一":

GET /poem/_search?q=content:一

{
  "took": 7,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "max_score": 0.74386525,
    "hits": [
      {
        "_index": "poem", "_type": "poem", "_id": "2", "_score": 0.74386525,
        "_source": {
          "title": "登鹳雀楼",
          "author": "王之涣",
          "content": "白日依山尽,黄河入海流。欲穷千里目,更上一层楼。"
        }
      },
      {
        "_index": "poem", "_type": "poem", "_id": "3", "_score": 0.6489038,
        "_source": {
          "title": "九月九日忆山东兄弟",
          "author": "王维",
          "content": "独在异乡为异客,每逢佳节倍思亲。遥知兄弟登高处,遍插茱萸少一人。"
        }
      }
    ]
  }
}

Whether this behaves as a partial (fuzzy) match depends on the field type defined in the index mapping: a text field is analyzed (tokenized), whereas a keyword field is not.

Return selected fields

GET /poem/poem/_search
{
  "query": {"match": {"content": "一"}},
  "_source": ["title", "content"]
}

A match query runs the query text through the analyzer.

{
  "took": 7,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "max_score": 0.74386525,
    "hits": [
      {
        "_index": "poem", "_type": "poem", "_id": "2", "_score": 0.74386525,
        "_source": {
          "title": "登鹳雀楼",
          "content": "白日依山尽,黄河入海流。欲穷千里目,更上一层楼。"
        }
      },
      {
        "_index": "poem", "_type": "poem", "_id": "3", "_score": 0.6489038,
        "_source": {
          "title": "九月九日忆山东兄弟",
          "content": "独在异乡为异客,每逢佳节倍思亲。遥知兄弟登高处,遍插茱萸少一人。"
        }
      }
    ]
  }
}

Sorting

GET /poem/poem/_search
{
  "query": {"match": {"content": "一"}},
  "_source": ["title", "content", "date"],
  "sort": [{"date": {"order": "asc"}}]
}

{
  "took": 5,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 2, "relation": "eq"},
    "max_score": null,
    "hits": [
      {
        "_index": "poem", "_type": "poem", "_id": "2", "_score": null,
        "_source": {
          "date": "2020-09-10",
          "title": "登鹳雀楼",
          "content": "白日依山尽,黄河入海流。欲穷千里目,更上一层楼。"
        },
        "sort": [1599696000000]
      },
      {
        "_index": "poem", "_type": "poem", "_id": "3", "_score": null,
        "_source": {
          "date": "2020-09-11",
          "title": "九月九日忆山东兄弟",
          "content": "独在异乡为异客,每逢佳节倍思亲。遥知兄弟登高处,遍插茱萸少一人。"
        },
        "sort": [1599782400000]
      }
    ]
  }
}

Pagination

GET /poem/poem/_search
{
  "query": {"match": {"content": "一"}},
  "_source": ["title", "content", "date"],
  "sort": [{"date": {"order": "asc"}}],
  "from": 0,
  "size": 1
}

from: the 0-based offset of the first hit to return;

size: the number of hits to return
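
Because from is a hit offset rather than a page number, a caller that thinks in pages has to convert. A minimal sketch of that conversion, assuming 1-based page numbers (the class and method names are hypothetical):

```java
// Hypothetical helper: converts a 1-based page number into the 0-based
// "from" offset expected by the search API.
final class PageOffset {
    private PageOffset() {}

    static int from(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(pageNo - 1, pageSize);
    }

    public static void main(String[] args) {
        System.out.println(from(1, 10)); // prints 0: the first page starts at the first hit
        System.out.println(from(3, 10)); // prints 20: the third page skips two pages of 10
    }
}
```

Note that with default index settings Elasticsearch rejects requests where from + size exceeds 10000 (index.max_result_window), so this style of paging is only suitable for the first pages of a result set.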

Multi-condition queries

GET /poem/poem/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"author": "王维"}},
        {"match": {"date": "2020-09-11"}}
      ]
    }
  }
}
or
GET /poem/poem/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"author": "王维"}},
        {"match": {"date": "2020-09-12"}}
      ]
    }
  }
}

must corresponds to AND in MySQL

must_not corresponds to NOT in MySQL

should corresponds to OR in MySQL
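
To make the AND / NOT / OR analogy concrete, here is a simplified model of how a bool query decides whether a document matches. It is only a sketch: it ignores scoring, omits the filter clause (which behaves like must without contributing to the score), and the class `BoolMatch` is hypothetical rather than an Elasticsearch API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Simplified, hypothetical model of bool-query matching (scoring is ignored).
final class BoolMatch {
    private BoolMatch() {}

    static boolean matches(Map<String, String> doc,
                           List<Predicate<Map<String, String>>> must,
                           List<Predicate<Map<String, String>>> mustNot,
                           List<Predicate<Map<String, String>>> should) {
        // must: every clause has to match (AND)
        boolean allMust = must.stream().allMatch(p -> p.test(doc));
        // must_not: no clause may match (NOT)
        boolean noneMustNot = mustNot.stream().noneMatch(p -> p.test(doc));
        // should: when there is no must clause, at least one should clause has
        // to match (OR); alongside must, should clauses only affect the score.
        boolean shouldOk = should.isEmpty() || !must.isEmpty()
                || should.stream().anyMatch(p -> p.test(doc));
        return allMust && noneMustNot && shouldOk;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new HashMap<>();
        doc.put("author", "王维");
        doc.put("date", "2020-09-11");

        Predicate<Map<String, String>> isWangWei = d -> "王维".equals(d.get("author"));
        Predicate<Map<String, String>> onSep12 = d -> "2020-09-12".equals(d.get("date"));

        // must with both conditions: the date differs, so the document is rejected
        System.out.println(matches(doc, Arrays.asList(isWangWei, onSep12),
                Collections.emptyList(), Collections.emptyList())); // false
        // should with both conditions: the author matches, so the document is accepted
        System.out.println(matches(doc, Collections.emptyList(),
                Collections.emptyList(), Arrays.asList(isWangWei, onSep12))); // true
    }
}
```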

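A single match query can also take several terms separated by spaces; by default a document matching any one of them is returned

GET /poem/poem/_search
{
  "query": {"match": {"content": "三 一"}}
}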

Range queries

GET /poem/poem/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"author": "王维"}}
      ],
      "filter": {
        "range": {"index": {"gte": 1, "lt": 3}}
      }
    }
  }
}

gt: greater than; gte: greater than or equal to; lt: less than; lte: less than or equal to

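Highlighting

GET /poem/poem/_search
{
  "query": {"match": {"content": "一"}},
  "highlight": {
    "pre_tags": "<span style='color: red'>",
    "post_tags": "</span>",
    "fields": {"content": {}}
  }
}

Highlighting is requested with the highlight keyword; pre_tags and post_tags wrap each matched fragment.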

Update

POST /poem/_doc/1/_update
{
  "doc": {"date": "2020-09-10"}
}

{
  "_index": "poem",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "_seq_no": 1,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "title": "相思",
    "author": "王维",
    "content": "红豆生南国,春来发几枝。愿君多采撷,此物最相思。",
    "date": "2020-09-10"
  }
}

The version is incremented on every update.

Delete

DELETE /poem2/_doc/1    (deletes the specified document)
or
DELETE /poem2           (deletes the whole index)

{
  "_index": "poem2",
  "_type": "_doc",
  "_id": "1",
  "_version": 3,
  "result": "not_found",
  "_shards": {"total": 2, "successful": 1, "failed": 0},
  "_seq_no": 3,
  "_primary_term": 1
}

{
  "acknowledged": true
}

List all indices with GET _cat/indices

yellow open poem                     tWco8rUWQCS1YuMtkrCl4A 1 1  1 0 12.2kb 12.2kb
green  open .kibana_task_manager_1   1SxsVdvgSZOOQ3X9wKXJzQ 1 0  2 1 16.2kb 16.2kb
green  open .apm-agent-configuration UGRU7tD0Tj-bOnmo-nfZrw 1 0  0 0   283b   283b
green  open .kibana_1                r65DwNYWSha1v7AW5v62QQ 1 0 23 3 73.7kb 73.7kb

poem2 is no longer listed, so it has been deleted.

Integrating ES with Spring Boot

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.6/java-rest-high-document-index.html

Add the Maven dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

If no version is specified, the Elasticsearch client version that Spring Boot pulls in may not match the version of the server actually in use, so pin it explicitly:

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>

Create the Elasticsearch configuration class

package com.youngj.es.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * Elasticsearch configuration.
 * @author YoungJ
 */
@Configuration
public class ElasticSearchClientConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        // Connects to a single local node over HTTP
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
        return client;
    }
}

Testing the APIs

Create the test class

package com.youngj.es.api;

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;

@SpringBootTest
class EsApiApplicationTests {

    private static final String INDEX = "youngj_poem";

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    @Test
    void contextLoads() {
    }
}

Create an index

@Test
void testCreateIndex() throws IOException {
    CreateIndexRequest request = new CreateIndexRequest(INDEX);
    CreateIndexResponse indexResponse = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
    System.out.println(indexResponse);
}

Check whether an index exists

// Needs: org.elasticsearch.client.indices.GetIndexRequest
/**
 * Check whether the index exists.
 * @throws IOException
 */
@Test
void getIndex() throws IOException {
    GetIndexRequest request = new GetIndexRequest(INDEX);
    boolean exists = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}

Delete an index

// Needs: org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest,
//        org.elasticsearch.action.support.master.AcknowledgedResponse
/**
 * Delete the index.
 * @throws IOException
 */
@Test
void delIndex() throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest(INDEX);
    AcknowledgedResponse response = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println(response.isAcknowledged());
}

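Create a document

// Needs: org.elasticsearch.action.index.IndexRequest, org.elasticsearch.action.index.IndexResponse,
//        org.elasticsearch.common.unit.TimeValue, org.elasticsearch.common.xcontent.XContentType,
//        and a JSON utility class providing JSON.toJSONString
/**
 * Create a document.
 * @throws IOException
 */
@Test
void addDoc() throws IOException {
    IndexRequest request = new IndexRequest(INDEX);
    request.id("1");
    request.timeout(TimeValue.timeValueSeconds(1));
    // Serialize the Poem to JSON and use it as the document source
    request.source(JSON.toJSONString(new Poem("行宫", "元稹", "寥落古行宫,宫花寂寞红。白头宫女在,闲坐说玄宗。")), XContentType.JSON);
    IndexResponse indexResponse = restHighLevelClient.index(request, RequestOptions.DEFAULT);
    System.out.println(indexResponse);
    System.out.println(indexResponse.status());
}

Output:

IndexResponse[index=youngj_poem,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]
CREATED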
Bulk-create documents

// Needs: org.elasticsearch.action.bulk.BulkRequest, org.elasticsearch.action.bulk.BulkResponse,
//        java.util.ArrayList, java.util.List
/**
 * Bulk-create documents.
 * @throws IOException
 */
@Test
void addBatchDoc() throws IOException {
    BulkRequest request = new BulkRequest(INDEX);
    request.timeout(TimeValue.timeValueSeconds(10));
    List<Poem> list = new ArrayList<>();
    list.add(new Poem("行宫", "元稹", "寥落古行宫,宫花寂寞红。白头宫女在,闲坐说玄宗。"));
    list.add(new Poem("新嫁娘词", "王建", "三日入厨下,洗手作羹汤。未谙姑食性,先遣小姑尝。"));
    list.add(new Poem("相思", "王维", "红豆生南国,春来发几枝。愿君多采撷,此物最相思。"));
    list.add(new Poem("杂诗三首·其二", "王维", "君自故乡来,应知故乡事。来日绮窗前,寒梅著花未?"));
    list.add(new Poem("鹿柴", "王维", "空山不见人,但闻人语响。返景入深林,复照青苔上。"));
    list.add(new Poem("芙蓉楼送辛渐", "王昌龄", "寒雨连江夜入吴,平明送客楚山孤。洛阳亲友如相问,一片冰心在玉壶。"));
    list.add(new Poem("江雪", "柳宗元", "千山鸟飞绝,万径人踪灭。孤舟蓑笠翁,独钓寒江雪。"));
    // IDs start at 2 because ID 1 was used by the single-document test
    for (int i = 0; i < list.size(); i++) {
        request.add(new IndexRequest(INDEX)
                .id((i + 2) + "")
                .source(JSON.toJSONString(list.get(i)), XContentType.JSON));
    }
    BulkResponse bulk = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
    System.out.println(bulk.status());
    System.out.println(bulk.hasFailures());
}
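
On the wire, a bulk request reaches the _bulk REST endpoint as newline-delimited JSON: each index action is one line of metadata followed by one line containing the document source, and the body ends with a newline. As an illustration of that format only, the sketch below builds such a body by hand; the class `BulkBody` is hypothetical, and it assumes the documents are already serialized as single-line JSON and that the index name needs no escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the NDJSON body accepted by the _bulk endpoint.
final class BulkBody {
    private BulkBody() {}

    // Builds "index" actions with sequential IDs starting at firstId.
    // Each document in jsonDocs must already be valid single-line JSON.
    static String indexActions(String index, int firstId, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < jsonDocs.size(); i++) {
            // Action line: which index and ID the next line should be stored under
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(firstId + i).append("\"}}\n");
            // Source line: the document itself, terminated by a newline
            body.append(jsonDocs.get(i)).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "{\"title\":\"行宫\",\"author\":\"元稹\"}",
                "{\"title\":\"江雪\",\"author\":\"柳宗元\"}");
        System.out.print(indexActions("youngj_poem", 2, docs));
    }
}
```

In the article's setup the RestHighLevelClient produces this body itself, so the sketch is only meant to show what BulkRequest represents.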

Check whether a document exists

// Needs: org.elasticsearch.action.get.GetRequest
/**
 * Check whether the document exists.
 * @throws IOException
 */
@Test
void chkDocExist() throws IOException {
    GetRequest request = new GetRequest(INDEX);
    request.id("1");
    boolean exists = restHighLevelClient.exists(request, RequestOptions.DEFAULT);
    System.out.println(exists);
}

Get a document

// Needs: org.elasticsearch.action.get.GetRequest, org.elasticsearch.action.get.GetResponse
/**
 * Get a document.
 * @throws IOException
 */
@Test
void getDoc() throws IOException {
    GetRequest request = new GetRequest(INDEX);
    request.id("1");
    GetResponse documentFields = restHighLevelClient.get(request, RequestOptions.DEFAULT);
    System.out.println(JSON.toJSONString(documentFields.getSource()));
}

Result:

{"author":"元稹","title":"行宫","content":"寥落古行宫,宫花寂寞红。白头宫女在,闲坐说玄宗。"}

Update a document

// Needs: org.elasticsearch.action.update.UpdateRequest, org.elasticsearch.action.update.UpdateResponse
/**
 * Update a document.
 * @throws IOException
 */
@Test
void updateDoc() throws IOException {
    UpdateRequest request = new UpdateRequest(INDEX, "1");
    request.timeout(TimeValue.timeValueSeconds(1));
    Poem poem = new Poem("登鹳雀楼", "王之涣", "白日依山尽,黄河入海流。欲穷千里目,更上一层楼。");
    // The serialized Poem is merged into the existing document as a partial update
    request.doc(JSON.toJSONString(poem), XContentType.JSON);
    UpdateResponse updateResponse = restHighLevelClient.update(request, RequestOptions.DEFAULT);
    System.out.println(JSON.toJSONString(updateResponse.status()));
    System.out.println(updateResponse.getGetResult());
}

Delete a document

// Needs: org.elasticsearch.action.delete.DeleteRequest, org.elasticsearch.action.delete.DeleteResponse
/**
 * Delete a document.
 * @throws IOException
 */
@Test
void delDoc() throws IOException {
    DeleteRequest request = new DeleteRequest(INDEX, "2");
    DeleteResponse deleteResponse = restHighLevelClient.delete(request, RequestOptions.DEFAULT);
    System.out.println(deleteResponse.status());
}

Search

// Needs: org.elasticsearch.action.search.SearchRequest, org.elasticsearch.action.search.SearchResponse,
//        org.elasticsearch.search.builder.SearchSourceBuilder,
//        org.elasticsearch.index.query.MatchQueryBuilder, org.elasticsearch.index.query.QueryBuilders
/**
 * Search.
 * @throws IOException
 */
@Test
void search() throws IOException {
    SearchRequest request = new SearchRequest(INDEX);
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Match documents whose content contains "三"
    MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("content", "三");
    SearchSourceBuilder query = sourceBuilder.query(matchQueryBuilder);
    request.source(query);
    SearchResponse search = restHighLevelClient.search(request, RequestOptions.DEFAULT);
    System.out.println(search.status());
    System.out.println(JSON.toJSONString(search));
}

QueryBuilders is the factory used to build query conditions.

Scraping web data with Jsoup and writing it to ES

Add the Maven dependency

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>

Create an HTML parsing utility class

public class HtmlParseUtil {
}

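Analyze the page

Inspect the page elements to locate the main body of the Tang poems and the corresponding HTML tags.

The element with class="sons" contains 7 divs, corresponding to five-character quatrains (五言绝句), seven-character quatrains (七言绝句), and so on.

Opening the first div (five-character quatrains) shows that each span holds the title of one poem, wrapped in an a tag that links to the poem page.

Approach:

Loop over all divs with class="typecont", take the link of the a tag under each span, then request that link to obtain the poem itself.

// Needs: java.net.URL, org.jsoup.Jsoup, org.jsoup.nodes.Document,
//        org.jsoup.nodes.Element, org.jsoup.select.Elements
public void parseHtml() throws Exception {
    // Outermost URL
    String wrapUrl = "https://www.gushiwen.org/gushi/tangshi.aspx";
    // Jsoup.parse turns the HTML into a Document object whose methods can be used much like the DOM in JavaScript
    Document document = Jsoup.parse(new URL(wrapUrl), 50000);
    Elements elements = document.getElementsByClass("typecont");
    for (int i = 0; i < elements.size(); i++) {
        Element element = elements.get(i);
        Elements spans = element.getElementsByTag("span");
        for (int j = 0; j < spans.size(); j++) {
            Element span = spans.get(j);
            String src = span.getElementsByTag("a").eq(0).attr("href");
            String title = span.getElementsByTag("a").eq(0).text();
            System.out.println("title: " + title + ", src: " + src);
        }
    }
}

public static void main(String[] args) throws Exception {
    new HtmlParseUtil().parseHtml();
}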

With the links in hand, open one and analyze the HTML of the poem page.

Inspecting the elements shows that, inside the element with class="cont":

the h1 holds the title,

the second a tag inside the p tag holds the author,

the element whose id is contson followed by the ID taken from the URL holds the poem text.

public void parseHtml() throws Exception {
    // Outermost URL
    String wrapUrl = "https://www.gushiwen.org/gushi/tangshi.aspx";
    // Jsoup.parse turns the HTML into a Document object whose methods can be used much like the DOM in JavaScript
    Document document = Jsoup.parse(new URL(wrapUrl), 50000);
    Elements elements = document.getElementsByClass("typecont");
    for (int i = 0; i < elements.size(); i++) {
        Element element = elements.get(i);
        Elements spans = element.getElementsByTag("span");
        for (int j = 0; j < spans.size(); j++) {
            Element span = spans.get(j);
            String src = span.getElementsByTag("a").eq(0).attr("href");
            // Request each URL to get the poem page
            Document sonDoc = Jsoup.parse(new URL(src), 50000);
            // Extract the ID from the URL; it is needed below to locate the poem text
            String id = src.substring(src.indexOf("_") + 1, src.indexOf(".aspx"));
            Element body = sonDoc.getElementById("sonsyuanwen");
            Element cont = body.getElementsByClass("cont").get(0);
            String title = cont.getElementsByTag("h1").eq(0).text();
            String author = cont.getElementsByTag("p").get(0).getElementsByTag("a").eq(1).text();
            String content = cont.getElementById("contson" + id).text();
            System.out.println("title: " + title + ", author: " + author + ", content: " + content);
        }
    }
}
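
The substring call above throws if a link contains no underscore or does not end in .aspx, because indexOf then returns -1. A slightly more defensive variant of the same extraction could look like this; the class name and the example URLs are hypothetical.

```java
import java.util.Optional;

// Hypothetical helper: extracts the ID between the last '_' and ".aspx" in a
// poem page URL, returning empty instead of throwing when the pattern is absent.
final class PoemUrl {
    private PoemUrl() {}

    static Optional<String> extractId(String url) {
        int end = url.lastIndexOf(".aspx");
        // Search backwards from ".aspx" for the underscore that precedes the ID
        int start = (end < 0) ? -1 : url.lastIndexOf('_', end);
        if (start < 0 || start + 1 >= end) {
            return Optional.empty();
        }
        return Optional.of(url.substring(start + 1, end));
    }

    public static void main(String[] args) {
        // Example URLs are made up for illustration
        System.out.println(extractId("https://example.com/shiwenv_abc123.aspx")); // Optional[abc123]
        System.out.println(extractId("https://example.com/index.html"));          // Optional.empty
    }
}
```

The scraper could then skip links for which extractId returns an empty Optional instead of aborting the whole run.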

Write the scraped data to ES

public List<Poem> parseHtml() throws Exception {
    String wrapUrl = "https://www.gushiwen.org/gushi/tangshi.aspx";
    Document document = Jsoup.parse(new URL(wrapUrl), 50000);
    Elements elements = document.getElementsByClass("typecont");
    // Collect every scraped poem so the caller can index them
    List<Poem> poems = new ArrayList<>();
    for (int i = 0; i < elements.size(); i++) {
        Element element = elements.get(i);
        Elements spans = element.getElementsByTag("span");
        for (int j = 0; j < spans.size(); j++) {
            Element span = spans.get(j);
            String src = span.getElementsByTag("a").eq(0).attr("href");
            Document sonDoc = Jsoup.parse(new URL(src), 50000);
            String id = src.substring(src.indexOf("_") + 1, src.indexOf(".aspx"));
            Element body = sonDoc.getElementById("sonsyuanwen");
            Element cont = body.getElementsByClass("cont").get(0);
            String title = cont.getElementsByTag("h1").eq(0).text();
            String author = cont.getElementsByTag("p").get(0).getElementsByTag("a").eq(1).text();
            String content = cont.getElementById("contson" + id).text();
            poems.add(new Poem(title, author, content));
        }
    }
    return poems;
}

Use the ES bulk API to write the data to ES

@Test
void insertHtmlParser() throws Exception {
    BulkRequest request = new BulkRequest(INDEX);
    request.timeout(TimeValue.timeValueSeconds(100));
    List<Poem> poems = new HtmlParseUtil().parseHtml();
    // No ID is set, so Elasticsearch generates one for each document
    for (Poem poem : poems) {
        request.add(new IndexRequest(INDEX).source(JSON.toJSONString(poem), XContentType.JSON));
    }
    BulkResponse bulk = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
    System.out.println(bulk.hasFailures());
}

Building the search front end with Vue.js

Include the JS files

  • axios.min.js (HTTP requests)

  • vue.min.js

The page

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="utf-8"/>
    <title>Classical Poetry Search</title>
    <link rel="stylesheet" th:href="@{/static/css/style.css}"/>
</head>
<body class="pg">
<div class="page" id="app">
    <div id="mallPage" class=" mallist tmall- page-not-market ">
        <div id="header" class=" header-list-app">
            <div class="headerLayout">
                <div class="headerCon ">
                    <div class="header-extra">
                        <!-- Search -->
                        <div id="mallSearch" class="mall-search">
                            <form name="searchTop" class="mallSearch-form clearfix">
                                <fieldset>
                                    <legend>Search</legend>
                                    <div class="mallSearch-input clearfix">
                                        <div class="s-combobox" id="s-combobox-685">
                                            <div class="s-combobox-input-wrap">
                                                <input v-model="keyword" type="text" autocomplete="off" value="dd" id="mq" class="s-combobox-input" aria-haspopup="true">
                                            </div>
                                        </div>
                                        <button @click.prevent="searchKey" type="submit" id="searchbtn">Search</button>
                                    </div>
                                </fieldset>
                            </form>
                        </div>
                    </div>
                </div>
            </div>
        </div>
        <div id="content">
            <div class="main">
                <div class="view">
                    <div class="product" v-for="result in results">
                        <div class="product-iWrap">
                            <div style="text-align: center">{{result.title}}</div>
                            <div style="text-align: center">{{result.author}}</div>
                            <div style="text-align: left" v-html="result.content"></div>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>
<script th:src="@{/static/js/vue.min.js}"></script>
<script th:src="@{/static/js/axios.min.js}"></script>
<script>
    new Vue({
        el: '#app',
        data: {
            keyword: '',
            results: []
        },
        methods: {
            searchKey() {
                var keyword = this.keyword;
                console.log(keyword);
                // Request page 1 with up to 100 results for the keyword
                axios.get("/search/" + keyword + "/1/100").then(res => {
                    console.log(res.data);
                    this.results = res.data;
                });
            }
        }
    });
</script>
</body>
</html>

The back end

IndexController

package com.youngj.es.controller;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;

/**
 * Serves the search page.
 *
 * @author YoungJ
 * @date 2020-09-12 15:15
 */
@Controller
public class IndexController {

    @GetMapping({"/", "/index"})
    public String index() {
        return "index";
    }
}

SearchController

package com.youngj.es.controller;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/**
 * Search endpoint used by the Vue page.
 *
 * @author YoungJ
 * @date 2020-09-12 15:46
 */
@RestController
public class SearchController {

    private static final String INDEX = "poem";

    @Autowired
    private RestHighLevelClient restHighLevelClient;

    @GetMapping("/search/{keyword}/{pageNo}/{pageSize}")
    public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                            @PathVariable("pageNo") int pageNo,
                                            @PathVariable("pageSize") int pageSize) throws Exception {
        SearchRequest searchRequest = new SearchRequest(INDEX);
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

        // Wrap matched fragments of "content" in a red span
        HighlightBuilder highlightBuilder = new HighlightBuilder()
                .requireFieldMatch(false)
                .field("content")
                .preTags("<span style='color: red'>")
                .postTags("</span>");
        sourceBuilder.highlighter(highlightBuilder);

        // Pagination: "from" is a 0-based hit offset, so convert the 1-based page number
        // sent by the page (the original passed pageNo directly, which skipped the first hit)
        sourceBuilder.from(Math.max(pageNo - 1, 0) * pageSize);
        sourceBuilder.size(pageSize);

        // Term query: the keyword is not analyzed, so it must equal an indexed token exactly
        TermQueryBuilder termQueryBuilder = new TermQueryBuilder("content", keyword);
        sourceBuilder.query(termQueryBuilder);
        sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
        searchRequest.source(sourceBuilder);

        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        List<Map<String, Object>> list = new ArrayList<>();
        for (SearchHit hit : searchResponse.getHits().getHits()) {
            Map<String, Object> sourceAsMap = hit.getSourceAsMap();
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            HighlightField content = highlightFields.get("content");
            if (content != null) {
                // Replace the plain content with the concatenated highlighted fragments
                StringBuilder newCon = new StringBuilder();
                for (Text text : content.getFragments()) {
                    newCon.append(text);
                }
                sourceAsMap.put("content", newCon.toString());
            }
            list.add(sourceAsMap);
        }
        return list;
    }
}

Final result

Searching for poems that contain "白"

Searching for poems that contain "夜"

