Contents

Deploying ES

Download

Config

System parameters

Start

Verify

Common usage

Create an index

_cat

View settings

Delete the index

Bulk import

Search

SR external table

Test 1: tokenization

ES

SR


Deploying ES

  • Download

    • wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.16.2-linux-x86_64.tar.gz
  • Config

sr@cs02:~/app/elasticsearch-7.16.2$grep -v ^# config/elasticsearch.yml
node.name: node-2
network.host: 172.26.194.185
cluster.initial_master_nodes: ["node-2"]
  • System parameters

Add the following line to /etc/sysctl.conf, then apply it with sysctl -p:

vm.max_map_count = 655360
sr@cs02:~/app/elasticsearch-7.16.2$sudo vim /etc/sysctl.conf
sr@cs02:~/app/elasticsearch-7.16.2$sudo sysctl -p
vm.swappiness = 0
kernel.sysrq = 1
net.ipv4.neigh.default.gc_stale_time = 120
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.default.rp_filter = 0
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.lo.arp_announce = 2
net.ipv4.conf.all.arp_announce = 2
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_synack_retries = 2
vm.max_map_count = 655360
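
The setting can also be checked from a script before the node is started. The sketch below is only an illustration: the function names are made up for this example, and the 262144 threshold is the minimum that Elasticsearch's bootstrap check reports when the value is too low, so the 655360 used above clears it comfortably.

```python
from pathlib import Path

# Minimum that Elasticsearch's bootstrap check asks for (assumption stated in
# the lead-in; the article itself simply sets 655360).
MIN_MAX_MAP_COUNT = 262144


def max_map_count_ok(value: int, minimum: int = MIN_MAX_MAP_COUNT) -> bool:
    """Return True when the kernel setting is high enough for Elasticsearch."""
    return value >= minimum


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the live kernel value (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    # Only attempt the read where the proc file exists.
    if Path("/proc/sys/vm/max_map_count").exists():
        current = read_max_map_count()
        print(current, "OK" if max_map_count_ok(current) else "too low")
```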
  • Start

sr@cs02:~/app/elasticsearch-7.16.2$bin/elasticsearch -d
  • Verify

sr@cs02:~/app/elasticsearch-7.16.2$sudo netstat -lnpt |grep 9[2-3]00
tcp6       0      0 172.26.194.185:9200     :::*                    LISTEN      10442/java
tcp6       0      0 172.26.194.185:9300     :::*                    LISTEN      10442/java
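
The same check can be done without netstat or sudo by attempting a TCP connection. This is a minimal sketch using only the standard library; `port_open` is a hypothetical helper name, not part of any Elasticsearch tooling.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the node above, `port_open("172.26.194.185", 9200)` and `port_open("172.26.194.185", 9300)` should both return True once Elasticsearch has finished starting.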

Common usage

  • Create an index

sr@cs02:~$curl -sH "Content-Type: application/json" -XPUT "cs02:9200/test"    | python -m json.tool
{
    "acknowledged": true,
    "index": "test",
    "shards_acknowledged": true
}
  • _cat

sr@cs02:~$curl -s  "cs02:9200/_cat/indices?v"
health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   test  poJIPAAJQpW4hq8zXGAg3Q   1   1          0            0       226b           226b
  • View settings

sr@cs02:~$curl -sH "Content-Type: application/json" -XGET "cs02:9200/test/_settings/" | python -m json.tool
{
    "test": {
        "settings": {
            "index": {
                "creation_date": "1641910831087",
                "number_of_replicas": "1",
                "number_of_shards": "1",
                "provided_name": "test",
                "routing": {
                    "allocation": {
                        "include": {
                            "_tier_preference": "data_content"
                        }
                    }
                },
                "uuid": "JQKoH7sKRmi34chGt1n1jg",
                "version": {
                    "created": "7160299"
                }
            }
        }
    }
}
  • Delete the index

sr@cs02:~$curl -s -XDELETE "cs02:9200/test" |  python -m json.tool
{
    "acknowledged": true
}
  • Bulk import

  • Note: every line of the request body must end with a newline, including the last one

curl -XPOST "http://cs02:9200/_bulk" -H 'Content-Type: application/json' -d'
{"index":{"_index":"test"}}
{ "k1" : 1, "k2": "2022-01-01", "k3": "Trying out Elasticsearch", "k4": "Trying out Elasticsearch", "k5": 10.0}
{"index":{"_index":"test"}}
{ "k1" : 2, "k2": "2022-01-02", "k3": "Trying out StarRocks", "k4": "Trying out StarRocks", "k5": 20.0}
{"index":{"_index":"test"}}
{ "k1" : 3, "k2": "2022-01-03", "k3": "StarRocks On ES", "k4": "StarRocks On ES", "k5": 30.0}
{"index":{"_index":"test"}}
{ "k1" : 4, "k2": "2022-01-04", "k3": "StarRocks", "k4": "StarRocks", "k5": 40.0}
{"index":{"_index":"test"}}
{ "k1" : 5, "k2": "2022-01-05", "k3": "ES", "k4": "ES", "k5": 50.0}
'
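
When the documents come from a program rather than a hand-written curl command, the newline rule is easy to get wrong. The following sketch shows one way to build the request body; `build_bulk_body` is a hypothetical helper written for this example, and only two of the five documents are repeated to keep it short.

```python
import json


def build_bulk_body(index, docs):
    """Serialize docs as NDJSON: one action line plus one source line per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The _bulk endpoint needs the final line to end with a newline as well.
    return "\n".join(lines) + "\n"


docs = [
    {"k1": 1, "k2": "2022-01-01", "k3": "Trying out Elasticsearch",
     "k4": "Trying out Elasticsearch", "k5": 10.0},
    {"k1": 5, "k2": "2022-01-05", "k3": "ES", "k4": "ES", "k5": 50.0},
]
body = build_bulk_body("test", docs)
print(body, end="")
```

The resulting string can be sent as the body of a POST to http://cs02:9200/_bulk with any HTTP client, using the same Content-Type header as the curl command above.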
  • Search

sr@cs02:~$curl -s -XGET cs02:9200/test/_search?pretty
{
  "took" : 667,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 5,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "uW6rSX4B8p-MWLQevhhA",
        "_score" : 1.0,
        "_source" : {
          "k1" : 1,
          "k2" : "2022-01-01",
          "k3" : "Trying out Elasticsearch",
          "k4" : "Trying out Elasticsearch",
          "k5" : 10.0
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "um6rSX4B8p-MWLQevhhA",
        "_score" : 1.0,
        "_source" : {
          "k1" : 2,
          "k2" : "2022-01-02",
          "k3" : "Trying out StarRocks",
          "k4" : "Trying out StarRocks",
          "k5" : 20.0
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "u26rSX4B8p-MWLQevhhA",
        "_score" : 1.0,
        "_source" : {
          "k1" : 3,
          "k2" : "2022-01-03",
          "k3" : "StarRocks On ES",
          "k4" : "StarRocks On ES",
          "k5" : 30.0
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "vG6rSX4B8p-MWLQevhhA",
        "_score" : 1.0,
        "_source" : {
          "k1" : 4,
          "k2" : "2022-01-04",
          "k3" : "StarRocks",
          "k4" : "StarRocks",
          "k5" : 40.0
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "vW6rSX4B8p-MWLQevhhA",
        "_score" : 1.0,
        "_source" : {
          "k1" : 5,
          "k2" : "2022-01-05",
          "k3" : "ES",
          "k4" : "ES",
          "k5" : 50.0
        }
      }
    ]
  }
}

SR external table

  • Test 1: tokenization

  • ES

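  • Create index (if the test index from the previous section still exists, delete it first; a PUT on an existing index fails)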
curl -sH "Content-Type: application/json" -XPUT "cs02:9200/test"  -d'
{
  "mappings": {
    "properties": {
      "k1": {"type": "long"},
      "k2": {"type": "date"},
      "k3": {"type": "keyword"},
      "k4": {"analyzer": "standard", "type": "text"},
      "k5": {"type": "float"}
    }
  },
  "settings": {
    "index": {
      "number_of_replicas": "0",
      "number_of_shards": "1"
    }
  }
}
'
  • Import data into ES
curl -XPOST "http://cs02:9200/_bulk" -H 'Content-Type: application/json' -d'
{"index":{"_index":"test"}}
{ "k1" : 1, "k2": "2022-01-01", "k3": "Trying out Elasticsearch", "k4": "Trying out Elasticsearch", "k5": 10.0}
{"index":{"_index":"test"}}
{ "k1" : 2, "k2": "2022-01-02", "k3": "Trying out StarRocks", "k4": "Trying out StarRocks", "k5": 20.0}
{"index":{"_index":"test"}}
{ "k1" : 3, "k2": "2022-01-03", "k3": "StarRocks On ES", "k4": "StarRocks On ES", "k5": 30.0}
{"index":{"_index":"test"}}
{ "k1" : 4, "k2": "2022-01-04", "k3": "StarRocks", "k4": "StarRocks", "k5": 40.0}
{"index":{"_index":"test"}}
{ "k1" : 5, "k2": "2022-01-05", "k3": "ES", "k4": "ES", "k5": 50.0}
'
  • SR

mysql -uroot -hcs01 -P 9013
USE simon;
-- Probe the analyzed string (text) fields in ES (see enable_keyword_sniff below)
CREATE EXTERNAL TABLE `soe_t1` (
  `k1` bigint(20) NULL COMMENT "",
  `k2` datetime NULL COMMENT "",
  `k3` varchar(20) NULL COMMENT "",
  `k4` varchar(100) NULL COMMENT "",
  `k5` float NULL COMMENT ""
) ENGINE=ELASTICSEARCH
COMMENT "ELASTICSEARCH"
PROPERTIES (
"hosts" = "cs02:9200",
"index" = "test",
"type" = "_doc",
"transport" = "http",
"enable_docvalue_scan" = "true",
"max_docvalue_fields" = "20",
"enable_keyword_sniff" = "true"
);
  • Wait a few seconds for the metadata to sync before querying
MySQL [simon]> select * from soe_t1;
ERROR 1064 (HY000): EsTable metadata has not been synced, Try it later
MySQL [simon]> desc soe_t1;
+-------+--------------+------+-------+---------+-------+
| Field | Type         | Null | Key   | Default | Extra |
+-------+--------------+------+-------+---------+-------+
| k1    | BIGINT       | Yes  | true  | NULL    |       |
| k2    | DATETIME     | Yes  | true  | NULL    |       |
| k3    | VARCHAR(20)  | Yes  | true  | NULL    |       |
| k4    | VARCHAR(100) | Yes  | false | NULL    | NONE  |
| k5    | FLOAT        | Yes  | false | NULL    | NONE  |
+-------+--------------+------+-------+---------+-------+
5 rows in set (0.00 sec)

mysql> select * from soe_t1;
+------+---------------------+--------------------------+--------------------------+------+
| k1   | k2                  | k3                       | k4                       | k5   |
+------+---------------------+--------------------------+--------------------------+------+
|    1 | 2022-01-01 00:00:00 | Trying out Elasticsearch | Trying out Elasticsearch |   10 |
|    2 | 2022-01-02 00:00:00 | Trying out StarRocks     | Trying out StarRocks     |   20 |
|    3 | 2022-01-03 00:00:00 | StarRocks On ES          | StarRocks On ES          |   30 |
|    4 | 2022-01-04 00:00:00 | StarRocks                | StarRocks                |   40 |
|    5 | 2022-01-05 00:00:00 | ES                       | ES                       |   50 |
+------+---------------------+--------------------------+--------------------------+------+
5 rows in set (0.01 sec)

mysql> select * from soe_t1 where k5 > 30;
+------+---------------------+-----------+-----------+------+
| k1   | k2                  | k3        | k4        | k5   |
+------+---------------------+-----------+-----------+------+
|    4 | 2022-01-04 00:00:00 | StarRocks | StarRocks |   40 |
|    5 | 2022-01-05 00:00:00 | ES        | ES        |   50 |
+------+---------------------+-----------+-----------+------+
2 rows in set (0.01 sec)

-- Non-analyzed column (keyword): exact match
mysql> select * from soe_t1 where k3 = 'ES';
+------+---------------------+------+------+------+
| k1   | k2                  | k3   | k4   | k5   |
+------+---------------------+------+------+------+
|    5 | 2022-01-05 00:00:00 | ES   | ES   |   50 |
+------+---------------------+------+------+------+
1 row in set (0.38 sec)

-- Analyzed column (text): the indexed tokens are lowercased
mysql> select * from soe_t1 where k4 = 'es';
+------+---------------------+-----------------+-----------------+------+
| k1   | k2                  | k3              | k4              | k5   |
+------+---------------------+-----------------+-----------------+------+
|    3 | 2022-01-03 00:00:00 | StarRocks On ES | StarRocks On ES |   30 |
|    5 | 2022-01-05 00:00:00 | ES              | ES              |   50 |
+------+---------------------+-----------------+-----------------+------+
2 rows in set (0.01 sec)

mysql> select * from soe_t1 where k4 = 'starrocks';
+------+---------------------+----------------------+----------------------+------+
| k1   | k2                  | k3                   | k4                   | k5   |
+------+---------------------+----------------------+----------------------+------+
|    2 | 2022-01-02 00:00:00 | Trying out StarRocks | Trying out StarRocks |   20 |
|    3 | 2022-01-03 00:00:00 | StarRocks On ES      | StarRocks On ES      |   30 |
|    4 | 2022-01-04 00:00:00 | StarRocks            | StarRocks            |   40 |
+------+---------------------+----------------------+----------------------+------+
3 rows in set (0.01 sec)

-- The standard analyzer lowercases tokens, so the uppercase 'ES' matches nothing
mysql> select * from soe_t1 where k4 = 'ES';
Empty set (0.01 sec)

-- esquery
mysql> select * from soe_t1 where esquery(k4, '{"match": {"k4": "es"}}');
+------+---------------------+-----------------+-----------------+------+
| k1   | k2                  | k3              | k4              | k5   |
+------+---------------------+-----------------+-----------------+------+
|    3 | 2022-01-03 00:00:00 | StarRocks On ES | StarRocks On ES |   30 |
|    5 | 2022-01-05 00:00:00 | ES              | ES              |   50 |
+------+---------------------+-----------------+-----------------+------+
2 rows in set (0.01 sec)
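
The results above can be reproduced with a small offline model. The sketch below makes two assumptions. First, `approx_standard_tokens` is only a rough stand-in for the standard analyzer (split on non-word characters, then lowercase); the real analyzer follows Unicode text segmentation rules. Second, the behaviour of `k4 = '...'` is modelled as an unanalyzed term lookup, which is inferred from the query results rather than stated by the article. All function names are hypothetical.

```python
import re

# k1 -> string stored in both k3 (keyword) and k4 (text)
DOCS = {
    1: "Trying out Elasticsearch",
    2: "Trying out StarRocks",
    3: "StarRocks On ES",
    4: "StarRocks",
    5: "ES",
}


def approx_standard_tokens(text):
    """Rough stand-in for the standard analyzer: split on non-word chars, lowercase."""
    return re.findall(r"\w+", text.lower())


def match_keyword(value, query):
    """keyword field (k3): the whole string is a single term, compared exactly."""
    return value == query


def match_text_term(value, query):
    """text field (k4) with '=': the query is not analyzed and must equal a token."""
    return query in approx_standard_tokens(value)


def match_text_match(value, query):
    """text field (k4) with a match query: the query is analyzed like the field."""
    tokens = approx_standard_tokens(value)
    return any(tok in tokens for tok in approx_standard_tokens(query))


def select(matcher, query):
    """Return the k1 values of the documents accepted by matcher."""
    return [k1 for k1, value in DOCS.items() if matcher(value, query)]


print(select(match_keyword, "ES"))           # [5]
print(select(match_text_term, "es"))         # [3, 5]
print(select(match_text_term, "starrocks"))  # [2, 3, 4]
print(select(match_text_term, "ES"))         # []
print(select(match_text_match, "es"))        # [3, 5]
```

In this model the uppercase 'ES' finds nothing on k4 because only the lowercase token es was indexed, while a match query lowercases the search string itself before comparing.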

Next article:

Having Fun with StarRocks on ES, Part 2: Full-Text Search
