Basic ElasticSearch Operations in Python

Official documentation: https://elasticsearch-py.readthedocs.io/en/master/

1. Introduction

ElasticSearch has an official Python client package. To operate ElasticSearch from Python, first install that package, either with the command pip install elasticsearch or by downloading it from https://pypi.python.org/pypi/elasticsearch/5.4.0
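Once the package is installed, connecting can be sketched as follows. This is a minimal sketch that assumes the 5.x client linked above; `client_kwargs` and `connect` are hypothetical helper names, and the host and credentials are placeholders rather than values to copy.

```python
def client_kwargs(ip="127.0.0.1", port=9200, username=None, password=None):
    """Build keyword arguments for Elasticsearch() (hypothetical helper)."""
    kwargs = {"hosts": [ip], "port": port}
    if username is not None:
        # The 5.x client takes basic-auth credentials as a (user, password) tuple.
        kwargs["http_auth"] = (username, password)
    return kwargs


def connect(**options):
    """Create a client. The import is local so client_kwargs works without the package."""
    from elasticsearch import Elasticsearch
    return Elasticsearch(**client_kwargs(**options))


# Usage (needs the package and a reachable cluster):
# es = connect(ip="127.0.0.1", username="elastic", password="password")
```

Separating the keyword arguments from the client construction makes it easy to switch between the authenticated and unauthenticated variants.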

2. Creating an index

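Suppose we create an index named ott with type ott_type. The index has five fields:

title: stores the Chinese title
date: stores the date, formatted like 2017-09-08
keyword: stores the Chinese keyword
source: stores the Chinese source name
link: stores the link

Create the mapping: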
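The mapping for these five fields can be sketched as below. It assumes an Elasticsearch 5.x cluster (mapping types are still in use) with the IK analysis plugin installed for `ik_max_word`. The `date` field is mapped as `text`, as in the complete code in section 6; Elasticsearch's `date` type would be the alternative. `build_mappings` and `create_index` are hypothetical helper names.

```python
def build_mappings(index_type="ott_type"):
    """Return the index-creation body for the five fields described above."""
    return {
        "mappings": {
            index_type: {
                "properties": {
                    # Chinese full-text field, analyzed with the IK plugin (assumed installed)
                    "title": {
                        "type": "text",
                        "analyzer": "ik_max_word",
                        "search_analyzer": "ik_max_word",
                    },
                    "date": {"type": "text"},
                    # Exact-match fields: stored as single, unanalyzed terms
                    "keyword": {"type": "keyword"},
                    "source": {"type": "keyword"},
                    "link": {"type": "keyword"},
                }
            }
        }
    }


def create_index(es, index_name="ott", index_type="ott_type"):
    """Create the index only if it does not exist yet; return the response or None."""
    if es.indices.exists(index=index_name):
        return None
    return es.indices.create(index=index_name, body=build_mappings(index_type))
```

Checking `indices.exists` first keeps the call safe to repeat, since creating an index that already exists raises an error.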

3. Indexing data
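Indexing documents one request at a time can be sketched as follows. The sample documents are taken from the complete code in section 6; `index_documents` is a hypothetical helper, `es` is assumed to be a connected 5.x client, and `doc_type` is passed because that API generation requires it.

```python
def index_documents(es, docs, index_name="ott", index_type="ott_type"):
    """Index each document with its own request and return the ids Elasticsearch assigned."""
    ids = []
    for doc in docs:
        res = es.index(index=index_name, doc_type=index_type, body=doc)
        ids.append(res["_id"])
    return ids


sample_docs = [
    {
        "date": "2017-09-13",
        "source": "慧聪网",
        "link": "http://info.broadcast.hc360.com/2017/09/130859749974.shtml",
        "keyword": "电视",
        "title": "付费 电视 行业面临的转型和挑战",
    },
    {
        "date": "2017-09-13",
        "source": "中国文明网",
        "link": "http://www.wenming.cn/xj_pd/yw/201709/t20170913_4421323.shtml",
        "keyword": "电视",
        "title": "电视 专题片《巡视利剑》广获好评：铁腕反腐凝聚党心民心",
    },
]

# Usage, given a connected client `es`:
# index_documents(es, sample_docs)
```

One request per document is fine for a handful of records; for larger batches the bulk helper avoids the per-request overhead.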

Bulk indexing

Use the bulk helper to index data in batches
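A minimal sketch of bulk indexing: each document is wrapped in an action dictionary, and the whole list is passed to `elasticsearch.helpers.bulk`. `build_actions` and `bulk_index` are hypothetical helper names. The explicit sequential `_id` mirrors the complete code in section 6; leaving `_id` out lets Elasticsearch generate one.

```python
def build_actions(docs, index_name="ott", index_type="ott_type", start_id=1):
    """Wrap plain documents in the action format expected by helpers.bulk."""
    return [
        {
            "_index": index_name,
            "_type": index_type,  # mapping type, as used by the 5.x API
            "_id": doc_id,        # optional: omit to let Elasticsearch assign ids
            "_source": doc,
        }
        for doc_id, doc in enumerate(docs, start=start_id)
    ]


def bulk_index(es, docs, index_name="ott", index_type="ott_type"):
    """Send all documents in one bulk call and return the number of successful actions."""
    from elasticsearch.helpers import bulk  # local import: only needed when actually sending
    success, _ = bulk(es, build_actions(docs, index_name, index_type), raise_on_error=True)
    return success
```

With `raise_on_error=True` the helper raises if any action fails instead of returning the failures in the second element of the tuple.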

4. Querying the index
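Two common lookups, sketched under the same assumptions as the rest of the article (a connected 5.x client `es`, index ott, type ott_type): a `match` query on one field, and fetching a single document by id. The function names are hypothetical.

```python
def match_query(field, text):
    """Build a query body that matches `text` against one field."""
    return {"query": {"match": {field: text}}}


def search_sources(es, body, index_name="ott", index_type="ott_type"):
    """Run a search and return only the stored documents (_source) of the hits."""
    res = es.search(index=index_name, doc_type=index_type, body=body)
    return [hit["_source"] for hit in res["hits"]["hits"]]


def get_source(es, doc_id, index_name="ott", index_type="ott_type"):
    """Fetch one document by id and return its _source."""
    res = es.get(index=index_name, doc_type=index_type, id=doc_id)
    return res["_source"]


# Usage, given a connected client `es`:
# for doc in search_sources(es, match_query("keyword", "电视")):
#     print(doc["date"], doc["source"], doc["link"], doc["keyword"], doc["title"])
```

Note the different response shapes: a search result nests documents under `hits.hits`, while a get-by-id response carries `_source` at the top level.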

5. Deleting data
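Deleting a single document by id can be sketched as below. `delete_document` is a hypothetical helper; it assumes a connected client `es` and a delete response that carries a `result` field, as in Elasticsearch 5.x and later. Passing `ignore=404` asks the client to return the response body for a missing document instead of raising.

```python
def delete_document(es, doc_id, index_name="ott", index_type="ott_type"):
    """Delete one document by id; return True if it existed and was removed."""
    res = es.delete(index=index_name, doc_type=index_type, id=doc_id, ignore=404)
    # Assumed response shape: {"result": "deleted"} or {"result": "not_found"}
    return res.get("result") == "deleted"


# Usage, given a connected client `es`:
# delete_document(es, 1)
```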

6. Complete code


# coding: utf-8
# Written for Python 3 and the 5.x elasticsearch client installed above.
import os
import time

import CSVOP  # the author's own CSV helper module (not shown in this article)
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk


class ElasticObj:
    def __init__(self, index_name, index_type, ip="127.0.0.1"):
        '''
        :param index_name: index name
        :param index_type: index type
        '''
        self.index_name = index_name
        self.index_type = index_type
        # Without username/password:
        # self.es = Elasticsearch([ip])
        # With username/password:
        self.es = Elasticsearch([ip], http_auth=('elastic', 'password'), port=9200)

    def create_index(self):
        '''
        Create the index (for example name "ott", type "ott_type") if it does not exist.
        '''
        # Create the mapping
        _index_mappings = {
            "mappings": {
                self.index_type: {
                    "properties": {
                        "title": {
                            "type": "text",
                            "index": True,
                            "analyzer": "ik_max_word",
                            "search_analyzer": "ik_max_word"
                        },
                        "date": {"type": "text", "index": True},
                        # "keyword" replaces the deprecated "string" + "not_analyzed"
                        "keyword": {"type": "keyword"},
                        "source": {"type": "keyword"},
                        "link": {"type": "keyword"}
                    }
                }
            }
        }
        if not self.es.indices.exists(index=self.index_name):
            res = self.es.indices.create(index=self.index_name, body=_index_mappings)
            print(res)

    def IndexData(self):
        '''Index every CSV file found directly under csvdir.'''
        csvdir = 'D:/work/ElasticSearch/exportExcels'
        filenamelist = []
        for (dirpath, dirnames, filenames) in os.walk(csvdir):
            filenamelist.extend(filenames)
            break  # top-level directory only
        total = 0
        for file in filenamelist:
            csvfile = csvdir + '/' + file
            self.Index_Data_FromCSV(csvfile)
            total += 1
            print(total)
            time.sleep(10)

    def Index_Data_FromCSV(self, csvfile):
        '''
        Read data from a CSV file and store it in es
        :param csvfile: CSV file, including the full path
        '''
        rows = CSVOP.ReadCSV(csvfile)
        index = 0
        for item in rows:
            # The first row is the header (note: this condition also skips the row at index 1)
            if index > 1:
                doc = {
                    'title': item[0],
                    'link': item[1],
                    'date': item[2],
                    'source': item[3],
                    'keyword': item[4]
                }
                res = self.es.index(index=self.index_name, doc_type=self.index_type, body=doc)
                print(res['created'])
            index += 1
        print(index)

    def Index_Data(self):
        '''Store sample documents in es, one request per document.'''
        docs = [
            {"date": "2017-09-13",
             "source": "慧聪网",
             "link": "http://info.broadcast.hc360.com/2017/09/130859749974.shtml",
             "keyword": "电视",
             "title": "付费 电视 行业面临的转型和挑战"},
            {"date": "2017-09-13",
             "source": "中国文明网",
             "link": "http://www.wenming.cn/xj_pd/yw/201709/t20170913_4421323.shtml",
             "keyword": "电视",
             "title": "电视 专题片《巡视利剑》广获好评：铁腕反腐凝聚党心民心"}
        ]
        for item in docs:
            res = self.es.index(index=self.index_name, doc_type=self.index_type, body=item)
            print(res['created'])

    def bulk_Index_Data(self):
        '''Store a batch of documents in es with bulk.'''
        docs = [
            {"date": "2017-09-13",
             "source": "慧聪网",
             "link": "http://info.broadcast.hc360.com/2017/09/130859749974.shtml",
             "keyword": "电视",
             "title": "付费 电视 行业面临的转型和挑战"},
            {"date": "2017-09-13",
             "source": "中国文明网",
             "link": "http://www.wenming.cn/xj_pd/yw/201709/t20170913_4421323.shtml",
             "keyword": "电视",
             "title": "电视 专题片《巡视利剑》广获好评：铁腕反腐凝聚党心民心"},
            {"date": "2017-09-13",
             "source": "人民电视",
             "link": "http://tv.people.com.cn/BIG5/n1/2017/0913/c67816-29533981.html",
             "keyword": "电视",
             "title": "中国第21批赴刚果（金）维和部隊启程--人民 电视 --人民网"},
            {"date": "2017-09-13",
             "source": "站长之家",
             "link": "http://www.chinaz.com/news/2017/0913/804263.shtml",
             "keyword": "电视",
             "title": "电视 盒子 哪个牌子好？ 吐血奉献三大选购秘笈"}
        ]
        ACTIONS = []
        i = 1
        for line in docs:
            action = {
                "_index": self.index_name,
                "_type": self.index_type,
                "_id": i,  # _id can also be left unset and generated automatically
                "_source": {
                    "date": line['date'],
                    "source": line['source'],
                    "link": line['link'],
                    "keyword": line['keyword'],
                    "title": line['title']
                }
            }
            i += 1
            ACTIONS.append(action)
        # Bulk processing
        success, _ = bulk(self.es, ACTIONS, index=self.index_name, raise_on_error=True)
        print('Performed %d actions' % success)

    def Delete_Index_Data(self, id):
        '''
        Delete one document from the index
        :param id: document id
        '''
        res = self.es.delete(index=self.index_name, doc_type=self.index_type, id=id)
        print(res)

    def Get_Data_Id(self, id):
        '''Fetch one document by id and print it.'''
        res = self.es.get(index=self.index_name, doc_type=self.index_type, id=id)
        print(res['_source'])
        print('------------------------------------------------------------------')
        # A get response has no 'hits' list; the document is in res['_source']
        src = res['_source']
        print(src['date'], src['source'], src['link'], src['keyword'], src['title'])

    def Get_Data_By_Body(self):
        '''Search by keyword and print every hit.'''
        # doc = {'query': {'match_all': {}}}
        doc = {"query": {"match": {"keyword": "电视"}}}
        _searched = self.es.search(index=self.index_name, doc_type=self.index_type, body=doc)
        for hit in _searched['hits']['hits']:
            # print(hit['_source'])
            src = hit['_source']
            print(src['date'], src['source'], src['link'], src['keyword'], src['title'])


obj = ElasticObj("ott", "ott_type", ip="47.93.117.127")
# obj = ElasticObj("ott1", "ott_type1")
# obj.create_index()
obj.Index_Data()
# obj.bulk_Index_Data()
# obj.IndexData()
# obj.Delete_Index_Data(1)
# csvfile = 'D:/work/ElasticSearch/exportExcels/2017-08-31_info.csv'
# obj.Index_Data_FromCSV(csvfile)
# obj.Get_Data_By_Body()

Reposted from: https://www.cnblogs.com/shaosks/p/7592229.html
