django model filter_django中探索如何提高查询数据效率

平时用到django时候挺多，最近有在搞数据库表分区的实践，所以顺便尝试在django如何提高查询效率，也就尝试在django中加入线程和协成的代码，看一下效果

相关工具

django + pycharm + mysql  + postman

创建django项目，具体不多说了

 1401  django-admin.py startproject db1405  cd db1413  python manage.py startapp cs

我获得文件夹树

接着我们在cs/models.py编辑一个People模型

from django.db import models
class People(models.Model):name = models.CharField(max_length=10, help_text='姓名')sex = models.SmallIntegerField(default=0, help_text="性别")age = models.SmallIntegerField(default=0, help_text='年龄')address = models.CharField(default='', max_length=128, help_text='地址')

然后我们生成对应表

 1416  python manage.py makemigrations1417  python manage.py migrate cs

接下来我们创建了600w的假数据，

在/cs/management/commands/create_people.py 创建一个django的脚本

from django.core.management.base import BaseCommand
from cs.models import People
import randomclass Command(BaseCommand):""""""def handle(self, *args, **options):spe = 500objs = list()for i in range(6000000):age = random.randint(1, 70)sex = random.randint(0, 1)objs.append(People(name=str(i), age=age, sex=sex))if i % 500 != 0:continuePeople.objects.bulk_create(objs)objs = list()print(i)

然后运行，生成数据

python manage.py create_people

这样我们600w的测试数据就创建完成了

from django.shortcuts import render
from django.views import View
from django.http.response import JsonResponse
from cs.models import People
import gevent
from gevent.pool import Pool
import asyncio
from concurrent.futures import ThreadPoolExecutor, as_completed
from django.db.transaction import atomic# Create your views here.
import timeclass DbViews(View):def get_count(self, age):return age, People.objects.filter(age=age).count()async def gets(self, age):count = People.objects.filter(age=age).countreturn count()async def get_async_count(self, age):print("***age***", age, time.time())age_count = await self.gets(age)print("***age***", age, time.time())return age, age_countdef gevent_pool(self, ages):ages_detail = dict()pool = Pool()ret = list()for age in ages:t = pool.spawn(self.get_count, age)ret.append(t)gevent.joinall(ret)for t in ret:age, age_count = t.valueages_detail.update({age: age_count})return ages_detaildef cur_get(self, ages):ages_detail = dict()for age in ages:age_count = People.objects.filter(age=age).count()ages_detail.update({age: age_count})return ages_detaildef thead_pool(self, ages):ages_detail = dict()thead_poll = ThreadPoolExecutor(max_workers=3)ret = list()for age in ages:t = thead_poll.submit(self.get_count, age)ret.append(t)for t in as_completed(ret):age, age_count = t.result()ages_detail.update({age: age_count})return ages_detaildef async_get(self, ages):ages_detail = dict()loop = asyncio.new_event_loop()asyncio.set_event_loop(loop)ret = list()for age in ages:t = asyncio.ensure_future(self.get_async_count(age))ret.append(t)done, pending = loop.run_until_complete(asyncio.wait(ret))for t in done:age, age_count = t.result()ages_detail.update({age: age_count})return ages_detaildef hclock(self):people = People.objects.select_for_update().filter(id__in=[3,4])with atomic():for i in people:time.sleep(10)print("*******",i.id)i.age += 1i.save()time.sleep(10)def get(self, request):ages = [10, 67, 29]ages_detail = self.hclock()ages_detail = dict()return JsonResponse(ages_detail)

这里是全部view代码，下面我们会一个说明

然后我们在/db/urls.py配置一个接口路径people

from django.conf.urls import url
from django.contrib import admin
from cs.views import DbViewsurlpatterns = [url('people/', DbViews.as_view(), name='cs')
]

启动web服务

我们假定我们的需求，我们需要获取10, 67, 29岁三种年龄的人数，我们先使用查询三次数据库的正常方式

    def get_count(self, age):return age, People.objects.filter(age=age).count()def cur_get(self, ages):ages_detail = dict()for age in ages:age_count = People.objects.filter(age=age).count()ages_detail.update({age: age_count})return ages_detaildef get(self, request):ages = [10, 67, 29]ages_detail = self.cur_get(ages)

我们得到了

消耗了2419ms

接下来我们使用gevent 方式

    def gevent_pool(self, ages):ages_detail = dict()pool = Pool()ret = list()for age in ages:t = pool.spawn(self.get_count, age)ret.append(t)gevent.joinall(ret)for t in ret:age, age_count = t.valueages_detail.update({age: age_count})return ages_detaildef get(self, request):ages = [10, 67, 29]ages_detail = self.gevent_pool(ages)return JsonResponse(ages_detail)

我们得到了

很明显的请求的时间没有多大差别，这可能models的count不支持的协成操作，在搞过gevent下载，可以使用monkey更改一下底层的一些东西

于是加入了

from gevent import monkey
monkey.patch_all()

但是出现了错误

During handling of the above exception, another exception occurred:Traceback (most recent call last):File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/utils/autoreload.py", line 227, in wrapperfn(*args, **kwargs)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/core/management/commands/runserver.py", line 125, in inner_runself.check(display_num_errors=True)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/core/management/base.py", line 359, in checkinclude_deployment_checks=include_deployment_checks,File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/core/management/base.py", line 346, in _run_checksreturn checks.run_checks(**kwargs)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/core/checks/registry.py", line 81, in run_checksnew_errors = check(app_configs=app_configs)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/core/checks/urls.py", line 16, in check_url_configreturn check_resolver(resolver)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/core/checks/urls.py", line 26, in check_resolverreturn check_method()File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/urls/resolvers.py", line 254, in checkfor pattern in self.url_patterns:File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/utils/functional.py", line 35, in __get__res = instance.__dict__[self.name] = self.func(instance)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/urls/resolvers.py", line 405, in url_patternspatterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/utils/functional.py", line 35, in __get__res = instance.__dict__[self.name] = self.func(instance)File "/Users/lixiang/projects/py3/lib/python3.6/site-packages/django/urls/resolvers.py", line 398, in urlconf_modulereturn import_module(self.urlconf_name)File "/Users/lixiang/projects/py3/lib/python3.6/importlib/__init__.py", line 126, in import_modulereturn _bootstrap._gcd_import(name[level:], package, level)File "<frozen importlib._bootstrap>", line 994, in _gcd_importFile "<frozen importlib._bootstrap>", line 971, in _find_and_loadFile "<frozen importlib._bootstrap>", line 152, in __exit__File "<frozen importlib._bootstrap>", line 107, in release
RuntimeError: cannot release un-acquired lock

看来django不让这样更改，我们有尝试了async这种方式

     async def get_async_count(self, age):print("***age***", age, time.time())age_count = await self.gets(age)print("***age***", age, time.time())def async_get(self, ages):ages_detail = dict()loop = asyncio.new_event_loop()asyncio.set_event_loop(loop)ret = list()for age in ages:t = asyncio.ensure_future(self.get_async_count(age))ret.append(t)done, pending = loop.run_until_complete(asyncio.wait(ret))for t in done:age, age_count = t.result()ages_detail.update({age: age_count})return ages_detaildef get(self, request):ages = [10, 67, 29]ages_detail = self.async_get(ages)return JsonResponse(ages_detail)

得到结果

还是没有什么差异，看来协成没有生效，于是我们在调用协程时候打印了一下

***age*** 10 1560943042.4748821
***age*** 10 1560943043.3534641
***age*** 67 1560943043.3534899
***age*** 67 1560943044.1488252
***age*** 29 1560943044.148851
INFO 2019-06-19 11:17:24,987 basehttp 3413 123145324937216 "GET /people/ HTTP/1.1" 200 39 ('GET /people/ HTTP/1.1', '200', '39')
***age*** 29 1560943044.9875998

很明显执行的时候是阻塞了，看来orm的方法执行时无法使用协成

现在我们使用一下线程

 def thead_pool(self, ages):ages_detail = dict()thead_poll = ThreadPoolExecutor(max_workers=3)ret = list()for age in ages:t = thead_poll.submit(self.get_count, age)ret.append(t)for t in as_completed(ret):age, age_count = t.result()ages_detail.update({age: age_count})return ages_detail
def get(self, request):ages = [10, 67, 29]ages_detail = self.thead_pool(ages)return JsonResponse(ages_detail)

得到结果

这里响应时间是857ms，比以上的大约2500ms方法快了大约1600ms，看来在django中对于多个查询来说，使用线程池是比较有效的提高访问效率的方式

我们对于600w的数据也可以使用分区提高效率，我们先把原来数据表cs_people删除，按age分区在重新导入数据

drop table cs_people
create table cs_people(`id` INT AUTO_INCREMENT,`name` varchar(10),`sex` smallint,`age` smallint,`address` varchar(128),PRIMARY KEY (`id`, `age`))PARTITION by range COLUMNS(age)(partition p0 values less than(10),partition p1 values less than(20),partition p2 values less than(30),partition p3 values less than(40),partition p4 values less than(50),partition p6 values less than(60),partition p7 values less than MAXVALUE);

我们可以查看分区情况：

select PARTITION_NAME, TABLE_ROWS from INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME='cs_people';

我们还可以查看索引情况

show index from cs_people

我们在生成数据的脚本中添加了批量创建的脚本，使用了django自动bulk_create所以创建很快

其实在创建分区的时候我们已经将age设置为索引了，一会我们在说只索引的情况

我们看一下各种方式的结果，因为协程方式没有生效，我们就不讨论了

正常方式，串行三次请求，时间524ms（这个地方忘记保存截图了）

线程池请求，时间215ms

很明显这两种方式有很大的提升，这是因为mysql将按照不同age存在到了不到分区表里，我们在查询不同age时， mysql可以智能的到不同的小分区中查找，一下子将查询集缩小了数倍，这里有索引的一些功劳，我们接下来在看看只有索引的情况下，效果是怎么样的

我们删除分区表穿件索引表，这里和分区创建表的时候就不在使用migrate的了

创建索引

查看索引

正常方式 81ms

线程池，84ms

结果发现对于600w的数据，分区会有些额外开销，造成了访问时间多余只构建索引的情况，而且正常和线程池这两种方式时间大致相同

但是后来我增加了只构建索引这种情况的数据量，到达了1200w

线程池： 73ms

正常， 111ms

这样看来相对数据量大情况还是线程池效率比较高

对于小数据量来说，只创建索引就可以解决
如果数据量稍微有点大的，可以使用多线程加索引情况
而理论上使用分区情况应该是数据量比较大的情况，具体应该会与实际数据有关系

django model filter_django中探索如何提高查询数据效率相关推荐

千万级游标_在一个千万级的数据库查寻中，如何提高查询效率
在一个千万级的数据库查寻中,如何提高查询效率? 1)数据库设计方面: a.对查询进行优化,应尽量避免全表扫描,首先应考虑在 where 及 order by 涉及的列上建立索引. b.应尽量避免在 w ...
java怎么查询千万数据,从java方面，在一个千万级的数据库查寻中，如何提高查询效率？...
从java方面,在一个千万级的数据库查寻中,如何提高查询效率? 更多相关问题请分析下面的歌曲<沂蒙山我的娘亲亲>片段中采用了那种长音或休止处的处理方法faea48d2d30c3b221e ...
django model filter_Django开发常用方法及面试题
Django开发常用方法及面试题目录 1. 对Django的认识? 2. Django .Flask.Tornado的对比 3. 什么是wsgi,uwsgi,uWSGI? 4. django请求的生 ...
django model filter_Django分表的两个方案
由来知乎上的一个问题:Django 分表怎么实现? 这个问题戳到了Django ORM的痛点,对于多数据库/分库的问题,Django提供了很好的支持,通过using和db router可以很好的完 ...
11. Django 2.1.7 MVT模型示例 - 查询数据，返回渲染模板数据
参考文献官方文档上一篇文章讲述了Django 2.1.7 创建应用模板,基本已经理解了Django 2.1.7 框架下如何调用模板,并且渲染数据. 本篇章继续往数据库添加数据,然后查询数据,在模板 ...
Django 2.1.7 MVT模型示例 - 查询数据，返回渲染模板数据
参考文献官方文档上一篇文章讲述了Django 2.1.7 创建应用模板,基本已经理解了Django 2.1.7 框架下如何调用模板,并且渲染数据. 本篇章继续往数据库添加数据,然后查询数据,在模板 ...
WPF中嵌套charts图表查询数据
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 开发工具与关键技术:WPF .Charts 撰写日期:2020年09月16 ...
上海瀚示医药拣货标签在医药电商仓库改造中的解决方案—— 提高拣货效率
一.产品介绍项目核心产品为自主知识产权的智能电子货位标签系统(包含硬件产品.软件系统及增值服务).系一组安装在货架储位等载体上的电子设备,包含显示屏和信号灯按钮.此标签可彩屏显示.信号灯按钮提示等操 ...
jdbc中递归树封装查询数据
jdbc中递归树封装查询一.递归树封装多次查询数据库驱动包.jar文件: ①数据库数据准备(使用IDEA(64位)软件和phpstudy软件,即小皮系统(64位)) ②实体:为了使用通用 Dao ...

django model filter_django中探索如何提高查询数据效率

django model filter_django中探索如何提高查询数据效率相关推荐

最新文章

热门文章