Original article: http://www.tocker.ca/2013/10/24/improving-the-performance-of-large-tables-in-mysql.html

Today I want to look at improving the performance of tables that cause problems largely because of their size. Some of this advice also applies to databases that are large in aggregate across many tables, but I always find the individually large table to be a special case that is particularly problematic.

What you will normally find is that the speed at which the table can be modified trends down as its size increases. Here is what I am going to call the typical B+Tree index performance over time:

So we should expect performance to degrade because of the structure of the index, but there are some ways we can try to stretch out the curve so that it does not degrade as quickly.
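One way to get a feel for why modification speed trends down is to treat memory as a fixed-size page cache and watch what happens as the index outgrows it. The sketch below is a toy model, not InnoDB's actual buffer pool algorithm: it assumes a plain LRU cache and uniformly random page accesses, and the name hit_rate and all of the sizes are made up for illustration.

```python
import random
from collections import OrderedDict


def hit_rate(index_pages, cache_pages, accesses=20000, seed=1):
    """Fraction of uniformly random page accesses served from an LRU cache."""
    rng = random.Random(seed)
    cache = OrderedDict()  # page number -> None, ordered oldest to newest
    hits = 0
    for _ in range(accesses):
        page = rng.randrange(index_pages)
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            cache[page] = None  # a miss stands in for a disk read
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / accesses


if __name__ == "__main__":
    # Same cache size, growing index: the hit rate falls as the index grows.
    for index_pages in (500, 1000, 4000, 16000):
        print(index_pages, round(hit_rate(index_pages, cache_pages=1000), 2))
```

While the index fits in the cache, almost every access is a hit. Once it no longer fits, a growing share of accesses in this model turn into misses, each of which would be a disk read on a real server.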

Ten potential ways to reduce large table impact:

  1. Make sure to use InnoDB instead of MyISAM. MyISAM can be faster at inserts to the end of a table, but it uses table-level locking (limiting concurrent updates and deletes) and a single lock to protect the key buffer when loading data to/from disk, resulting in contention. It also does not have the change buffering feature described just below.

  2. InnoDB has change buffering (previously called the insert buffer), a feature that delays updates to secondary indexes that are not unique and merges the writes. It's further described by Facebook here. It's not shown in the graph above, but it can boost insert performance by quite a lot, and it's enabled by default. It was greatly improved in MySQL 5.5, so it is time to upgrade if you haven't.

  3. Partitioning may reduce the size of indexes, effectively splitting the table into many smaller tables. It also reduces internal index->lock contention, something that has been greatly improved in the MySQL 5.7.2 DMR.

  4. Use InnoDB page compression. For some workloads (particularly those with lots of char/varchar/text data types) compression will allow the data to be more compact, stretching out that performance curve for longer. It may also allow you to more easily justify SSDs, which are typically smaller in capacity. InnoDB page compression was improved a lot in MySQL 5.6, courtesy of Facebook providing a series of patches.

  5. Sort and bulk load data into tables. Inserting in order results in fewer page splits (which perform worse on tables that are not in memory). Bulk loading is not specifically related to table size, but it helps reduce redo log pressure.

  6. Remove any unnecessary indexes on the table, paying particular attention to UNIQUE indexes as these disable change buffering. Don't use a UNIQUE index if you have no reason for that constraint; prefer a regular INDEX.

  7. Related to points 5 and 6, the type of primary key also matters. It is much better to use either an INT or BIGINT datatype than, say, a GUID, which will have a curve that degrades much faster. Having no PRIMARY KEY will also affect performance negatively.

  8. If bulk loading a fresh table, delay creating any indexes besides the PRIMARY KEY. If you create them once all data is loaded, then InnoDB is able to apply a pre-sort and bulk load process which is both faster and typically results in more compact indexes. This optimization became available in MySQL 5.5.

  9. More memory can actually help here too. I frequently see people under-spec memory on new database servers compared to what it actually costs these days. Simple advice: if SHOW ENGINE INNODB STATUS shows any reads/s under BUFFER POOL AND MEMORY and the number of Free buffers (also under BUFFER POOL AND MEMORY) is zero, you could benefit from more memory (assuming you have sized innodb_buffer_pool_size correctly on your server; see here).

  10. As well as memory, SSDs can help too. Much of the performance drop shown on the curve can be attributed to additional IO which is created as the table gets bigger. While a hard drive can do about 200 operations per second (IOPS), a typical SSD will do 20K+.
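The memory check from point 9 can be scripted. What follows is a minimal sketch that assumes you have already captured the text of SHOW ENGINE INNODB STATUS by some other means. The sample text is an illustrative excerpt rather than real server output, and the helper name needs_more_memory is made up for this example.

```python
import re

# Illustrative excerpt only; real output has many more lines and sections.
SAMPLE_STATUS = """\
----------------------
BUFFER POOL AND MEMORY
----------------------
Buffer pool size   8191
Free buffers       0
Database pages     8100
Pages read 12345, created 678, written 91011
35.20 reads/s, 0.00 creates/s, 4.10 writes/s
--------------
ROW OPERATIONS
--------------
"""


def needs_more_memory(status_text):
    """Rule of thumb from point 9: reads/s above zero while Free buffers is zero."""
    # Only look at text after the section header, so that reads/s figures
    # printed in earlier sections are not picked up by mistake.
    _, header, section = status_text.partition("BUFFER POOL AND MEMORY")
    if not header:
        raise ValueError("BUFFER POOL AND MEMORY section not found")
    free = re.search(r"Free buffers\s+(\d+)", section)
    reads = re.search(r"([\d.]+) reads/s", section)
    if free is None or reads is None:
        raise ValueError("expected counters not found")
    return float(reads.group(1)) > 0 and int(free.group(1)) == 0


if __name__ == "__main__":
    print(needs_more_memory(SAMPLE_STATUS))
```

The caveat in point 9 still applies: the result only means something if innodb_buffer_pool_size is already sized sensibly for the server.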

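Points 5 and 7 both come down to insert order. The toy model below sketches the effect under loud assumptions: leaf pages are plain sorted lists with a fixed capacity, a full page in the middle of the index is split in half, and an insert past the end of the last page simply starts a new page. InnoDB's real page handling is more involved than this, and the name load and the sizes are made up for illustration.

```python
import bisect
import random


def load(keys, capacity=100):
    """Insert distinct keys one by one; return (leaf pages used, keys moved)."""
    pages = [[]]  # each page is a sorted list of keys
    firsts = [float("-inf")]  # lowest key each page is responsible for
    moved = 0
    for key in keys:
        i = bisect.bisect_right(firsts, key) - 1  # page that should hold key
        page = pages[i]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if i == len(pages) - 1 and key > page[-1]:
            # Append past the end: start a fresh page, leave the full one alone.
            pages.append([key])
            firsts.append(key)
        else:
            # Insert into the middle: move the upper half to a new page first.
            mid = capacity // 2
            right = page[mid:]
            del page[mid:]
            moved += len(right)
            pages.insert(i + 1, right)
            firsts.insert(i + 1, right[0])
            bisect.insort(page if key < right[0] else right, key)
    return len(pages), moved


if __name__ == "__main__":
    ordered = list(range(10000))  # stands in for an auto-increment INT key
    shuffled = ordered[:]
    random.Random(7).shuffle(shuffled)  # stands in for a random GUID-like key
    print("ordered: ", load(ordered))
    print("shuffled:", load(shuffled))
```

In this model the ordered load fills every page completely and never moves a key, while the shuffled load ends up with more, partly empty pages and moves keys on every split. That is the same direction of effect described for sorted loads and for INT versus GUID primary keys.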
Reposted from: https://www.cnblogs.com/davidwang456/p/4930538.html
