Original source: https://dev.mysql.com/doc/refman/5.1/en/partitioning-hash.html

HASH Partitioning

Partitioning by HASH is used primarily to ensure an even distribution of data among a predetermined number of partitions. With range or list partitioning, you must specify explicitly into which partition a given column value or set of column values is to be stored; with hash partitioning, MySQL takes care of this for you, and you need only specify a column value or expression based on a column value to be hashed and the number of partitions into which the partitioned table is to be divided.

To partition a table using HASH partitioning, it is necessary to append to the CREATE TABLE statement a PARTITION BY HASH (expr) clause, where expr is an expression that returns an integer. This can simply be the name of a column whose type is one of MySQL's integer types. In addition, you most likely want to follow this with PARTITIONS num, where num is a positive integer representing the number of partitions into which the table is to be divided.

Note

For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this table must be part of every unique key, including the primary key. See Section 18.5.1, “Partitioning Keys, Primary Keys, and Unique Keys”, for more information.

The following statement creates a table that uses hashing on the store_id column and is divided into 4 partitions:

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(store_id)
PARTITIONS 4;

If you do not include a PARTITIONS clause, the number of partitions defaults to 1.

Using the PARTITIONS keyword without a number following it results in a syntax error.

You can also use an SQL expression that returns an integer for expr. For instance, you might want to partition based on the year in which an employee was hired. This can be done as shown here:

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;

expr must return a nonconstant, nonrandom integer value (in other words, it should be varying but deterministic), and must not contain any prohibited constructs as described in Section 18.5, “Restrictions and Limitations on Partitioning”. You should also keep in mind that this expression is evaluated each time a row is inserted or updated (or possibly deleted); this means that very complex expressions may give rise to performance issues, particularly when performing operations (such as batch inserts) that affect a great many rows at one time.

The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for “pruning” on ranges of partitions. That is, the more closely that the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.

For example, where date_col is a column of type DATE, then the expression TO_DAYS(date_col) is said to vary directly with the value of date_col, because for every change in the value of date_col, the value of the expression changes in a consistent manner. The variance of the expression YEAR(date_col) with respect to date_col is not quite as direct as that of TO_DAYS(date_col), because not every possible change in date_col produces an equivalent change in YEAR(date_col). Even so, YEAR(date_col) is a good candidate for a hashing function, because it varies directly with a portion of date_col and there is no possible change in date_col that produces a disproportionate change in YEAR(date_col).
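The difference between the two expressions can be sketched as follows. The table name employees_by_day is hypothetical and is not one of the original examples; it reuses the hired column from the employees table shown earlier.

```sql
-- Hypothetical variant of the employees table, hashed on the day number
-- of the hire date instead of on its year.
CREATE TABLE employees_by_day (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01'
)
PARTITION BY HASH( TO_DAYS(hired) )
PARTITIONS 4;

-- Two consecutive dates always differ by exactly 1 in TO_DAYS(),
-- whereas YEAR() changes only when a year boundary is crossed.
SELECT TO_DAYS('2005-12-31') AS d1,
       TO_DAYS('2006-01-01') AS d2,
       YEAR('2005-12-31')    AS y1,
       YEAR('2006-01-01')    AS y2;
```

With TO_DAYS(hired), rows hired on consecutive days land in consecutive partitions (modulo 4). With YEAR(hired), every row hired in the same year lands in the same partition.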

By way of contrast, suppose that you have a column named int_col whose type is INT. Now consider the expression POW(5-int_col,3) + 6. This would be a poor choice for a hashing function because a change in the value of int_col is not guaranteed to produce a proportional change in the value of the expression. Changing the value of int_col by a given amount can produce widely different changes in the value of the expression. For example, changing int_col from 5 to 6 produces a change of -1 in the value of the expression, but changing the value of int_col from 6 to 7 produces a change of -7 in the expression value.
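The figures in this example can be checked directly. The following query is only an illustrative sketch that substitutes the literal values 5, 6, and 7 for int_col; the column aliases are made up for readability.

```sql
-- Evaluate POW(5 - int_col, 3) + 6 for int_col = 5, 6, and 7.
SELECT POW(5 - 5, 3) + 6 AS at_5,   --  0 + 6 =  6
       POW(5 - 6, 3) + 6 AS at_6,   -- -1 + 6 =  5
       POW(5 - 7, 3) + 6 AS at_7;   -- -8 + 6 = -2
```

Two equal steps of 1 in int_col therefore move the expression first by -1 (from 6 to 5) and then by -7 (from 5 to -2).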

In other words, the more closely the graph of the column value versus the value of the expression follows a straight line as traced by the equation y=cx where c is some nonzero constant, the better the expression is suited to hashing. This has to do with the fact that the more nonlinear an expression is, the more uneven the distribution of data among the partitions it tends to produce.

In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.

When PARTITION BY HASH is used, MySQL determines which partition of num partitions to use based on the modulus of the result of the user function. In other words, for an expression expr, the partition in which the record is stored is partition number N, where N = MOD(expr, num). Suppose that table t1 is defined as follows, so that it has 4 partitions:

CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
PARTITION BY HASH( YEAR(col3) )
PARTITIONS 4;

If you insert a record into t1 whose col3 value is '2005-09-15', then the partition in which it is stored is determined as follows:

MOD(YEAR('2005-09-15'),4)
=  MOD(2005,4)
=  1
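One way to confirm this placement is sketched below. It assumes that t1 has been created as shown above and is still empty; the values inserted into col1 and col2 are arbitrary placeholders.

```sql
-- The modulus can be computed directly.
SELECT MOD(YEAR('2005-09-15'), 4);

-- Insert the row, then list the row count of each partition of t1.
-- Partitions that are not named explicitly are called p0, p1, p2, ...
INSERT INTO t1 VALUES (1, 'abc', '2005-09-15');

SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 't1';
```

The new row should be counted under p1. Note that TABLE_ROWS is exact for MyISAM tables but only an estimate for InnoDB tables, so on InnoDB the counts may not match exactly.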

MySQL 5.1 also supports a variant of HASH partitioning known as linear hashing which employs a more complex algorithm for determining the placement of new rows inserted into the partitioned table. See Section 18.2.3.1, “LINEAR HASH Partitioning”, for a description of this algorithm.

The user function is evaluated each time a record is inserted or updated. Depending on the circumstances, it may also be evaluated when records are deleted.

Note

If a table to be partitioned has a UNIQUE key, then any columns supplied as arguments to the HASH user function or to the KEY's column_list must be part of that key.
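As a minimal sketch of this rule, a table that hashes on store_id can still have a primary key, provided that store_id is one of the key's columns. The table name employees_keyed and its key layout are hypothetical and are not part of the original examples.

```sql
-- Hypothetical table: store_id is used in the partitioning expression,
-- so it must appear in every unique key, including the primary key.
CREATE TABLE employees_keyed (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    store_id INT NOT NULL,
    PRIMARY KEY (id, store_id)
)
PARTITION BY HASH(store_id)
PARTITIONS 4;
```

Declaring PRIMARY KEY (id) alone on the same table would be expected to make the statement fail, because store_id would then be missing from the key.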

Reposted from: https://www.cnblogs.com/davidwang456/p/4668332.html
