Original article: https://www.codeproject.com/Articles/359654/important-database-designing-rules-which-I-fo

Damn it, I am fairly sure I have seen a Chinese version of this article somewhere. What is the world coming to, with plagiarism everywhere.

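The anchors in the original are broken, either because they are too long or because of a formatting problem, so they need to be fixed here.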

Table of Contents

  • Introduction
  • Rule 1: What is the nature of the application (OLTP or OLAP)?
  • Rule 2: Break your data into logical pieces, make life simpler
  • Rule 3: Do not get overdosed with rule 2
  • Rule 4: Treat duplicate non-uniform data as your biggest enemy
  • Rule 5: Watch for data separated by separators
  • Rule 6: Watch for partial dependencies
  • Rule 7: Choose derived columns preciously
  • Rule 8: Do not be hard on avoiding redundancy, if performance is the key
  • Rule 9: Multidimensional data is a different beast altogether
  • Rule 10: Centralize name value table design
  • Rule 11: For unlimited hierarchical data self-reference PK and FK

Courtesy: Image from Motion pictures

Introduction

Before you start reading this article, let me confirm that I am not a guru in database design. The 11 points below are what I have learned from projects, my own experience, and my own reading. I personally think they have helped me a lot when it comes to DB design. Any criticism is welcome.

The reason I am writing a full-blown article is that when developers design a database, they tend to treat the three normal forms as a silver bullet. They tend to think normalization is the only way of designing. Due to this mindset, they sometimes hit roadblocks as the project moves ahead.

If you are new to normalization, see “3 normal forms in action”, which explains all three normal forms step by step.

All said and done, normalization rules are important guidelines, but treating them as set in stone is asking for trouble. Below are my own 11 rules, which I keep at the top of my head while doing DB design.

Rule 1: What is the nature of the application (OLTP or OLAP)?

When you start your database design, the first thing to analyze is the nature of the application you are designing for: is it transactional or analytical? You will find many developers applying normalization rules by default without thinking about the nature of the application, and then later running into performance and customization issues. As said, there are two kinds of applications, transactional and analytical; let’s understand what these types are.

Transactional: In this kind of application, your end user is more interested in CRUD, i.e., creating, reading, updating, and deleting records. The formal name for this kind of database is OLTP (online transaction processing).

Analytical: In this kind of application, your end user is more interested in analysis, reporting, forecasting, etc. These databases see fewer inserts and updates. The main intention here is to fetch and analyze data as fast as possible. The formal name for this kind of database is OLAP (online analytical processing).

In other words, if you think inserts, updates, and deletes are more prominent, go for a normalized table design; otherwise, create a flat denormalized database structure.

The simple diagram below shows the names and addresses on the left-hand side as a normalized table, and the flat table structure we get by applying a denormalized design.
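
As a rough illustration of the two shapes, here is a minimal sketch using Python's built-in sqlite3 module. The table names (person, address, person_flat) and the sample rows are made up for this example and are not taken from the original diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (OLTP-style): each address is stored once, in its own table.
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE address (
        address_id INTEGER PRIMARY KEY,
        person_id  INTEGER NOT NULL REFERENCES person(person_id),
        city       TEXT NOT NULL
    );
    INSERT INTO person  VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO address VALUES (1, 1, 'Mumbai'), (2, 1, 'Pune'), (3, 2, 'Delhi');

    -- Denormalized (OLAP-style): one flat table, names repeated per address.
    CREATE TABLE person_flat AS
        SELECT p.name, a.city
        FROM person AS p
        JOIN address AS a ON a.person_id = p.person_id;
""")

# Reading the flat table needs no join at all.
flat_rows = conn.execute(
    "SELECT name, city FROM person_flat ORDER BY name, city"
).fetchall()
print(flat_rows)
```

The flat table is quicker to read for reporting, but renaming a person now means updating every row that repeats the name, which is why it suits read-heavy workloads better.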

Rule 2: Break your data into logical pieces, make life simpler

This rule is actually the first rule of 1st normal form. One sign that it is being violated is that your queries use too many string parsing functions such as substring, charindex, etc.; if so, this rule probably needs to be applied.

For instance you can see the below table which has student names; if you ever want to query student names having “Koirala” and not “Harisingh”, you can imagine what kind of a query you will end up with.

So the better approach would be to break this field into further logical pieces so that we can write clean and optimal queries.
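
To make the difference concrete, here is a small sketch with Python's sqlite3 module. The tables student_raw and student, the first_name/last_name split, and the sample names are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One column holding the whole name.
    CREATE TABLE student_raw (full_name TEXT NOT NULL);
    INSERT INTO student_raw VALUES
        ('Shiv Koirala'), ('Harisingh Koirala'), ('Raju Sharma');

    -- The same data broken into logical pieces.
    CREATE TABLE student (
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL
    );
    INSERT INTO student VALUES
        ('Shiv', 'Koirala'), ('Harisingh', 'Koirala'), ('Raju', 'Sharma');
""")

# Single column: the query has to pattern-match inside the string.
raw_hits = conn.execute(
    "SELECT full_name FROM student_raw "
    "WHERE full_name LIKE '%Koirala%' AND full_name NOT LIKE '%Harisingh%'"
).fetchall()

# Split columns: plain comparisons that an index could support.
split_hits = conn.execute(
    "SELECT first_name, last_name FROM student "
    "WHERE last_name = 'Koirala' AND first_name <> 'Harisingh'"
).fetchall()

print(raw_hits, split_hits)
```

Both queries find the same student here, but only the second one says exactly which part of the name is being compared.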

Rule 3: Do not get overdosed with rule 2

Developers are cute creatures. If you tell them this is the way, they keep doing it; in fact, they overdo it, leading to unwanted consequences. This also applies to rule 2, which we just talked about above. When you think about decomposing, pause and ask yourself: is it needed? As said, the decomposition should be logical.

For instance, take the phone number field; it’s rare that you will operate on the ISD codes of phone numbers separately (unless your application demands it). So it would be a wise decision to leave the field as it is, since splitting it can lead to more complications.

Rule 4: Treat duplicate non-uniform data as your biggest enemy

Focus on duplicate data and refactor it. My personal worry about duplicate data is not that it takes up hard disk space, but the confusion it creates.

For instance, in the diagram below, you can see that “5th Standard” and “Fifth standard” mean the same thing. You could say the data came into your system due to bad data entry or poor validation. Either way, a report built on this data would show the two values as different entities, which is very confusing from the end user’s point of view.

One solution would be to move the data into a separate master table and refer to it via foreign keys. You can see in the figure below how we have created a new master table called “Standards” and linked it using a simple foreign key.
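
A minimal sketch of such a master table, using Python's sqlite3 module. The column names and the sample students are assumptions; the original figure only tells us that the master table is called “Standards”. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE standards (
        standard_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    );
    CREATE TABLE students (
        roll_number INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        standard_id INTEGER NOT NULL REFERENCES standards(standard_id)
    );
    INSERT INTO standards VALUES (5, '5th Standard');
    INSERT INTO students  VALUES (1, 'Student A', 5), (2, 'Student B', 5);
""")

# The report groups on the single spelling kept in the master table.
report = conn.execute("""
    SELECT s.name, COUNT(*)
    FROM students AS st
    JOIN standards AS s ON s.standard_id = st.standard_id
    GROUP BY s.name
""").fetchall()

# A value that is not in the master table is rejected at insert time.
try:
    conn.execute("INSERT INTO students VALUES (3, 'Student C', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(report, rejected)
```

Because students can only point at rows in the master table, a second spelling such as “Fifth standard” has nowhere to enter the data.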

Rule 5: Watch for data separated by separators

The second rule of 1st normal form says to avoid repeating groups. One example of a repeating group is shown in the diagram below. If you look at the syllabus field closely, too much data is stuffed into one field. Fields like this are termed “repeating groups”. If we have to manipulate this data, the query will be complex, and I also have doubts about how well such queries would perform.

Columns that hold data stuffed together with separators need special attention; a better approach is to move those fields to a different table and link them with keys for better management.

So now let’s apply the second rule of 1st normal form: “Avoid repeating groups”. You can see in the above figure that I have created a separate syllabus table and then made a many-to-many relationship with the subject table.

With this approach, the syllabus field in the main table no longer repeats and no longer contains separator-delimited data.
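
The sketch below shows one way such a migration could look with Python's sqlite3 module. The schema is an assumption: the original figure is not available, so the tables student_raw, subject, and student_subject (the many-to-many link table) and the comma-separated sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: several subjects stuffed into one column with a separator.
    CREATE TABLE student_raw (roll_number INTEGER PRIMARY KEY, syllabus TEXT);
    INSERT INTO student_raw VALUES (1, 'Maths,Science'), (2, 'Science,English');

    -- After: one row per subject, plus a link table for the many-to-many part.
    CREATE TABLE subject (
        subject_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE student_subject (
        roll_number INTEGER NOT NULL,
        subject_id  INTEGER NOT NULL REFERENCES subject(subject_id),
        PRIMARY KEY (roll_number, subject_id)
    );
""")

# Split each stuffed value and store the pieces as separate rows.
raw_rows = conn.execute("SELECT roll_number, syllabus FROM student_raw").fetchall()
for roll_number, syllabus in raw_rows:
    for piece in syllabus.split(","):
        name = piece.strip()
        conn.execute("INSERT OR IGNORE INTO subject (name) VALUES (?)", (name,))
        subject_id = conn.execute(
            "SELECT subject_id FROM subject WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO student_subject VALUES (?, ?)", (roll_number, subject_id)
        )

# "Who studies Science?" is now a join instead of string parsing.
science_students = [row[0] for row in conn.execute("""
    SELECT ss.roll_number
    FROM student_subject AS ss
    JOIN subject AS s ON s.subject_id = ss.subject_id
    WHERE s.name = 'Science'
    ORDER BY ss.roll_number
""")]
print(science_students)
```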

Rule 6: Watch for partial dependencies

Watch for fields which depend partially on primary keys. For instance in the above table we can see the primary key is created on roll number and standard. Now watch the syllabus field closely. The syllabus field is associated with a standard and not with a student directly (roll number).

The syllabus is associated with the standard in which the student is studying and not directly with the student. So if tomorrow we want to update the syllabus we have to update it for each student, which is painstaking and not logical. It makes more sense to move these fields out and associate them with the Standard table.

You can see how we have moved the syllabus field and attached it to the Standards table.

This rule is nothing but the 2nd normal form: “All non-key columns should depend on the full primary key and not on just a part of it”.
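
The update problem can be seen in a small sketch with Python's sqlite3 module. Table names, column names, and sample values are assumptions made for this example; only the (roll number, standard) key and the syllabus field come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: syllabus depends only on `standard`, one half of the key.
    CREATE TABLE enrollment_before (
        roll_number INTEGER,
        standard    TEXT,
        syllabus    TEXT,
        PRIMARY KEY (roll_number, standard)
    );
    INSERT INTO enrollment_before VALUES
        (1, '5th', 'Old syllabus'),
        (2, '5th', 'Old syllabus'),
        (3, '6th', 'Old syllabus');

    -- After: syllabus lives with the standard it actually depends on.
    CREATE TABLE standards (
        standard TEXT PRIMARY KEY,
        syllabus TEXT NOT NULL
    );
    CREATE TABLE enrollment (
        roll_number INTEGER,
        standard    TEXT REFERENCES standards(standard),
        PRIMARY KEY (roll_number, standard)
    );
    INSERT INTO standards  VALUES ('5th', 'Old syllabus'), ('6th', 'Old syllabus');
    INSERT INTO enrollment VALUES (1, '5th'), (2, '5th'), (3, '6th');
""")

# Changing the 5th-standard syllabus touches one row per student before...
rows_before = conn.execute(
    "UPDATE enrollment_before SET syllabus = 'New syllabus' WHERE standard = '5th'"
).rowcount

# ...and exactly one row after the field has been moved.
rows_after = conn.execute(
    "UPDATE standards SET syllabus = 'New syllabus' WHERE standard = '5th'"
).rowcount

print(rows_before, rows_after)
```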

Rule 7: Choose derived columns preciously

If you are working on OLTP applications, getting rid of derived columns would be a good thought, unless there is some pressing reason for performance. In case of OLAP where we do a lot of summations, calculations, these kinds of fields are necessary to gain performance.

In the above figure you can see how the average field is dependent on the marks and subject. This is also one form of redundancy. So for such kinds of fields which are derived from other fields, give a thought: are they really necessary?

This rule is also known as the 3rd normal form: “No column should depend on other non-primary key columns”. My personal thought is that you should not apply this rule blindly; look at the situation, because redundant data is not always bad. If the redundant data is calculated data, weigh the situation and then decide whether you want to implement the 3rd normal form.
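
For the OLTP case, one way to avoid storing the derived value is to compute it on demand. The sketch below uses Python's sqlite3 module and a view; the marks table, the student_average view, and the sample scores are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (
        roll_number INTEGER,
        subject     TEXT,
        marks       INTEGER,
        PRIMARY KEY (roll_number, subject)
    );
    INSERT INTO marks VALUES (1, 'Maths', 80), (1, 'Science', 90), (2, 'Maths', 60);

    -- The average is derived when it is read, not stored as a column.
    CREATE VIEW student_average AS
        SELECT roll_number, AVG(marks) AS average
        FROM marks
        GROUP BY roll_number;
""")

averages = conn.execute(
    "SELECT roll_number, average FROM student_average ORDER BY roll_number"
).fetchall()

# A changed mark shows up in the average without any extra bookkeeping.
conn.execute("UPDATE marks SET marks = 70 WHERE roll_number = 2")
updated = conn.execute(
    "SELECT average FROM student_average WHERE roll_number = 2"
).fetchone()[0]

print(averages, updated)
```

The trade-off is that the average is recalculated on every read, which is exactly the cost a stored derived column avoids in OLAP-style workloads.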

Rule 8: Do not be hard on avoiding redundancy, if performance is the key

Do not make it a strict rule that you will always avoid redundancy. If there is a pressing need for performance, think about denormalization. In a normalized design, queries have to join many tables; in a denormalized design there are fewer joins, which improves read performance.

Rule 9: Multidimensional data is a different beast altogether

OLAP projects mostly deal with multidimensional data. For instance, in the figure below, you would like to get sales per country, customer, and date. In simple words, you are looking at sales figures at the intersection of three dimensions.

For such situations, a dimension and fact design is a better approach. In simple words, you create a central sales fact table which holds the sales amount field and connects to all the dimension tables through foreign key relationships.
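
A minimal sketch of a fact table with three dimensions, using Python's sqlite3 module. The table names (dim_country, dim_customer, dim_date, fact_sales), the columns, and the sample figures are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one per axis of analysis.
    CREATE TABLE dim_country  (country_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, iso_date TEXT NOT NULL);

    -- Central fact table: the measure plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        country_id   INTEGER NOT NULL REFERENCES dim_country(country_id),
        customer_id  INTEGER NOT NULL REFERENCES dim_customer(customer_id),
        date_id      INTEGER NOT NULL REFERENCES dim_date(date_id),
        sales_amount REAL NOT NULL
    );

    INSERT INTO dim_country  VALUES (1, 'India'), (2, 'USA');
    INSERT INTO dim_customer VALUES (1, 'Customer A'), (2, 'Customer B');
    INSERT INTO dim_date     VALUES (1, '2012-01-01'), (2, '2012-01-02');
    INSERT INTO fact_sales   VALUES
        (1, 1, 1, 100.0), (1, 2, 1, 50.0), (2, 1, 2, 75.0);
""")

# Slice the same facts along one dimension: total sales per country.
sales_by_country = conn.execute("""
    SELECT c.name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_country AS c ON c.country_id = f.country_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(sales_by_country)
```

Grouping by customer or date works the same way: join the fact table to the relevant dimension and aggregate.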

Rule 10: Centralize name value table design

Many times I have come across name-value tables. A name-value table has a key and some data associated with that key. For instance, in the figure below you can see we have a currency table and a country table. If you look at the data closely, each of them only has a key and a value.

For such kinds of tables, creating a central table and differentiating the data by using a type field makes more sense.
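
One possible shape for such a central table, sketched with Python's sqlite3 module. The table name lookup, its columns, the helper lookup_values, and the sample currencies and countries are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One central table; lookup_type tells currency rows from country rows.
    CREATE TABLE lookup (
        lookup_type TEXT NOT NULL,
        code        TEXT NOT NULL,
        value       TEXT NOT NULL,
        PRIMARY KEY (lookup_type, code)
    );
    INSERT INTO lookup VALUES
        ('currency', 'INR', 'Indian Rupee'),
        ('currency', 'USD', 'US Dollar'),
        ('country',  'IN',  'India'),
        ('country',  'US',  'United States');
""")


def lookup_values(connection, lookup_type):
    """Return the (code, value) pairs stored for one lookup type."""
    return connection.execute(
        "SELECT code, value FROM lookup WHERE lookup_type = ? ORDER BY code",
        (lookup_type,),
    ).fetchall()


print(lookup_values(conn, "currency"))
print(lookup_values(conn, "country"))
```

Adding a new kind of lookup then means inserting rows with a new lookup_type value rather than creating another table.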

Rule 11: For unlimited hierarchical data self-reference PK and FK

Many times we come across data with an unlimited parent-child hierarchy. For instance, consider a multi-level marketing scenario where a sales person can have multiple sales people below them. For such scenarios, a table whose foreign key references its own primary key lets you model any depth of hierarchy.

This article is not meant to say that you should not follow the normal forms; rather, do not follow them blindly. Look at your project's nature and the type of data you are dealing with first.
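
A minimal sketch of a self-referencing table, using Python's sqlite3 module. The table sales_person, the column parent_id, the helper downline, and the sample hierarchy are assumptions for illustration; walking the tree with a recursive common table expression is one common option, not something the original article prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- parent_id points back at person_id in the same table.
    CREATE TABLE sales_person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        parent_id INTEGER REFERENCES sales_person(person_id)
    );
    INSERT INTO sales_person VALUES
        (1, 'Root', NULL),
        (2, 'Child A', 1),
        (3, 'Child B', 1),
        (4, 'Grandchild', 2);
""")


def downline(connection, person_id):
    """Return (name, depth) for everyone below person_id, at any depth."""
    return connection.execute(
        """
        WITH RECURSIVE tree(person_id, name, depth) AS (
            SELECT person_id, name, 0
            FROM sales_person
            WHERE person_id = ?
            UNION ALL
            SELECT sp.person_id, sp.name, tree.depth + 1
            FROM sales_person AS sp
            JOIN tree ON sp.parent_id = tree.person_id
        )
        SELECT name, depth FROM tree WHERE depth > 0 ORDER BY depth, name
        """,
        (person_id,),
    ).fetchall()


print(downline(conn, 1))
```

The same table holds every level, so adding a deeper level requires no schema change, only another row with the right parent_id.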


Reposted from: https://www.cnblogs.com/tuhooo/p/8461375.html
