Original article:
https://blog.feabhas.com/2009/09/mutex-vs-semaphores-%E2%80%93-part-1-semaphores/

Mutex and Semaphore

It never ceases to amaze me how often I see postings in newsgroups and elsewhere asking about the difference between a semaphore and a mutex. What baffles me even more is that over 90% of the time the responses given are either incorrect or miss the key differences. The most often quoted response is “The Toilet Example (c) Copyright 2005, Niclas Winquist”, which summarises the differences as:

A mutex is really a semaphore with value 1

No, no and no again. Unfortunately this kind of talk leads to all sorts of confusion and misunderstanding (not to mention companies like Wind River Systems redefining a mutex as a “Mutual-Exclusion Semaphore”; now where is that wall to bang my head against?).

Firstly we need to clarify some terms, and this is best done by revisiting the roots of the semaphore. Back in 1965, Edsger Dijkstra, a Dutch computer scientist, introduced the concept of a binary semaphore into modern programming to address possible race conditions in concurrent programs. His very simple idea was to use a pair of function calls to the operating system to indicate entering and leaving a critical region. This was achieved through the acquisition and release of an operating system resource called a semaphore. In his original work, Dijkstra used the notation P and V, from the Dutch words Prolaag (P), a neologism derived from "probeer te verlagen" (to try to lower), and Verhogen (V), to raise or increase.

With this model the first task arriving at the P(S) call [where S is the semaphore] gains access to the critical region. If a context switch happens while that task is in the critical region, and another task also calls P(S), then that second task (and any subsequent tasks) will be blocked from entering the critical region: the operating system puts it into a waiting state. At a later point the first task is rescheduled and calls V(S) to indicate it has left the critical region. The second task will now be allowed access to the critical region.
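The P/V pairing described above can be sketched with Python's standard `threading.Semaphore`. This is only an illustration: the wrapper names `P` and `V`, the `worker` function and the shared counter are assumptions made for the example, not part of Dijkstra's work or of any RTOS API.

```python
import threading

S = threading.Semaphore(1)   # binary semaphore: initial count of 1
shared_total = 0             # the resource protected by the critical region


def P(sem):
    """Dijkstra's P: wait until the count is positive, then decrement it."""
    sem.acquire()


def V(sem):
    """Dijkstra's V: increment the count, waking one waiting task if any."""
    sem.release()


def worker(iterations):
    global shared_total
    for _ in range(iterations):
        P(S)                  # enter the critical region
        shared_total += 1     # at most one task executes this at a time
        V(S)                  # leave the critical region


def run_workers(n_tasks=4, iterations=10_000):
    """Start n_tasks threads and return the final value of the counter."""
    global shared_total
    shared_total = 0
    tasks = [threading.Thread(target=worker, args=(iterations,))
             for _ in range(n_tasks)]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    return shared_total
```

Because every increment happens between a P(S) and a V(S) in the same task, `run_workers(4, 10_000)` should always return 40000.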

A variant of Dijkstra’s semaphore was put forward by another Dutchman, Dr. Carel S. Scholten. In his proposal the semaphore can have an initial value (or count) greater than one. This enables building programs where more than one resource is being managed in a given critical region. For example, a counting semaphore could be used to manage the parking spaces in a robotic parking system. The initial count would be set to the number of free parking places. Each time a place is used the count is decremented. If the count reaches zero then the next task trying to acquire the semaphore is blocked (i.e. it must wait until a parking space is available). Upon releasing the semaphore (a car leaving the parking system) the count is incremented by one.
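A minimal sketch of the parking example, assuming Python's `threading.Semaphore` as the counting semaphore. The `ParkingSystem` class and its method names are hypothetical and exist only for this illustration.

```python
import threading


class ParkingSystem:
    """Toy model: one counting semaphore tracks the free parking places."""

    def __init__(self, free_places):
        # The initial count is the number of free places.
        self._places = threading.Semaphore(free_places)

    def enter(self, timeout=None):
        """Take a place. Blocks while the count is zero.

        Returns True once a place has been taken, or False if `timeout`
        seconds pass without one becoming free.
        """
        return self._places.acquire(timeout=timeout)

    def leave(self):
        """A car leaves: the count is incremented by one."""
        self._places.release()


park = ParkingSystem(free_places=2)
got_first = park.enter(timeout=0.1)    # count 2 -> 1
got_second = park.enter(timeout=0.1)   # count 1 -> 0
got_third = park.enter(timeout=0.1)    # count is 0: gives up after 0.1 s
park.leave()                           # count 0 -> 1
got_after_leave = park.enter(timeout=0.1)
```

With two places, the third `enter` call times out, and a place only becomes available again after `leave` is called.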

Scholten’s semaphore is referred to as the General or Counting Semaphore, while Dijkstra’s is known as the Binary Semaphore.

Pretty much all modern Real-Time Operating Systems (RTOS) support the semaphore. For the majority, the actual implementation is based around the counting semaphore concept. Programmers using these RTOSs may use an initial count of 1 (one) to approximate the binary semaphore. One of the most notable exceptions is probably the leading commercial RTOS, VxWorks from Wind River Systems. It has two separate APIs for semaphore creation, one for the binary semaphore (semBCreate) and another for the counting semaphore (semCCreate).

Hopefully we now have a clear understanding of the difference between the binary semaphore and the counting semaphore. Before moving on to the mutex we need to understand the inherent dangers associated with using the semaphore. These include:
* Accidental release
* Recursive deadlock
* Task-death deadlock
* Priority inversion
* Semaphore as a signal
All these problems occur at run-time and can be very difficult to reproduce, which makes technical support very difficult.

Accidental release

This problem arises mainly from a bug fix, product enhancement or cut-and-paste mistake. Through a simple programming mistake, a semaphore is released by a task that never correctly acquired it.


When the counting semaphore is being used as a binary semaphore (initial count of 1, the most common case), an accidental release allows two tasks into the critical region. Each time the buggy code is executed the count is incremented and yet another task can enter. This is an inherent weakness of using the counting semaphore as a binary semaphore.
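The effect of an accidental release can be reproduced in a few lines. The sketch below assumes Python's `threading.Semaphore`, whose count is not capped at its initial value; `buggy_path` is a hypothetical function standing in for the cut-and-paste mistake.

```python
import threading


def buggy_path(sem):
    # Bug: the matching sem.acquire() was lost during an edit,
    # so this path releases a semaphore it never acquired.
    sem.release()


def demo_accidental_release():
    sem = threading.Semaphore(1)    # counting semaphore used as a binary one
    buggy_path(sem)                 # count silently goes from 1 to 2

    # Non-blocking acquires stand in for three tasks trying to enter.
    first = sem.acquire(blocking=False)
    second = sem.acquire(blocking=False)   # a second task gets in as well
    third = sem.acquire(blocking=False)
    return first, second, third
```

`demo_accidental_release()` returns `(True, True, False)`: after one stray release, two tasks are admitted to a region meant for one. In Python specifically, `threading.BoundedSemaphore` raises `ValueError` when a release would push the count above its initial value, which catches this class of bug at the point of the stray release.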

Deadlock

Deadlock occurs when tasks are blocked waiting on some condition that can never become true, e.g. waiting to acquire a semaphore that never becomes free. There are three possible deadlock situations associated with the semaphore:

  • Recursive Deadlock
  • Deadlock through Death
  • Cyclic Deadlock (Deadly Embrace)

Here we shall address the first two, and shall return to the cyclic deadlock in a later posting.

Recursive Deadlock

Recursive deadlock can occur if a task tries to lock a semaphore it has already locked. This typically occurs in libraries or recursive functions; for example, a library routine that takes the simple lock protecting malloc and then, while still holding it, calls another routine that takes the same lock. An example of this appeared in the MySQL database bug reporting system: Bug #24745 InnoDB semaphore wait timeout/crash – deadlock waiting for itself
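A sketch of how a library can end up waiting for itself, assuming Python's `threading.Semaphore`. The names `pool_lock`, `allocate` and `allocate_pair` are hypothetical, and a timeout is used so the example reports the self-deadlock instead of hanging forever.

```python
import threading

pool_lock = threading.Semaphore(1)   # hypothetical lock guarding an allocator


def allocate(size, timeout=0.1):
    """Allocate a buffer while holding pool_lock."""
    if not pool_lock.acquire(timeout=timeout):
        # Without the timeout this call would block forever.
        raise TimeoutError("pool_lock is already held")
    try:
        return bytearray(size)
    finally:
        pool_lock.release()


def allocate_pair(size):
    """Buggy caller: takes pool_lock, then calls a function that takes it again."""
    pool_lock.acquire()
    try:
        # allocate() waits on the semaphore this very task is holding.
        return allocate(size), allocate(size)
    finally:
        pool_lock.release()
```

Calling `allocate_pair(8)` raises `TimeoutError`, because the semaphore has no notion of which task holds it and therefore cannot let the holder through a second time. A plain `allocate(8)` still succeeds afterwards.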

Deadlock through Task Death

What if a task that is holding a semaphore dies or is terminated? If you can’t detect this condition then all tasks waiting for the semaphore (or that may wait for it in the future) will never acquire it and will deadlock. To partially address this, it is common for the function call that acquires the semaphore to accept an optional timeout value.
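The timeout workaround can be sketched as follows, assuming Python's `threading.Semaphore`. Task death is simulated by a thread that returns without releasing; the function names are made up for the example.

```python
import threading


def holder_that_dies(sem):
    """Acquire the semaphore, then end without ever releasing it."""
    sem.acquire()
    # The task terminates here; nothing releases the semaphore on its behalf.


def acquire_with_timeout(sem, timeout):
    """Return True if the semaphore was acquired, False on timeout."""
    return sem.acquire(timeout=timeout)


def demo_task_death():
    sem = threading.Semaphore(1)
    dead = threading.Thread(target=holder_that_dies, args=(sem,))
    dead.start()
    dead.join()                       # the holder is gone, the count is still 0
    return acquire_with_timeout(sem, timeout=0.1)
```

`demo_task_death()` returns `False`. The timeout only lets the waiting task notice that something is wrong; it does not recover the semaphore, which is why the text calls this a partial fix.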

Priority Inversion

The majority of RTOSs use a priority-driven pre-emptive scheduling scheme. In this scheme each task has its own assigned priority. The pre-emptive scheme ensures that a higher priority task will force a lower priority task to release the processor so it can run. This is a core concept in building real-time systems using an RTOS. Priority inversion is the case where a high priority task becomes blocked for an indefinite period by a low priority task. As an example:

  • An embedded system contains an “information bus”.
  • Sequential access to the bus is protected with a semaphore.
  • A bus management task runs frequently with a high priority to move certain kinds of data in and out of the information bus.
  • A meteorological data gathering task runs as an infrequent, low priority task, using the information bus to publish its data. When publishing its data, it acquires the semaphore, writes to the bus, and releases the semaphore.
  • The system also contains a communications task which runs with medium priority.
  • Very infrequently it is possible for an interrupt to occur that causes the (medium priority) communications task to be scheduled while the (high priority) information bus task is blocked waiting for the (low priority) meteorological data task.
  • In this case, the long-running communications task, having higher priority than the meteorological task, prevents it from running, consequently preventing the blocked information bus task from running.
  • After some time has passed, a watchdog timer goes off, notices that the data bus task has not been executed for some time, concludes that something has gone drastically wrong, and initiates a total system reset.

This well-reported sequence of events actually happened on NASA JPL’s Mars Pathfinder spacecraft.
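The sequence above can be replayed with a toy, tick-based model of a priority-driven pre-emptive scheduler. This is a deliberately simplified simulation written for this illustration, not RTOS code and not the Pathfinder software: the task names, priorities, release times and step lists are all assumptions.

```python
def simulate(tasks, ticks):
    """Run the highest-priority runnable task for one step per tick.

    A task whose next step is "take" is not runnable while another task
    holds the (single) semaphore. Returns the name that ran on each tick.
    """
    holder = None                                  # task holding the semaphore
    next_step = {t["name"]: 0 for t in tasks}
    trace = []
    for now in range(ticks):
        runnable = []
        for t in tasks:
            i = next_step[t["name"]]
            if now < t["release"] or i >= len(t["steps"]):
                continue                           # not released yet, or finished
            if t["steps"][i] == "take" and holder is not None:
                continue                           # blocked on the semaphore
            runnable.append(t)
        if not runnable:
            trace.append("idle")
            continue
        running = max(runnable, key=lambda t: t["priority"])
        step = running["steps"][next_step[running["name"]]]
        if step == "take":
            holder = running["name"]
        elif step == "give":
            holder = None
        next_step[running["name"]] += 1
        trace.append(running["name"])
    return trace


SCENARIO = [
    # low priority: publishes data to the bus under the semaphore
    {"name": "meteo", "priority": 1, "release": 0,
     "steps": ["take", "work", "work", "give"]},
    # high priority: needs the same semaphore
    {"name": "bus", "priority": 3, "release": 1,
     "steps": ["take", "work", "give"]},
    # medium priority: long-running, never touches the semaphore
    {"name": "comms", "priority": 2, "release": 2,
     "steps": ["work"] * 5},
]

trace = simulate(SCENARIO, ticks=12)
```

In this model `bus` becomes ready at tick 1 but does not run until tick 9: `comms` occupies ticks 2 to 6 even though its priority is lower than that of `bus`, because the only task that can unblock `bus` is the still lower priority `meteo`.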

Semaphore as a Signal

Unfortunately, the term synchronization is often misused in the context of mutual exclusion. Synchronization is, by definition, “To occur at the same time; be simultaneous”. Synchronization between tasks is where, typically, one task waits to be notified by another task before it can continue execution (unilateral rendezvous). A variant of this, where either task may wait, is called the bidirectional rendezvous. This is quite different from mutual exclusion, which is a protection mechanism. However, this misuse has arisen because the counting semaphore can be used for unidirectional synchronization. For this to work, the semaphore is created with a count of 0 (zero).

Note that the P and V calls are not used as a pair in the same task. For example, assuming Task 1 calls P(S), it will block. When Task 2 later calls V(S), the unilateral synchronization takes place and both tasks are ready to run (with the higher priority task actually running). Unfortunately, “misusing” the semaphore as a synchronization primitive can be problematic: it makes debugging harder and increases the potential to miss “accidental release” type problems, as a V(S) on its own (i.e. not paired with a P(S)) is now considered legal code.
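A minimal sketch of this unilateral rendezvous, assuming Python's `threading.Semaphore` created with a count of 0. The task functions and the `events` list are illustrative only.

```python
import threading


def demo_signal():
    S = threading.Semaphore(0)       # count of 0: the first P(S) must block
    events = []

    def task1():
        S.acquire()                  # P(S): waits to be notified by task2
        events.append("task1 resumed")

    def task2():
        events.append("task2 signals")
        S.release()                  # V(S) with no matching P(S) in this task

    t1 = threading.Thread(target=task1)
    t2 = threading.Thread(target=task2)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return events
```

`demo_signal()` returns `["task2 signals", "task1 resumed"]`, since task1 cannot get past its P(S) until task2 has issued the V(S). The lone `S.release()` is correct here, which is exactly why a stray release elsewhere in such a code base no longer stands out as a bug.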

In the next posting I shall look at how the mutex addresses most of the weaknesses of the semaphore.
