Concurrency Hazards: Solving 11 Likely Problems in Your Multithreaded Code
This article discusses:

This article uses the following technologies: multithreading, the .NET Framework
Forgotten Synchronization
Incorrect Granularity
Read and Write Tearing
Lock-Free Reordering
Reentrancy
Deadlock
Lock Convoys
Stampedes
Two-Step Dance
Priority Inversion
Patterns for Achieving Safety
Immutability
Purity
Isolation
- Forgotten Synchronization (no synchronization at all)
- Incorrect Granularity (the lock granularity is wrong)
- Read and Write Tearing (logically we read or write memory once, but underneath it may take two physical operations because of alignment or size)
- Lock-Free Reordering (this section also touches on recursive locks and deadlock)
- Lock Convoys (requests arrive at a lock faster than the critical section can service them, so many requests hang; the fix is to shrink the critical section and speed up its processing. This problem is entirely a product of the system's processing model, but it does cause real performance trouble)
- Two-Step Dance (roughly: A holds a lock and raises an event that wakes B, but B also needs the lock and cannot take it, so control switches back again, causing unnecessary context switches. I suspect this case is not very common, though)
- Priority Inversion (caused entirely by thread priorities: a low-priority thread holds the lock while a high-priority thread keeps being scheduled but fails to acquire it, again causing unnecessary context switches. It feels like the same problem as above with a different trigger; personally I think it could be solved entirely in the scheduler)
- Patterns for Achieving Safety (the last three points are advice on achieving thread safety)
- Immutability: data never changes.
- Purity: methods have no side effects.
- Isolation: each thread keeps its own private data.
The original English article follows:
Concurrency Hazards
Solving 11 Likely Problems In Your Multithreaded Code
Joe Duffy
Concurrency is everywhere. Server-side programs have long had to deal with a fundamentally concurrent programming model, and as multicore processors become more commonplace, client-side programs will have to as well. Along with the addition of concurrency comes the responsibility for ensuring safety. In other words, programs must continue to achieve the same level of robustness and reliability in the face of large amounts of logical concurrency and ever-changing degrees of physical hardware parallelism.
Correctly engineered concurrent code must live by an extra set of rules when compared to its sequential counterpart. Reads and writes from memory and access to shared resources need to be regulated using synchronization so that conflicts do not arise. Additionally, it often becomes necessary for threads to coordinate to get a certain job done.
As a direct result of these additional requirements, it becomes nontrivial to ensure that threads are continually making consistent and adequate forward progress. Synchronization and coordination deeply depend on timing, which is nondeterministic and hard to predict and test for.
What makes these attributes difficult is simply that it requires a mindset shift. There isn't a single API that you can learn or a snippet of code to copy and paste. There really is a fundamental set of concepts that you need to learn and become comfortable with. It's likely that certain languages and libraries can hide some concepts over time, but if you're doing concurrency today, you won't have that luxury. This article describes some of the more common challenges to be aware of and presents advice for coping with them in your software.
I'll start by looking at the kinds of things that often go wrong in concurrent programs. I call these "safety hazards" because they are so easy to stumble upon and because their consequences are typically quite undesirable. These hazards cause your programs to break by either crashing or corrupting memory.
A data race—or race condition—occurs when data is accessed concurrently from multiple threads. Specifically, it happens when one or more threads are writing a piece of data while one or more threads are also reading that piece of data. This problem arises because Windows programs (in C++ and the Microsoft .NET Framework alike) are fundamentally based on the concept of shared memory, where all threads in a process may access data residing in the same virtual address space. Static variables and heap allocations are two common ways in which data becomes shared.
Consider this canonical example:
//C#
static class Counter {
    internal static int s_curr = 0;
    internal static int GetNext() { return s_curr++; }
}
'Visual Basic
Friend Class Counter
    Friend Shared s_curr As Integer = 0
    Friend Shared Function GetNext() As Integer
        s_curr += 1
        Return s_curr
    End Function
End Class
The goal of Counter is presumably to hand out a new unique number for each call to GetNext. If two threads in the program both call GetNext simultaneously, however, two threads might be given the same number. The reason is that s_curr++ compiles into three separate steps:
- Read the current value from the shared s_curr variable into a processor register.
- Increment that register.
- Write the register value back to the shared s_curr variable.
Two threads executing this same sequence can both read the same value from s_curr locally (say, 42), increment it (to, say, 43), and publish the same resulting value. GetNext thus returns the same number for both threads, breaking the algorithm. Although the simple statement s_curr++ appears to be atomic, this couldn't be farther from the truth.
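The lost update is easy to replay deterministically. Here is a sketch (in Java rather than the article's C#, purely for illustration; the three "register" steps become locals) that interleaves the two threads' read-increment-write sequences by hand:

```java
public class LostUpdate {
    static int sCurr = 42; // the shared counter, as in the Counter example

    public static void main(String[] args) {
        // Interleave Thread A's and Thread B's three steps by hand:
        int regA = sCurr; // A: read the current value into a "register"
        int regB = sCurr; // B: read the same value before A writes back
        regA++;           // A: increment its register
        regB++;           // B: increment its register
        sCurr = regA;     // A: write back 43
        sCurr = regB;     // B: write back 43 again; A's update is lost
        System.out.println(regA + " " + regB + " " + sCurr); // prints: 43 43 43
    }
}
```

Both calls to GetNext would have handed out 43.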
Forgotten Synchronization
This is the simplest kind of data race: synchronization is forgotten altogether. Races of this kind can be, in rare situations, benign, meaning that they are correct, but most are fundamental problems of correctness.
Problems like this aren't always so obvious. For example, an object may be part of some large complicated object graph that happens to be reachable from a static variable or became shared by passing an object as part of a closure when creating a new thread or queuing work to a thread pool.
It is imperative to pay attention to when an object (graph) transitions from private to shared. This is called publication and will be discussed later in the context of isolation. The reverse is called privatization—that is, when an object (graph) goes from being shared to private again.
The solution is to add proper synchronization. In the example of the counter, I could use a simple interlocked:
//C#
static class Counter {
    internal static volatile int s_curr = 0;
    internal static int GetNext() { return Interlocked.Increment(ref s_curr); }
}
'Visual Basic
Friend Class Counter
    Friend Shared s_curr As Integer = 0
    Friend Shared Function GetNext() As Integer
        Return Interlocked.Increment(s_curr)
    End Function
End Class
This works because the update is confined to a single memory location, and because (conveniently) there is a hardware instruction (LOCK INC) that is equivalent to the software statement I'm trying to make atomic.
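For comparison, the JVM exposes the same hardware primitive through java.util.concurrent.atomic. A small sketch (Java, with AtomicInteger.incrementAndGet as the analogue of Interlocked.Increment):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    static final AtomicInteger sCurr = new AtomicInteger(0);

    static int getNext() {
        return sCurr.incrementAndGet(); // one atomic read-modify-write
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) getNext();
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();
        // Every increment survived; no updates were lost:
        System.out.println(sCurr.get()); // prints: 400000
    }
}
```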
Alternatively, I could use a full-fledged lock:
//C#
static class Counter {
    internal static int s_curr = 0;
    private static object s_currLock = new object();
    internal static int GetNext() {
        lock (s_currLock) { return s_curr++; }
    }
}
'Visual Basic
Friend Class Counter
    Friend Shared s_curr As Integer = 0
    Private Shared s_currLock As New Object()
    Friend Shared Function GetNext() As Integer
        SyncLock s_currLock
            s_curr += 1
            Return s_curr
        End SyncLock
    End Function
End Class
The lock statement ensures mutual exclusion among all threads trying to access GetNext, and uses the CLR System.Threading.Monitor class. C++ programs would use a CRITICAL_SECTION for the same purpose. There is no need to go with a lock for this particular example, but when multiple operations are involved it's seldom possible to fold them into a single interlocked operation.
Incorrect Granularity
Even if access to shared state occurs with proper synchronization, the resulting behavior may still be incorrect. The granularity must be sufficiently large that the operations that must be seen as atomic are encapsulated within the region. There is a tension between correctness and making the region smaller, because smaller regions reduce the time that other threads will have to wait to concurrently enter.
For example, take a look at the bank account abstraction shown in Figure 1. All is well, and the object's two methods, Deposit and Withdraw, appear to be concurrency-safe. Some banking application can use them without worry that balances will become corrupt due to concurrent access.
Figure 1 A Bank Account
//C#
class BankAccount {
    private decimal m_balance = 0.0M;
    private object m_balanceLock = new object();
    internal void Deposit(decimal delta) {
        lock (m_balanceLock) { m_balance += delta; }
    }
    internal void Withdraw(decimal delta) {
        lock (m_balanceLock) {
            if (m_balance < delta)
                throw new Exception("Insufficient funds");
            m_balance -= delta;
        }
    }
}
'Visual Basic
Friend Class BankAccount
    Private m_balance As Decimal = 0D
    Private m_balanceLock As New Object()
    Friend Sub Deposit(ByVal delta As Decimal)
        SyncLock m_balanceLock
            m_balance += delta
        End SyncLock
    End Sub
    Friend Sub Withdraw(ByVal delta As Decimal)
        SyncLock m_balanceLock
            If m_balance < delta Then
                Throw New Exception("Insufficient funds")
            End If
            m_balance -= delta
        End SyncLock
    End Sub
End Class
What if, however, you'd like to add a Transfer method? A naive (and incorrect) attempt would assume that, because Deposit and Withdraw are safe in isolation, they can be combined easily:
//C#
class BankAccount {
    internal static void Transfer(BankAccount a, BankAccount b, decimal delta) {
        Withdraw(a, delta);
        Deposit(b, delta);
    }
    // As before ...
}
'Visual Basic
Friend Class BankAccount
    Friend Shared Sub Transfer(ByVal a As BankAccount, ByVal b As BankAccount, ByVal delta As Decimal)
        Withdraw(a, delta)
        Deposit(b, delta)
    End Sub
    ' As before ...
End Class
This is incorrect. There is actually a period of time between the Withdraw and Deposit calls where the money is missing completely.
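A deterministic sketch of that window (Java; the comment marks where a concurrent auditor could observe the accounts):

```java
public class MissingMoney {
    static long balanceA = 100, balanceB = 0; // balances in cents
    static long observedTotal;

    public static void main(String[] args) {
        // Transfer 50 from A to B, pausing between the two steps:
        balanceA -= 50;                       // Withdraw(a, 50)
        observedTotal = balanceA + balanceB;  // another thread could run here
        balanceB += 50;                       // Deposit(b, 50)
        System.out.println(observedTotal);    // prints: 50, not 100
    }
}
```

Between the two calls the system's total is 50 cents short, even though each call is individually atomic.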
A correct implementation would need to acquire the locks on both a and b up front and then make the method calls:
//C#
class BankAccount {
    internal static void Transfer(BankAccount a, BankAccount b, decimal delta) {
        lock (a.m_balanceLock) {
            lock (b.m_balanceLock) {
                Withdraw(a, delta);
                Deposit(b, delta);
            }
        }
    }
    // As before ...
}
'Visual Basic
Friend Class BankAccount
    Friend Shared Sub Transfer(ByVal a As BankAccount, ByVal b As BankAccount, ByVal delta As Decimal)
        SyncLock a.m_balanceLock
            SyncLock b.m_balanceLock
                Withdraw(a, delta)
                Deposit(b, delta)
            End SyncLock
        End SyncLock
    End Sub
    ' As before ...
End Class
It turns out that this approach, while solving the granularity problem, is prone to deadlock. You'll see how to fix it later on.
Read and Write Tearing
As mentioned earlier, benign races allow you to access variables without synchronization. Reads and writes of aligned, naturally sized words—for example, pointer-sized things that are 32 bits (4 bytes) on 32-bit processors and 64 bits (8 bytes) on 64-bit processors—are atomic. If a thread is just reading a single variable that some other thread will write, and there aren't any complex invariants involved, you can sometimes skip the synchronization altogether thanks to this guarantee.
But beware. If you try this on a misaligned memory location, or a location that isn't naturally sized, you can encounter a read or write tearing. Tearing occurs because reading or writing such locations actually involves multiple physical memory operations. Concurrent updates can happen in between these, potentially causing the resultant value to be some blend of the before and after values.
As an example, imagine ThreadA sits in a loop and writes only 0x0L and 0xaaaabbbbccccddddL to a 64-bit variable s_x. ThreadB sits in a loop reading it (see Figure 2).
Figure 2 Tearing about to Happen
//C#
internal static long s_x; // note: C# does not permit volatile on 64-bit fields

void ThreadA() {
    int i = 0;
    while (true) {
        s_x = (i & 1) == 0 ? 0x0L : 0xaaaabbbbccccddddL;
        i++;
    }
}

void ThreadB() {
    while (true) {
        long x = s_x;
        Debug.Assert(x == 0x0L || x == 0xaaaabbbbccccddddL);
    }
}
You may be surprised to discover that ThreadB's assert might fire. The reason is that ThreadA's write will consist of two parts, the high 32 bits and the low 32 bits, in some order depending on the compiler. The same goes for ThreadB's read. Thus ThreadB could witness the values 0xaaaabbbb00000000L or 0x00000000ccccddddL.
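The torn values are just mixtures of the two halves. A sketch that composes them explicitly (Java; plain bit arithmetic, no threads needed):

```java
public class TornValues {
    static long tornHighFirst, tornLowFirst;

    public static void main(String[] args) {
        long oldVal = 0x0L;
        long newVal = 0xaaaabbbbccccddddL;
        long lowMask = 0xffffffffL;

        // The write of the high half finished; the low half is still old:
        tornHighFirst = (newVal & ~lowMask) | (oldVal & lowMask);
        // The write of the low half finished first instead:
        tornLowFirst = (oldVal & ~lowMask) | (newVal & lowMask);

        System.out.printf("%016x%n%016x%n", tornHighFirst, tornLowFirst);
        // prints: aaaabbbb00000000
        //         00000000ccccdddd
    }
}
```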
Lock-Free Reordering
Sometimes it's tempting to write lock-free code as a way of achieving better scalability and reliability. Doing so requires a deep understanding of the memory model of your target platform (see Vance Morrison's article "Memory Models: Understand the Impact of Low-Lock Techniques in Multithreaded Apps" at msdn.microsoft.com/magazine/cc163715 for details). Failure to understand and heed these rules can lead to memory reordering bugs. These happen because compilers and processors are free to reorder memory operations in the process of making optimizations.
As an example, assume s_x and s_y are both initialized to the value 0, as you see here:
//C#
internal static volatile int s_x = 0;
internal static volatile int s_xa = 0;
internal static volatile int s_y = 0;
internal static volatile int s_ya = 0;

void ThreadA() {
    s_x = 1;
    s_ya = s_y;
}

void ThreadB() {
    s_y = 1;
    s_xa = s_x;
}
'Visual Basic
Friend Shared s_x As Integer = 0
Friend Shared s_xa As Integer = 0
Friend Shared s_y As Integer = 0
Friend Shared s_ya As Integer = 0

Private Sub ThreadA()
    s_x = 1
    s_ya = s_y
End Sub

Private Sub ThreadB()
    s_y = 1
    s_xa = s_x
End Sub
Is it possible that, after ThreadA and ThreadB both run to completion, both s_ya and s_xa will contain the value 0? This seems ridiculous on its face. Either s_x = 1 or s_y = 1 will happen first, in which case the other thread will witness the update when it gets around to its update. Or so the theory goes.
Unfortunately, processors are free to reorder this code so that the loads effectively happen before the writes. You can get around this with an explicit memory barrier:
//C#
void ThreadA() {
    s_x = 1;
    Thread.MemoryBarrier();
    s_ya = s_y;
}
'Visual Basic
Private Sub ThreadA()
    s_x = 1
    Thread.MemoryBarrier()
    s_ya = s_y
End Sub
The .NET Framework offers this particular API, and C++ offers _MemoryBarrier and similar macros. But the moral of the story is not that you should insert memory barriers all over the place. The moral is that you should steer clear of lock-free code until you've mastered memory models, and even then tread with care.
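On the JVM the same experiment can be run with volatile fields, which (unlike .NET's volatile) carry the needed store-load barrier under the Java memory model, so the "both loads saw 0" outcome becomes impossible. A sketch that checks this empirically:

```java
public class StoreLoad {
    static volatile int x, y, xa, ya;
    static int bothZeroTrials = 0;

    public static void main(String[] args) throws InterruptedException {
        for (int trial = 0; trial < 2_000; trial++) {
            x = 0; y = 0; xa = 0; ya = 0;
            Thread a = new Thread(() -> { x = 1; ya = y; });
            Thread b = new Thread(() -> { y = 1; xa = x; });
            a.start(); b.start();
            a.join(); b.join();
            if (xa == 0 && ya == 0) bothZeroTrials++; // impossible with volatile
        }
        System.out.println(bothZeroTrials); // prints: 0
    }
}
```

Remove the volatile keyword and some trials may (depending on hardware and JIT decisions) end with both loads seeing 0.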
Reentrancy

In Windows, including Win32 and the .NET Framework, most locks support recursive acquires. That just means that, if the current thread already holds a lock and tries to acquire it again, the request will be satisfied. This makes it easier to compose bigger atomic operations out of smaller ones. In fact, the BankAccount example shown earlier depended on recursive acquires: Transfer called both Withdraw and Deposit, each of which redundantly acquired a lock that Transfer had already acquired.
This can be the source of problems, however, if you end up in a situation where a recursive acquire occurs when you didn't expect it to. This can happen as a result of reentrancy, which is possible due to either explicit callouts to dynamic code (such as virtual methods and delegates) or because of implicitly reentered code (such as STA message pumping and asynchronous procedure calls). For this reason, among others, it's a good idea to never make a call to a dynamic method from within a lock region.
For example, imagine some method that breaks invariants temporarily and then calls a delegate:
//C#
class C {
    private int m_x = 0;
    private object m_xLock = new object();
    private Action m_action = ...;
    internal void M() {
        lock (m_xLock) {
            m_x++;
            try { m_action(); }
            finally {
                Debug.Assert(m_x == 1);
                m_x--;
            }
        }
    }
}
'Visual Basic
Friend Class C
    Private m_x As Integer = 0
    Private m_xLock As New Object()
    Private m_action As Action = ...
    Friend Sub M()
        SyncLock m_xLock
            m_x += 1
            Try
                m_action()
            Finally
                Debug.Assert(m_x = 1)
                m_x -= 1
            End Try
        End SyncLock
    End Sub
End Class
C's method M ensures that m_x does not change. But there is a brief period of time where m_x is incremented by one before being decremented again. The callout to m_action looks innocent enough. Unfortunately, if it was a delegate accepted from users of the C class, it represents arbitrary code that can do anything it pleases. That includes calling back into the same instance's M method. And if that happens, the assert within the finally is apt to fire; there could be multiple calls to M active on the same stack, even though you didn't do it directly, which clearly can lead to m_x holding a value greater than 1.
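The same trap exists on the JVM, where synchronized blocks are also reentrant. A sketch (the names m and mX mirror the example above; maxSeen records how badly the invariant was broken):

```java
public class Reentrancy {
    static final Object lock = new Object();
    static int mX = 0;
    static int maxSeen = 0; // largest value mX ever reached
    static Runnable action; // "arbitrary user code" called under the lock

    static void m() {
        synchronized (lock) { // reentrant: the same thread may enter again
            mX++;
            maxSeen = Math.max(maxSeen, mX);
            try {
                Runnable a = action;
                action = null; // run the callback at most once
                if (a != null) a.run();
            } finally {
                mX--;
            }
        }
    }

    public static void main(String[] args) {
        action = Reentrancy::m; // the callback sneakily calls back into m()
        m();
        System.out.println(maxSeen); // prints: 2, so Assert(mX == 1) would fire
    }
}
```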
Deadlock

When multiple threads encounter a deadlock, the system simply stops responding. Several MSDN Magazine articles have described how deadlocks occur and some ways of tolerating them—including my own article, "No More Hangs: Advanced Techniques to Avoid and Detect Deadlocks in .NET Apps," at msdn.microsoft.com/magazine/cc163618, and Stephen Toub's October 2007 .NET Matters column at msdn.microsoft.com/magazine/cc163352—so the discussion here will be kept light. In summary, whenever a circular wait chain is created—such that some ThreadA is waiting for a resource held by ThreadB, which is also reflexively waiting for a resource held by ThreadA (perhaps indirectly by waiting for a third ThreadC or more)—it is possible for all forward progress to come to a grinding halt.
A common source of this problem is with mutually exclusive locks. In fact, the BankAccount example shown earlier suffers from this problem. If ThreadA tries to transfer $500 from account #1234 to account #5678 at the same time ThreadB tries to transfer $500 from #5678 to #1234, the code can deadlock.
Using a consistent acquisition order can avoid the deadlock, as shown in Figure 3. This logic can be generalized to something called simultaneous lock acquisition, whereby multiple lockable objects are sorted dynamically according to some ordering among locks, so that any place two locks must be held at the same time they are acquired in a consistent order. Another scheme called lock leveling can be used to reject lock acquisitions that are provably done in an inconsistent order.
Figure 3 Consistent Acquisition Order
//C#
class BankAccount {
    private int m_id; // Unique bank account ID.
    internal static void Transfer(BankAccount a, BankAccount b, decimal delta) {
        if (a.m_id < b.m_id) {
            Monitor.Enter(a.m_balanceLock); // A first
            Monitor.Enter(b.m_balanceLock); // ...and then B
        } else {
            Monitor.Enter(b.m_balanceLock); // B first
            Monitor.Enter(a.m_balanceLock); // ...and then A
        }
        try {
            Withdraw(a, delta);
            Deposit(b, delta);
        } finally {
            Monitor.Exit(a.m_balanceLock);
            Monitor.Exit(b.m_balanceLock);
        }
    }
    // As before ...
}
'Visual Basic
Friend Class BankAccount
    Private m_id As Integer ' Unique bank account ID.
    Friend Shared Sub Transfer(ByVal a As BankAccount, ByVal b As BankAccount, ByVal delta As Decimal)
        If a.m_id < b.m_id Then
            Monitor.Enter(a.m_balanceLock) ' A first
            Monitor.Enter(b.m_balanceLock) ' ...and then B
        Else
            Monitor.Enter(b.m_balanceLock) ' B first
            Monitor.Enter(a.m_balanceLock) ' ...and then A
        End If
        Try
            Withdraw(a, delta)
            Deposit(b, delta)
        Finally
            Monitor.Exit(a.m_balanceLock)
            Monitor.Exit(b.m_balanceLock)
        End Try
    End Sub
    ' As before ...
End Class
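The ordering discipline is easy to exercise under load. A Java sketch of the same idea (ReentrantLock standing in for Monitor, an id field ordering the acquisitions), with two threads transferring in opposite directions—a textbook deadlock without the ordering:

```java
import java.util.concurrent.locks.ReentrantLock;

public class OrderedTransfer {
    static class Account {
        final int id; // unique ID used to order lock acquisition
        final ReentrantLock lock = new ReentrantLock();
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    static void transfer(Account a, Account b, long delta) {
        Account first  = a.id < b.id ? a : b; // always take the lower ID first
        Account second = a.id < b.id ? b : a;
        first.lock.lock();
        second.lock.lock();
        try {
            a.balance -= delta;
            b.balance += delta;
        } finally {
            second.lock.unlock();
            first.lock.unlock();
        }
    }

    static final Account x = new Account(1234, 1000);
    static final Account y = new Account(5678, 1000);

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(x, y, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) transfer(y, x, 1); });
        t1.start(); t2.start();
        t1.join(); t2.join(); // without the ordering, this join may never return
        System.out.println(x.balance + y.balance); // prints: 2000
    }
}
```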
But locks are not the only source of deadlock. Missed wake-ups are another phenomenon, where some event is missed and a thread sleeps forever. This is common with synchronization events like Win32 auto-reset and manual-reset events, CONDITION_VARIABLEs, and CLR Monitor.Wait, Pulse, and PulseAll calls. A missed wake-up is usually a sign of improper synchronization, failure to retest a wait condition, or use of a wake-single primitive (WakeConditionVariable or Monitor.Pulse) when a wake-all (WakeAllConditionVariable or Monitor.PulseAll) would have been more appropriate.
Another common source of this problem is lost signals with auto- and manual-reset events. Because such an event can only be in one state—signaled or nonsignaled—redundant calls to set the event will effectively be ignored. If code assumes that two calls to set always translates to two threads awakened, the result can be a missed wake-up.
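The collapse is mechanical: an event has one bit of state, while a counting primitive remembers every signal. A Java sketch contrasting a boolean "event" with a Semaphore:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

public class LostSignal {
    static boolean firstConsumed, secondConsumed;
    static int permits;

    public static void main(String[] args) {
        // An auto-reset-style event modeled as one bit: set it twice...
        AtomicBoolean event = new AtomicBoolean(false);
        event.set(true);
        event.set(true); // redundant: the second signal is simply lost
        // ...and only one consumer finds a signal to take:
        firstConsumed  = event.getAndSet(false);
        secondConsumed = event.getAndSet(false);
        System.out.println(firstConsumed + " " + secondConsumed); // prints: true false

        // A counting semaphore remembers both releases:
        Semaphore sem = new Semaphore(0);
        sem.release();
        sem.release();
        permits = sem.availablePermits();
        System.out.println(permits); // prints: 2
    }
}
```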
Lock Convoys
When the arrival rate at a lock is consistently high compared to its lock acquisition rate, a lock convoy may result. In the extreme, there are more threads waiting at a lock than can be serviced, leading to a catastrophe. This is more common on server-side programs where certain locks protecting data structures needed by most clients can get unusually hot.
For example, imagine this scenario: On average, eight requests arrive per 100 milliseconds. We use eight threads to service requests (because we're on an 8-CPU machine). Each of those eight threads must acquire a lock and hold it for 20 milliseconds before it can do meaningful work.
Unfortunately, access to this lock must be serialized, so it takes 160 milliseconds for all eight threads to enter and leave the lock. After the first exits, there will be 140 milliseconds of time before a ninth thread can access the lock. This scheme inherently will not scale, and there will be a continuously growing backup of requests. Over time, if the arrival rate does not decrease, client requests are apt to begin timing out, and a disaster will ensue.
Fairness in locks is known to contribute to convoys. The reason is that periods of time where the lock could have been made available are artificially blocked out, so that threads arriving must wait until the chosen lock owner thread can wake up, context switch, and acquire and release the lock. Windows has changed all of its internal locks to be unfair over time in order to combat this problem, and CLR monitors are also unfair.
The only real solution to the fundamental convoy problem is to reduce lock hold times and to factor your system so that there are very few (if any) hot locks. This is easier said than done, but is crucial for scalability.
Stampedes

A stampede is a situation where many threads are awakened, such that they all fight for attention simultaneously from the Windows thread scheduler. If you have 100 threads blocked on a single manual-reset event, for example, and you set that event … well, you're apt to have a real mess on your hands, particularly if a large portion of those threads will have to wait again.
One way to implement a blocking queue is to use a manual-reset event that gets unsignaled when the queue is empty, and becomes signaled when it is non-empty. Unfortunately, when there are a large number of waiting threads during the transition from zero elements to one element, a stampede can occur. That's because only one thread will take the single element, which transitions the queue back to empty, and necessarily involves resetting the event. If you had 100 threads waiting, then 99 of them will wake up, context switch (and incur all those cache misses), just to realize they have to wait again.
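A wake-one design avoids the stampede: signal exactly one waiter per element, and have every waiter retest the condition in a loop. A minimal Java sketch using wait/notify (the JVM analogue of Monitor.Wait/Pulse):

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class WakeOneQueue {
    private final Queue<Integer> items = new ArrayDeque<>();
    private final Object lock = new Object();
    static int consumedSum;

    void put(int item) {
        synchronized (lock) {
            items.add(item);
            lock.notify();          // wake ONE waiter per item, not all of them
        }
    }

    int take() throws InterruptedException {
        synchronized (lock) {
            while (items.isEmpty()) // always retest: a wake-up is only a hint
                lock.wait();
            return items.remove();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        WakeOneQueue q = new WakeOneQueue();
        Thread consumer = new Thread(() -> {
            try {
                consumedSum = q.take() + q.take();
            } catch (InterruptedException ignored) {}
        });
        consumer.start();
        q.put(20);
        q.put(22);
        consumer.join();
        System.out.println(consumedSum); // prints: 42
    }
}
```

With 100 blocked consumers, each put would wake exactly one of them instead of all 100.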
Two-Step Dance
Sometimes you need to signal an event while holding a lock. This can be unfortunate if the waking thread needs to acquire the lock held, because it will be awakened only to find out that it must wait again. This is wasteful and increases the number of overall context switches. This situation is called the two-step dance, and can extend far beyond just two steps if many locks and events are involved.
Both Win32 and the CLR's condition variable support inherently suffers from the two-step dance problem. It is often unavoidable, or prohibitively difficult to work around.
The two-step dance issue is even worse on a single-processor machine. When events are involved, the kernel will apply a priority boost to the awakening thread. This is nearly guaranteed to preempt the thread setting the event before it has a chance to release the lock. This is two-step dance in the extreme, where the setting ThreadA is context switched out so that the awakening ThreadB can attempt to acquire the lock; it of course cannot, and so it will context switch out so that ThreadA can run again; eventually, ThreadA will release the lock, which again will priority boost ThreadB, which preempts ThreadA so that it can run. As you can see, there's a lot of wasteful context switching going on.
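When the signaling primitive permits it, one mitigation is to publish state under the lock but signal after releasing it, so the woken thread can acquire the lock immediately. A Java sketch using a Semaphore (which, unlike a condition variable, may be released outside the lock):

```java
import java.util.concurrent.Semaphore;

public class SignalOutsideLock {
    static final Object lock = new Object();
    static final Semaphore ready = new Semaphore(0);
    static int shared;
    static int observed;

    public static void main(String[] args) throws InterruptedException {
        Thread setter = new Thread(() -> {
            synchronized (lock) {
                shared = 42;   // mutate the protected state under the lock...
            }
            ready.release();   // ...but signal AFTER releasing it, so the
        });                    // waiter is not woken straight into a wall
        Thread waiter = new Thread(() -> {
            try {
                ready.acquire();
                synchronized (lock) { observed = shared; }
            } catch (InterruptedException ignored) {}
        });
        waiter.start();
        setter.start();
        waiter.join();
        setter.join();
        System.out.println(observed); // prints: 42
    }
}
```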
Priority Inversion
Modifying thread priorities is usually asking for trouble. Priority inversion can arise when many threads at different priorities share access to the same locks and resources, whereby a lower-priority thread actually indefinitely holds up the progress of a higher-priority thread. The moral of this story is to simply avoid changing thread priorities wherever possible.
Here is an extreme example of priority inversion. Imagine ThreadA at Low priority acquires some lock L. Then comes along ThreadB, which is at High priority. It tries to acquire L, but cannot because ThreadA holds it. This is the "inversion" part: it's as if ThreadA is artificially and temporarily given a higher priority than ThreadB, simply because it holds a lock that ThreadB needs.
The situation will eventually resolve itself when ThreadA releases the lock. Unfortunately, imagine what happens if ThreadC at Medium priority comes into the picture. Although ThreadC doesn't need lock L, its mere presence may prevent ThreadA from running at all, which indirectly prevents the High-priority ThreadB from running.
Eventually the Windows Balance Set Manager thread will notice this situation. Even if ThreadC remains runnable forever, ThreadA will eventually (after four seconds) receive a temporary priority boost by the OS. Hopefully this is enough to allow it to run to completion and release the lock. But the delay here (four seconds) is huge, and if there is any user interface involved, the application's user is certainly going to notice the problem.
Patterns for Achieving Safety
Now that I've chased down issue after issue, the good news is that there are design patterns you can follow to make the occurrence of many of the above issues (particularly the correctness hazards) far less frequent. The crux of most issues is that state is shared among multiple threads. What's worse, this state is freely manipulated, and it transitions from a consistent state to an inconsistent one and then (hopefully) back again with surprising regularity.
As developers write code for single-threaded programs, all of this makes sense. It's easy to use shared memory as a sort of scratch pad as you make your way towards a valid and final destination. C-style imperative programming languages have worked this way for quite a few years.
But with the rise of more and more concurrency, you need to pay closer attention to these habits. And you can take a cue from functional programming languages like Haskell, LISP, Scheme, ML, and even F# (a new .NET-compliant language), by adopting immutability, purity, and isolation as first class design concepts.
Immutability
An immutable data structure is one that doesn't change after construction. This is a wonderful property for concurrent programs because if data isn't being mutated, then there is no risk of conflict if many threads access it at once. That means synchronization isn't a concern.
Immutability has some support in C++ by way of const, and in C# by way of the readonly modifier. For example, a .NET type that has only readonly fields is shallowly immutable. F# creates shallowly immutable types by default, unless you use the mutable modifier. Going a step further, if each of those fields itself refers to another type whose fields are all readonly (and which only refers to deeply immutable types), then the type is deeply immutable. This results in an entire object graph that is guaranteed not to change out from underneath you, which is very useful indeed.
All of this describes immutability as a static property. It's also possible for objects to be immutable by convention, meaning that there is some guarantee that state won't change during certain periods of time. This is a dynamic property. The Windows Presentation Foundation (WPF) freezable feature implements precisely this, and it also allows concurrent access without synchronization (although it cannot be checked in the same way that static support can). Dynamic immutability is often useful for objects that transition between immutable and mutable throughout the duration of their lifetime.
Immutability does have some downsides. Namely, whenever something has to change, you will need to make a copy of the original object and apply the changes along the way. Additionally, cycles in an object graph are generally not possible (except with dynamic immutability).
For instance, imagine you have an ImmutableStack<T>, as shown in Figure 4. Instead of a set of mutating Push and Pop methods, you will need to return new ImmutableStack<T> objects from them that contain the applied changes. Clever tricks can be used (as with the stack) in some cases to share memory among instances.
Figure 4 Using ImmutableStack
//C#
public class ImmutableStack<T> {
    private readonly T m_value;
    private readonly ImmutableStack<T> m_next;
    private readonly bool m_empty;

    public ImmutableStack() { m_empty = true; }

    internal ImmutableStack(T value, ImmutableStack<T> next) {
        m_value = value;
        m_next = next;
        m_empty = false;
    }

    public ImmutableStack<T> Push(T value) {
        return new ImmutableStack<T>(value, this);
    }

    public ImmutableStack<T> Pop(out T value) {
        if (m_empty) throw new Exception("Empty.");
        value = m_value;
        return m_next;
    }
}
'Visual Basic
Public Class ImmutableStack(Of T)
    Private ReadOnly m_value As T
    Private ReadOnly m_next As ImmutableStack(Of T)
    Private ReadOnly m_empty As Boolean

    Public Sub New()
        m_empty = True
    End Sub

    Friend Sub New(ByVal value As T, ByVal [next] As ImmutableStack(Of T))
        m_value = value
        m_next = [next]
        m_empty = False
    End Sub

    Public Function Push(ByVal value As T) As ImmutableStack(Of T)
        Return New ImmutableStack(Of T)(value, Me)
    End Function

    Public Function Pop(<System.Runtime.InteropServices.Out()> ByRef value As T) As ImmutableStack(Of T)
        If m_empty Then
            Throw New Exception("Empty.")
        End If
        value = m_value
        Return m_next
    End Function
End Class
As nodes are pushed, you have to allocate a new object for each one. In a standard linked-list implementation of a stack, you'd have to do this anyway. Notice that as you pop elements from the stack, however, you can use the existing objects. That's because each node in the stack is immutable.
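That structural sharing can be observed directly. Here is a minimal sketch of the same idea in Java (the article's C# translates almost line for line; the class name `ImmutableStackDemo` and the nested `Stack` type are illustrative, not from the original):

```java
public class ImmutableStackDemo {
    // A minimal immutable stack: Push allocates exactly one new node,
    // and Pop returns the existing tail, so instances share memory.
    static final class Stack<T> {
        final T value;
        final Stack<T> next;

        Stack(T value, Stack<T> next) { this.value = value; this.next = next; }

        static <T> Stack<T> empty() { return new Stack<>(null, null); }

        boolean isEmpty() { return next == null; }

        Stack<T> push(T v) { return new Stack<>(v, this); }

        T peek() {
            if (isEmpty()) throw new IllegalStateException("Empty.");
            return value;
        }

        Stack<T> pop() {
            if (isEmpty()) throw new IllegalStateException("Empty.");
            return next; // no allocation: the caller reuses the existing tail
        }
    }

    public static void main(String[] args) {
        Stack<Integer> s1 = Stack.<Integer>empty().push(1);
        Stack<Integer> s2 = s1.push(2);
        System.out.println(s2.pop() == s1); // true: pop hands back the very same object
        System.out.println(s1.peek());      // 1: s1 is unaffected by operations on s2
    }
}
```

Because every node is immutable, `s1` and `s2` can be read from any number of threads at once, with no synchronization, even while other threads are "modifying" the stack by creating new versions.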
There are already immutable types running rampant in the wild. The CLR's System.String class is immutable, and there is a design guideline that all new value types should be immutable, too. The guidance being given here is to use immutability where it is possible and feels natural, and to resist the temptation to perform mutation simply because it is made convenient by the current generation of languages.
Purity
Even with immutable data types, most operations performed by a program are method calls. And method calls can have side-effects, which are problematic in concurrent code because a side-effect implies mutation of some form. Most often this will be a simple write to shared memory, but it can also be a physically mutating operation like a database transaction, Web service invocation, or file system operation. In many circumstances, I'd like to be able to invoke some method without fear that it will lead to concurrency hazards. Good examples of this are simple methods like GetHashCode and ToString on System.Object. Most people wouldn't expect them to entail side-effects.
A pure method can always be run in a concurrent setting without added synchronization. Although there is no common language support for purity, you can define a pure method very simply:
- It only reads from shared memory, and it only reads immutable or constant state.
- It can, of course, write to local variables.
- It may only call other pure methods.
The set of things possible within a pure method is thus extremely limited. But when combined with immutable types, purity becomes both possible and convenient. Some functional languages assume purity by default, most notably Haskell, where everything is pure; anything that must perform a side-effect has to be wrapped in a special construct called a monad. Most of us don't use Haskell, however, so we'll have to get by with purity-by-convention.
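The three rules above can be made concrete with a small sketch (in Java here rather than the article's C#; the names and numbers are illustrative only):

```java
public class PurityDemo {
    // Immutable, constant shared state -- rule 1 permits a pure method to read this.
    static final int TAX_RATE_PERCENT = 8;

    // Mutable shared state -- off-limits inside a pure method.
    static int counter = 0;

    // Pure: reads only the constant, writes only a local (rule 2),
    // and calls nothing impure (rule 3). Any number of threads may
    // call it concurrently with no synchronization at all.
    static int priceWithTax(int price) {
        int tax = price * TAX_RATE_PERCENT / 100; // local variable: fine
        return price + tax;
    }

    // Impure, for contrast: it mutates shared state, so two concurrent
    // callers race exactly like the Counter example earlier in the article.
    static int nextId() {
        return counter++;
    }

    public static void main(String[] args) {
        System.out.println(priceWithTax(100)); // 108, on every thread, every time
    }
}
```

The pure version is trivially thread-safe precisely because there is nothing in it for two threads to fight over.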
Isolation
Publication and privatization were mentioned in passing earlier, but they strike at the heart of a very important issue. Synchronization is necessary—and immutability and purity are interesting—because state is ordinarily shared among multiple threads. But if state is confined within a single thread, then there is no need for synchronization. This leads to more naturally scalable software.
Indeed, if state is isolated, mutation can happen freely. This is convenient because mutation is a fundamental built-in aspect of most C-style languages. Programmers are accustomed to it. It takes discipline to program in a mostly functional style, and is quite difficult for most developers. So give it a try, but don't deceive yourself into thinking the world will become functional overnight.
Ownership is a difficult thing to track down. When does an object become shared? While it is being initialized, it is accessed by a single thread and is not yet reachable from other threads. Once a reference to it is stored in a static variable, in a location that was shared at thread-creation or work-queuing time, or in a field of an object transitively reachable from one of these places, the object becomes shared. It is imperative that developers watch specifically for these transitions between private and shared, and treat all shared state with care.
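A common way to exploit isolation is fork/join-style decomposition: each thread mutates only its own private accumulator, and publication happens exactly once, through a well-defined handoff. The sketch below uses Java's `ExecutorService` and `Future` for that handoff (class and method names are illustrative, not from the article):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IsolationDemo {
    // Sums 0 .. workers*chunk-1 by giving each thread its own isolated accumulator.
    static long parallelSum(int workers, int chunk) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        List<Future<Long>> futures = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int start = w * chunk;
            futures.add(pool.submit((Callable<Long>) () -> {
                // 'sum' is confined to this thread: mutate it freely, no locks needed.
                long sum = 0;
                for (int i = start; i < start + chunk; i++) sum += i;
                // Returning the value is the one publication point; the
                // Future.get handoff makes it safely visible to the consumer.
                return sum;
            }));
        }
        long total = 0;
        for (Future<Long> f : futures) total += f.get(); // single thread combines results
        pool.shutdown();
        return total;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parallelSum(4, 250)); // 499500: the sum of 0..999
    }
}
```

Note that no lock, volatile field, or interlocked operation appears in the worker loop: because each accumulator stays private until the final handoff, there is simply no shared mutable state to protect.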
Joe Duffy is the Development Lead for Parallel Extensions to .NET at Microsoft. He spends most of his time hacking code, overseeing the design of the library, and managing an amazing team of developers. His new book is Concurrent Programming on Windows.
The original link is now dead: http://msdn.microsoft.com/en-us/magazine/cc817398.aspx