C++ Concurrency in Action, Chapter 5: Study Notes, Part 3

[TOC]
Study notes, part 3, for C++ Concurrency in Action, covering parts of Chapter 5, The C++ memory model and operations on atomic types.
The material sits at the middle layer of the technology stack: the C++ memory model, atomic types, and the operations on them.

Chapter 5 The C++ memory model and operations on atomic types

5.1 Memory model basics

The memory model has two aspects: the basic structural aspects, which relate to how things are laid out in memory, and the concurrency aspects.

5.1.1 Objects and memory locations

The C++ Standard defines an object as a region of storage, although it goes on to assign properties to these objects, such as their type and lifetime. Objects can also contain sub-objects.

Unit of storage: an object is stored in one or more memory locations.

The book includes a diagram showing the relationship between objects and memory locations:

bf3 has zero length, so it occupies no memory location and may not even have its own name (leaving the name in would fail to compile): Note how the zero-length bit field bf3 (the name is commented out because zero-length bit fields must be unnamed) separates bf4 into its own memory location, but doesn’t have a memory location itself.

This is summarized in four points:

  • Every variable is an object, including those that are members of other objects.
  • Every object occupies at least one memory location.
  • Variables of fundamental types such as int or char occupy exactly one memory location, whatever their size, even if they’re adjacent or part of an array.
  • Adjacent bit fields are part of the same memory location.
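These rules can be illustrated with a struct along the lines of the book's figure; the struct name and field widths here are my own invention, and the actual layout is implementation-defined, but the memory-location rules are fixed:

```cpp
#include <cstddef>

struct my_data
{
    int i;              // one memory location
    double d;           // another memory location
    unsigned bf1 : 10;  // bf1 and bf2 are adjacent bit fields:
    int bf2 : 15;       // they share a single memory location
    int : 0;            // zero-length bit field: must be unnamed, occupies
                        // no memory location itself, but...
    int bf4 : 9;        // ...it forces bf4 into its own memory location
    char c1, c2;        // distinct memory locations, even though adjacent
};
```

Two threads may safely touch, say, `i` and `c1` concurrently, but concurrent unsynchronized writes to `bf1` and `bf2` race, because they share a memory location.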

5.1.2 Objects, memory locations, and concurrency

At the microscopic memory level, a race condition arises when multiple threads modify the same memory location.
The fix: establish some defined ordering of the modifications, using a mutex or atomic operations.

5.1.3 Modification orders

Either the programmer fixes the modification order, or atomic operations are used and the compiler takes care of it: "If you do use atomic operations, the compiler is responsible for ensuring that the necessary synchronization is in place."

Note:

Although all threads must agree on the modification orders of each individual object in a program, they don’t necessarily have to agree on the relative order of operations on separate objects.

5.2 Atomic operations and types in C++

Definition: an atomic operation is an indivisible operation. You can’t observe such an operation half-done from any thread in the system; it’s either done or not done.

5.2.1 The standard atomic types

Two ways atomic operations may be implemented:

  1. Directly: operations on a given type are done directly with atomic instructions (x.is_lock_free() returns true).
  2. Indirectly: done by using a lock internal to the compiler and library (x.is_lock_free() returns false). In this case, using an explicit mutex yourself may well be the better deal.

C++17 adds a static constexpr member variable, X::is_always_lock_free, which is true if and only if the atomic type X is lock-free for all supported hardware.

The standard also provides macros to make portable development easier:
ATOMIC_BOOL_LOCK_FREE, ATOMIC_CHAR_LOCK_FREE, ATOMIC_CHAR16_T_LOCK_FREE, ATOMIC_CHAR32_T_LOCK_FREE, ATOMIC_WCHAR_T_LOCK_FREE, ATOMIC_SHORT_LOCK_FREE, ATOMIC_INT_LOCK_FREE, ATOMIC_LONG_LOCK_FREE, ATOMIC_LLONG_LOCK_FREE, and ATOMIC_POINTER_LOCK_FREE.

  • value 0: the atomic type is never lock-free.
  • value 1: the lock-free status of the corresponding atomic type is a runtime property.
  • value 2: the atomic type is always lock-free.
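The three ways of asking the lock-free question can be put side by side in a small sketch (on mainstream desktop hardware all three report lock-free for int, but that is a property of the platform, not a guarantee):

```cpp
#include <atomic>

std::atomic<int> i(0);

// Preprocessor macro: 0 = never, 1 = runtime property, 2 = always lock-free.
int macro_status() { return ATOMIC_INT_LOCK_FREE; }

// Run-time query on a particular object.
bool object_lock_free() { return i.is_lock_free(); }

// C++17: compile-time, per-type answer.
bool type_always_lock_free() { return std::atomic<int>::is_always_lock_free; }
```

The three answers are consistent: if the type is always lock-free, any object of it must report lock-free at run time.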

The only type that doesn’t provide an is_lock_free() member function is std::atomic_flag. A simple lock can be built on top of it: once you have a simple lock-free Boolean flag, you can use that to implement a simple lock and implement all the other atomic types using that as a basis.
"Simple" means: objects of the std::atomic_flag type are initialized to clear, and they can then either be queried and set (with the test_and_set() member function) or cleared (with the clear() member function). That’s it: no assignment, no copy construction, no test and clear, no other operations at all.

The book tabulates the basic atomic types specified by the standard.

The standard also provides a set of typedefs for the atomic types corresponding to the various non-atomic Standard Library typedefs such as std::size_t.

Alias naming rule: for a standard typedef T, the corresponding atomic type is the same name with an atomic_ prefix: atomic_T. The same applies to the built-in types, except that signed is abbreviated as s, unsigned as u, and long long as llong.
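The naming rule can be checked directly. Since C++17 these names are required to be aliases of the corresponding std::atomic specializations (earlier standards also allowed them to be distinct base classes), so the following sketch should compile:

```cpp
#include <atomic>
#include <cstddef>
#include <type_traits>

// atomic_T is std::atomic<T>; unsigned -> u, long long -> llong.
static_assert(std::is_same_v<std::atomic_int,    std::atomic<int>>);
static_assert(std::is_same_v<std::atomic_uint,   std::atomic<unsigned>>);
static_assert(std::is_same_v<std::atomic_llong,  std::atomic<long long>>);
static_assert(std::is_same_v<std::atomic_size_t, std::atomic<std::size_t>>);
```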

Properties of the atomic types:

  • They are not copyable or assignable in the conventional sense, but equivalent functionality is available through member functions: load() and store(), exchange(), compare_exchange_weak(), and compare_exchange_strong().

  • Where appropriate, they support compound assignment operators (+=, -=, &=, |=, ^=, and so on); the integral types and the std::atomic<> specializations for pointers also support ++ and --.

  • Return values: the return value from the assignment operators and member functions is either the value stored (in the case of the assignment operators) or the value prior to the operation (in the case of the named functions).

  • For a user-defined type, the operations are limited to load(), store() (and assignment from and conversion to the user-defined type), exchange(), compare_exchange_weak(), and compare_exchange_strong().

  • Each operation takes an optional memory-ordering argument; if omitted, the default, which is the strongest ordering, std::memory_order_seq_cst, is used. The operations are divided into three categories:

    • Store operations, which can have memory_order_relaxed, memory_order_release, or memory_order_seq_cst ordering.
    • Load operations, which can have memory_order_relaxed, memory_order_consume, memory_order_acquire, or memory_order_seq_cst ordering.
    • Read-modify-write operations, which can have memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, or memory_order_seq_cst ordering.
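The three categories can be seen in a single-threaded sketch, one operation per category, each tagged with an ordering that is valid for that category (orderings chosen for illustration only; the default for all of them is seq_cst):

```cpp
#include <atomic>

std::atomic<int> x(0);

int demo()
{
    x.store(1, std::memory_order_release);               // store operation
    int v = x.load(std::memory_order_acquire);           // load operation
    int old = x.fetch_add(1, std::memory_order_acq_rel); // read-modify-write
    return v + old + x.load(std::memory_order_seq_cst);  // 1 + 1 + 2
}
```

Note that memory_order_acq_rel is only meaningful on the read-modify-write operation; passing a release ordering to a pure load, or an acquire ordering to a pure store, is undefined behavior.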

5.2.2 Operations on std::atomic_flag

Key points:

  • one of two states: set or clear.

  • It’s also the only type guaranteed to be lock-free.

  • It must be initialized with ATOMIC_FLAG_INIT (which puts it in a clear state).

std::atomic_flag f=ATOMIC_FLAG_INIT;

  • If the std::atomic_flag object has static storage duration, it’s guaranteed to be statically initialized, which means that there are no initialization-order issues; it will always be initialized by the time of the first operation on the flag.

  • only three things you can do with it: destroy it, clear it, or set it and query the previous value.

  • Both the clear() (a store operation) and test_and_set() (a read-modify-write operation) member functions can have a memory order specified.

The C++20 interface:

namespace std {
struct atomic_flag {
constexpr atomic_flag() noexcept;
atomic_flag(const atomic_flag&) = delete;
atomic_flag& operator=(const atomic_flag&) = delete;
atomic_flag& operator=(const atomic_flag&) volatile = delete;

bool test(memory_order = memory_order::seq_cst) const volatile noexcept;
bool test(memory_order = memory_order::seq_cst) const noexcept;
bool test_and_set(memory_order = memory_order::seq_cst) volatile noexcept;
bool test_and_set(memory_order = memory_order::seq_cst) noexcept;
void clear(memory_order = memory_order::seq_cst) volatile noexcept;
void clear(memory_order = memory_order::seq_cst) noexcept;

void wait(bool, memory_order = memory_order::seq_cst) const volatile noexcept;
void wait(bool, memory_order = memory_order::seq_cst) const noexcept;
void notify_one() volatile noexcept;
void notify_one() noexcept;
void notify_all() volatile noexcept;
void notify_all() noexcept;
};
}

A single operation on two distinct objects can’t be atomic.

Application: a spinlock mutex

class spinlock_mutex
{
std::atomic_flag flag;
public:
spinlock_mutex():
flag(ATOMIC_FLAG_INIT){}
void lock()
{
// does a busy-wait in lock()
while(flag.test_and_set(std::memory_order_acquire));
}
void unlock()
{
flag.clear(std::memory_order_release);
}
};

Caveat: it’s a poor choice if you expect there to be any degree of contention, but it’s enough to ensure mutual exclusion.

5.2.3 Operations on std::atomic<bool>

The assignment operators they support return values (of the corresponding non-atomic type) rather than references.

This way, code that uses the result of an assignment doesn’t need an additional load to obtain the stored value. (Separately, C++20 introduces std::atomic_ref for performing atomic operations on a referenced object.)

Supported operations: store() is a store operation, load() is a load operation, and exchange() is a read-modify-write operation.

Implicit conversion: it supports a plain nonmodifying query of the value, either through an implicit conversion to plain bool or with an explicit call to load().

std::atomic<bool> may not be lock-free.
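The operations above can be sketched in a few lines (single-threaded, purely to show the API shape):

```cpp
#include <atomic>

std::atomic<bool> b(false);

bool demo()
{
    b.store(true);                 // store operation
    bool via_load = b.load();      // load operation, explicit
    bool via_conv = b;             // load operation, implicit conversion
    bool old = b.exchange(false);  // read-modify-write: returns prior value
    return via_load && via_conv && old && !b.load();
}
```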

STORING A NEW VALUE (OR NOT) DEPENDING ON THE CURRENT VALUE

Compare-exchange: the compare_exchange_weak() and compare_exchange_strong() member functions. How they work: each compares the value of the atomic variable with a supplied expected value and stores the supplied desired value if they’re equal. If the values aren’t equal, the expected value is updated with the value of the atomic variable.

区别:

compare_exchange_weak(): the store might not be successful even if the original value was equal to the expected value, in which case the value of the variable is unchanged and the return value of compare_exchange_weak() is false.
The reason: this is most likely to happen on machines that lack a single compare-and-exchange instruction, where the processor can’t guarantee that the operation has been done atomically, possibly because the thread performing the operation was switched out in the middle of the necessary sequence of instructions and another thread was scheduled in its place by the operating system (where there are more threads than processors). This is called a spurious failure. To cope with spurious failures, use a loop:

bool expected=false;
extern std::atomic<bool> b; // set somewhere else
while(!b.compare_exchange_weak(expected,true) && !expected);

compare_exchange_strong() is guaranteed to return false only if the value wasn’t equal to the expected value. This can eliminate the need for loops.

Choosing between them: if the calculation of the value to be stored is simple, it may be beneficial to use compare_exchange_weak() in order to avoid a double loop on platforms where compare_exchange_weak() can fail spuriously (and so compare_exchange_strong() contains a loop). On the other hand, if the calculation of the value to be stored is time-consuming, it may make sense to use compare_exchange_strong() to avoid having to recalculate the value to store when the expected value hasn’t changed (note that && short-circuits: when the first operand fails, the time-consuming second operand isn’t evaluated; it only runs when the first yields true).
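A typical compare_exchange_weak loop looks like the following sketch (the atomic-multiply operation is my own example, not from the book). The same loop that absorbs spurious failures also handles real contention, because a failed CAS reloads `expected` with the current value:

```cpp
#include <atomic>

std::atomic<int> counter(2);

// Atomically replace counter with counter * k; returns the value installed.
int atomic_multiply(int k)
{
    int expected = counter.load();
    int desired = expected * k;
    while(!counter.compare_exchange_weak(expected, desired))
    {
        // expected was refreshed by the failed CAS; recompute desired
        desired = expected * k;
    }
    return desired;
}
```

If the recomputation of `desired` were expensive, compare_exchange_strong() would avoid rerunning it on spurious failures.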

Rules for the memory-ordering semantics:

  • differ in the case of success and failure.
  • A failed compare-exchange doesn’t do a store, so it can’t have memory_order_release or memory_order_acq_rel semantics.
  • can’t supply stricter memory ordering for failure than for success.
  • If you don’t specify an ordering for failure, it’s assumed to be the same as that for success, except that the release part of the ordering is stripped.
  • If you specify neither, they default to memory_order_seq_cst as usual.

5.2.4 Operations on std::atomic<T*>: pointer arithmetic

The C++20 interface:

namespace std {
template<class T> struct atomic<T*> {
using value_type = T*;
using difference_type = ptrdiff_t;

static constexpr bool is_always_lock_free = /* implementation-defined */;
bool is_lock_free() const volatile noexcept;
bool is_lock_free() const noexcept;

constexpr atomic() noexcept;
constexpr atomic(T*) noexcept;
atomic(const atomic&) = delete;
atomic& operator=(const atomic&) = delete;
atomic& operator=(const atomic&) volatile = delete;

void store(T*, memory_order = memory_order::seq_cst) volatile noexcept;
void store(T*, memory_order = memory_order::seq_cst) noexcept;
T* operator=(T*) volatile noexcept;
T* operator=(T*) noexcept;
T* load(memory_order = memory_order::seq_cst) const volatile noexcept;
T* load(memory_order = memory_order::seq_cst) const noexcept;
operator T*() const volatile noexcept;
operator T*() const noexcept;

T* exchange(T*, memory_order = memory_order::seq_cst) volatile noexcept;
T* exchange(T*, memory_order = memory_order::seq_cst) noexcept;
bool compare_exchange_weak(T*&, T*, memory_order, memory_order) volatile noexcept;
bool compare_exchange_weak(T*&, T*, memory_order, memory_order) noexcept;
bool compare_exchange_strong(T*&, T*, memory_order, memory_order) volatile noexcept;
bool compare_exchange_strong(T*&, T*, memory_order, memory_order) noexcept;
bool compare_exchange_weak(T*&, T*,
memory_order = memory_order::seq_cst) volatile noexcept;
bool compare_exchange_weak(T*&, T*,
memory_order = memory_order::seq_cst) noexcept;
bool compare_exchange_strong(T*&, T*,
memory_order = memory_order::seq_cst) volatile noexcept;
bool compare_exchange_strong(T*&, T*,
memory_order = memory_order::seq_cst) noexcept;

T* fetch_add(ptrdiff_t, memory_order = memory_order::seq_cst) volatile noexcept;
T* fetch_add(ptrdiff_t, memory_order = memory_order::seq_cst) noexcept;
T* fetch_sub(ptrdiff_t, memory_order = memory_order::seq_cst) volatile noexcept;
T* fetch_sub(ptrdiff_t, memory_order = memory_order::seq_cst) noexcept;

T* operator++(int) volatile noexcept;
T* operator++(int) noexcept;
T* operator--(int) volatile noexcept;
T* operator--(int) noexcept;
T* operator++() volatile noexcept;
T* operator++() noexcept;
T* operator--() volatile noexcept;
T* operator--() noexcept;
T* operator+=(ptrdiff_t) volatile noexcept;
T* operator+=(ptrdiff_t) noexcept;
T* operator-=(ptrdiff_t) volatile noexcept;
T* operator-=(ptrdiff_t) noexcept;

void wait(T*, memory_order = memory_order::seq_cst) const volatile noexcept;
void wait(T*, memory_order = memory_order::seq_cst) const noexcept;
void notify_one() volatile noexcept;
void notify_one() noexcept;
void notify_all() volatile noexcept;
void notify_all() noexcept;
};
}

pointer arithmetic operations, provided by the fetch_add() and fetch_sub() member functions and the += and -= operators, ++ and --.

Exchange-and-add: these return the original value (so x.fetch_add(3) will update x to point to the fourth value but return a pointer to the first value in the array). It’s an atomic read-modify-write operation. The return value is a plain T* value rather than a reference to the std::atomic<T*> object.
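The exchange-and-add behavior is easy to demonstrate (array contents are my own example):

```cpp
#include <atomic>

int arr[5] = {10, 20, 30, 40, 50};
std::atomic<int*> p(arr);

int demo()
{
    // fetch_add advances the pointer by 3 elements but hands back
    // the value it held BEFORE the operation.
    int* old = p.fetch_add(3);  // p now points at arr[3], old at arr[0]
    p -= 1;                     // operator-=: p now points at arr[2]
    return *old + *p.load();    // 10 + 30
}
```

Note the arithmetic is in units of elements of T, exactly like ordinary pointer arithmetic.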

5.2.5 Operations on standard atomic integral types

The C++20 interface:

namespace std {
template<> struct atomic</* integral */> {
using value_type = /* integral */;
using difference_type = value_type;

static constexpr bool is_always_lock_free = /* implementation-defined */;
bool is_lock_free() const volatile noexcept;
bool is_lock_free() const noexcept;

constexpr atomic() noexcept;
constexpr atomic(/* integral */) noexcept;
atomic(const atomic&) = delete;
atomic& operator=(const atomic&) = delete;
atomic& operator=(const atomic&) volatile = delete;

void store(/* integral */, memory_order = memory_order::seq_cst) volatile noexcept;
void store(/* integral */, memory_order = memory_order::seq_cst) noexcept;
/* integral */ operator=(/* integral */) volatile noexcept;
/* integral */ operator=(/* integral */) noexcept;
/* integral */ load(memory_order = memory_order::seq_cst) const volatile noexcept;
/* integral */ load(memory_order = memory_order::seq_cst) const noexcept;
operator /* integral */() const volatile noexcept;
operator /* integral */() const noexcept;

/* integral */ exchange(/* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
/* integral */ exchange(/* integral */,
memory_order = memory_order::seq_cst) noexcept;
bool compare_exchange_weak(/* integral */&, /* integral */,
memory_order, memory_order) volatile noexcept;
bool compare_exchange_weak(/* integral */&, /* integral */,
memory_order, memory_order) noexcept;
bool compare_exchange_strong(/* integral */&, /* integral */,
memory_order, memory_order) volatile noexcept;
bool compare_exchange_strong(/* integral */&, /* integral */,
memory_order, memory_order) noexcept;
bool compare_exchange_weak(/* integral */&, /* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
bool compare_exchange_weak(/* integral */&, /* integral */,
memory_order = memory_order::seq_cst) noexcept;
bool compare_exchange_strong(/* integral */&, /* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
bool compare_exchange_strong(/* integral */&, /* integral */,
memory_order = memory_order::seq_cst) noexcept;

/* integral */ fetch_add(/* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
/* integral */ fetch_add(/* integral */,
memory_order = memory_order::seq_cst) noexcept;
/* integral */ fetch_sub(/* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
/* integral */ fetch_sub(/* integral */,
memory_order = memory_order::seq_cst) noexcept;
/* integral */ fetch_and(/* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
/* integral */ fetch_and(/* integral */,
memory_order = memory_order::seq_cst) noexcept;
/* integral */ fetch_or(/* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
/* integral */ fetch_or(/* integral */,
memory_order = memory_order::seq_cst) noexcept;
/* integral */ fetch_xor(/* integral */,
memory_order = memory_order::seq_cst) volatile noexcept;
/* integral */ fetch_xor(/* integral */,
memory_order = memory_order::seq_cst) noexcept;

/* integral */ operator++(int) volatile noexcept;
/* integral */ operator++(int) noexcept;
/* integral */ operator--(int) volatile noexcept;
/* integral */ operator--(int) noexcept;
/* integral */ operator++() volatile noexcept;
/* integral */ operator++() noexcept;
/* integral */ operator--() volatile noexcept;
/* integral */ operator--() noexcept;
/* integral */ operator+=(/* integral */) volatile noexcept;
/* integral */ operator+=(/* integral */) noexcept;
/* integral */ operator-=(/* integral */) volatile noexcept;
/* integral */ operator-=(/* integral */) noexcept;
/* integral */ operator&=(/* integral */) volatile noexcept;
/* integral */ operator&=(/* integral */) noexcept;
/* integral */ operator|=(/* integral */) volatile noexcept;
/* integral */ operator|=(/* integral */) noexcept;
/* integral */ operator^=(/* integral */) volatile noexcept;
/* integral */ operator^=(/* integral */) noexcept;

void wait(/* integral */,
memory_order = memory_order::seq_cst) const volatile noexcept;
void wait(/* integral */, memory_order = memory_order::seq_cst) const noexcept;
void notify_one() volatile noexcept;
void notify_one() noexcept;
void notify_all() volatile noexcept;
void notify_all() noexcept;
};
}

atomic integral values are typically used either as counters or as bitmasks.
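Both usages, counter and bitmask, can be sketched together (function names are my own). Each fetch_ operation returns the pre-operation value:

```cpp
#include <atomic>

std::atomic<unsigned> flags(0);  // bitmask
std::atomic<unsigned> hits(0);   // counter

unsigned set_bit(unsigned bit)
{
    hits.fetch_add(1, std::memory_order_relaxed); // count the calls
    return flags.fetch_or(1u << bit);             // returns the old mask
}

unsigned clear_bit(unsigned bit)
{
    return flags.fetch_and(~(1u << bit));         // returns the old mask
}
```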

5.2.6 The std::atomic<> primary class template

Requirements on std::atomic<UDT> for some user-defined type UDT: it must have a trivial copy-assignment operator. In particular:

  • It must not have any virtual functions or virtual base classes, and it must use the compiler-generated copy-assignment operator.
  • Every base class and non-static data member of the type must also have a trivial copy-assignment operator.
  • This permits the compiler to use memcpy() or an equivalent operation for assignment operations, because there’s no user-written code to run.
  • Compare-exchange operations do bitwise comparison as if using memcmp, rather than using any comparison operator that may be defined for UDT. Beware of padding bits.
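A minimal sketch of a conforming UDT (the type name and members are my own; with two ints there are no padding bits to trip over on common platforms):

```cpp
#include <atomic>
#include <type_traits>

// Trivially copyable, no virtuals, compiler-generated copy assignment.
struct point { int x; int y; };
static_assert(std::is_trivially_copyable_v<point>);

std::atomic<point> pt(point{1, 2});

bool demo()
{
    point expected{1, 2};
    // Comparison is bitwise, as if by memcmp, not via any operator==.
    bool ok = pt.compare_exchange_strong(expected, point{3, 4});
    point now = pt.load();
    return ok && now.x == 3 && now.y == 4;
}
```

Whether this instantiation is lock-free is up to the platform; on typical 64-bit hardware an 8-byte trivially copyable struct usually is.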

Why these requirements?

  • In general, the compiler isn’t going to be able to generate lock-free code for std::atomic<UDT>, so it will have to use an internal lock for all the operations. If user-supplied copy-assignment or comparison operators were permitted, this would require passing a reference to the protected data as an argument to a user-supplied function, violating the guideline (don’t pass pointers or references to protected shared data outside the scope of the lock).
  • Also, the library is entirely at liberty to use a single lock for all atomic operations that need it, and allowing user-supplied functions to be called while holding that lock might cause deadlock or cause other threads to block because a comparison operation took a long time.
  • Finally, these restrictions increase the chance that the compiler will be able to make use of atomic instructions directly for std::atomic<UDT> (and make a particular instantiation lock-free), because it can treat the user-defined type as a set of raw bytes.

Full support for std::atomic over floating-point types arrived only in C++20, and a representation caveat applies: the operation (compare_exchange_strong) may fail even though the old stored value was equal in value to the comparand, if the stored value had a different representation.

Even with C++20’s floating-point support, the lesson generalizes: you’ll get similar behavior with compare_exchange_strong if you use std::atomic<> with a user-defined type that has an equality-comparison operator defined, and that operator differs from the comparison using memcmp; the operation may fail because the otherwise-equal values have a different representation.

Double-word-compare-and-swap (DWCAS) instruction: some platforms have atomic instructions that cover user-defined types up to twice the size of an int or void*.
Smaller types: if your UDT is the same size as (or smaller than) an int or a void*, most common platforms will be able to use atomic instructions for std::atomic<UDT>.

Containers: you can’t instantiate std::atomic<> with a full container, but you can with classes holding simple data that includes pointers. You can’t create std::atomic<std::vector<int>>, but you can instantiate std::atomic<> with classes containing counters or flags or pointers or even arrays of simple data elements.

Table 5.3 The operations available on atomic types

5.2.7 Free functions for atomic operations

All the free functions take a pointer to the atomic object as the first parameter. For example, the equivalent of a.load(std::memory_order_acquire) is std::atomic_load_explicit(&a, std::memory_order_acquire) (the variants that let you specify the memory ordering carry the _explicit suffix). Only std::atomic_flag departs from the naming pattern: std::atomic_flag_test_and_set().

The free functions are designed to be C-compatible, so they use pointers rather than references in all cases.

Smart-pointer support: the atomic operations available are load, store, exchange, and compare-exchange, which are provided as overloads of the same operations on the standard atomic types, taking a std::shared_ptr<>* as the first argument. For example:

std::shared_ptr<my_data> p;
void process_global_data()
{
std::shared_ptr<my_data> local=std::atomic_load(&p);
process_data(local);
}

void update_global_data()
{
std::shared_ptr<my_data> local(new my_data);
std::atomic_store(&p,local);
}

The Concurrency TS also provides std::experimental::atomic_shared_ptr<T>, which is preferable. Even if it is not lock-free (though it may be), std::experimental::atomic_shared_ptr is to be recommended over using the atomic free functions on a plain std::shared_ptr, because it makes the intent clearer in your code.

5.3 Synchronizing operations and enforcing ordering

happens-before and synchronizes-with

happens-before is transitive.

#include <vector>
#include <atomic>
#include <iostream>
#include <thread>
#include <chrono>
std::vector<int> data;
std::atomic<bool> data_ready(false);
void reader_thread()
{
while(!data_ready.load())
{
std::this_thread::sleep_for(std::chrono::milliseconds(1));
}
std::cout<<"The answer="<<data[0]<<"\n";
}
void writer_thread()
{
data.push_back(42);
data_ready=true;
}

5.3.1 The synchronizes-with relationship

The synchronizes-with relationship is something that you can get only between operations on atomic types.

The mechanism: a suitably-tagged atomic write operation, W,
on a variable, x,
synchronizes with
a suitably-tagged atomic read operation on x that reads the value
stored by either that write, W,
or a subsequent atomic write operation on x by the same thread that performed the initial write, W,
or a sequence of atomic read-modify-write operations on x (such as fetch_add() or compare_exchange_weak()) by any thread,
where the value read by the first thread in the sequence is the value written by W.

Intuitively: if thread A stores a value and thread B reads that value, there’s a synchronizes-with relationship between the store in thread A and the load in thread B.

5.3.2 The happens-before relationship

One case with no ordering: if the operations occur in the same statement, in general there’s no happens-before relationship between them, because they’re unordered. (There are exceptions where evaluation within one statement is sequenced, e.g. operands separated by the built-in comma operator, or one subexpression depending on the result of another.)

Ordering between threads: if operation A in one thread synchronizes with operation B in another thread, then A inter-thread happens before B. It’s also a transitive relation. In practice: if you make a series of changes to data in a single thread, you need only one synchronizes-with relationship for the data to be visible to subsequent operations on the thread that executed C.

How strongly-happens-before differs: operations tagged with memory_order_consume (with which "strong" ordering cannot be achieved) participate in inter-thread-happens-before relationships (and thus happens-before relationships), but not in strongly-happens-before relationships.

5.3.3 Memory ordering for atomic operations

The six ordering options represent three models:

  1. sequentially consistent ordering (memory_order_seq_cst),
  2. acquire-release ordering (memory_order_consume, memory_order_acquire, memory_order_release, and memory_order_acq_rel), and
  3. relaxed ordering (memory_order_relaxed).

The difference: these distinct memory-ordering models can have varying costs on different CPU architectures.
The point of the fine-grained options: they allow experts to take advantage of the increased performance of the more fine-grained ordering relationships.

Further discussion of happens-before and synchronizes-with can be found in external references.

SEQUENTIALLY CONSISTENT ORDERING

It implies that the behavior of the program is consistent with a simple sequential view of the world.
All threads must see the same order of operations.
Practical use: you can write down all the possible sequences of operations by different threads, eliminate those that are inconsistent, and verify that your code behaves as expected in the others.

Note that tagging atomic operations this way does not propagate; seq_cst must be applied everywhere, not just in one place. "This constraint doesn’t carry forward to threads that use atomic operations with relaxed memory orderings; they can still see the operations in a different order, so you must use sequentially consistent operations on all your threads." (Some ordering constraints, namely dependency relations, can carry forward.)

Drawback: on a weakly-ordered machine with many processors, it can carry a noticeable performance penalty. Whether to use it requires consulting the documentation for your target processor architectures.

Example code (with printing added on top of the book’s listing):

#include <atomic>
#include <thread>
#include <cassert>
#include <iostream>
#include <cstdio>

std::atomic<bool> x, y;
std::atomic<int> z;
void write_x()
{
x.store(true, std::memory_order_seq_cst);
printf("Thread 1: x store \n");
}
void write_y()
{
y.store(true, std::memory_order_seq_cst);
printf("Thread 4: y store \n");
}
void read_x_then_y()
{
printf("Thread 2: enter\n");

while (!x.load(std::memory_order_seq_cst))
printf("Thread 2: x load fail\n");

printf("Thread 2: x load success\n");
if (y.load(std::memory_order_seq_cst))
{
++z;
printf("Thread 2: ++z \n");
}
}
void read_y_then_x()
{

printf("Thread 3: enter\n");

while (!y.load(std::memory_order_seq_cst))
printf("Thread 3: y load fail\n");

printf("Thread 3: y load success\n");
if (x.load(std::memory_order_seq_cst))
{
++z;
printf("Thread 3: ++z \n");
}
}

int main()
{
x = false;
y = false;
z = 0;
std::thread a(write_x);
std::thread b(write_y);
std::thread c(read_x_then_y);
std::thread d(read_y_then_x);
a.join();
b.join();
c.join();
d.join();
assert(z.load() != 0);
}

assert(z.load()!=0); can never fail. Sample executions follow.

Some possible execution orders (x86_64, gcc 5.4.0, no optimization):

Thread 1: x store 
Thread 4: y store
Thread 2: enter
Thread 2: x load success
Thread 2: ++z
Thread 3: enter
Thread 3: y load success
Thread 3: ++z
# separator between different runs
---
Thread 1: x store
Thread 3: enter
Thread 3: y load success
Thread 3: ++z
Thread 2: enter
Thread 2: x load success
Thread 2: ++z
Thread 4: y store
---
Thread 4: y store
Thread 1: x store
Thread 2: enter
Thread 2: x load success
Thread 2: ++z
Thread 3: enter
Thread 3: y load success
Thread 3: ++z
---
Thread 2: enter
Thread 2: x load fail
Thread 2: x load fail
Thread 2: x load success
Thread 2: ++z
Thread 4: y store
Thread 3: enter
Thread 3: y load success
Thread 3: ++z
Thread 1: x store # this print was delayed

Note that the printed order is only a rough guide, because a print can be delayed relative to the store it reports, as in the last run above.

My reading of this: because the sequentially consistent ordering is used, there is a single total order, and both reader threads see consistent states for both x and y. For example, suppose x is stored true first and y is not yet stored: read_x_then_y can exit its while loop, but since y is false it finishes without executing ++z. Meanwhile, read_y_then_x sees the same order, first x == true with y == false, so it spins on its while loop; once y is stored true and the loop exits, x is guaranteed to still read true (it cannot chaotically read the initial false), so ++z executes. Whichever of x and y is stored first, at least one thread must execute ++z, and possibly both do.

Summary: sequential consistency is the most straightforward and intuitive ordering, but it’s also the most expensive memory ordering.

NON-SEQUENTIALLY CONSISTENT MEMORY ORDERINGS

there’s no longer a single global order of events: different threads can see different views of the same operations.

threads don’t have to agree on the order of events.

The reason: it’s not just that the compiler can reorder the instructions. Even if the threads are running the same bit of code, they can disagree on the order of events because, in the absence of explicit ordering constraints, the different CPU caches and internal buffers can hold different values for the same memory.

In the absence of other ordering constraints, the only requirement is that all threads agree on the modification order of each individual variable.

RELAXED ORDERING

Operations on the same variable within a single thread still obey happens-before relationships, but there’s almost no requirement on ordering relative to other threads. In other words, there is ordering within a thread, but no ordering across threads that you can rely on.

The only requirement is that accesses to a single atomic variable from the same thread can’t be reordered; for example: once a given thread has seen a particular value of an atomic variable, a subsequent read by that thread can’t retrieve an earlier value of the variable.

#include <atomic>
#include <thread>
#include <assert.h>
std::atomic<bool> x,y;
std::atomic<int> z;

void write_x_then_y()
{
x.store(true,std::memory_order_relaxed);
y.store(true,std::memory_order_relaxed);
}

void read_y_then_x()
{
while(!y.load(std::memory_order_relaxed));
if(x.load(std::memory_order_relaxed))
++z;
}

int main() {
x=false;
y=false;
z=0;
std::thread a(write_x_then_y);
std::thread b(read_y_then_x);
a.join();
b.join();
assert(z.load()!=0);
}

In this example, thread a is guaranteed to execute the stores in write_x_then_y in order, but the order seen from thread b need not match: b can observe y as true while x still loads as false, in which case ++z is never executed and the assert fires.

#include <thread>
#include <atomic>
#include <iostream>

std::atomic<int> x(0),y(0),z(0);
std::atomic<bool> go(false);
unsigned const loop_count=10;

struct read_values
{
    int x,y,z;
};

read_values values1[loop_count];
read_values values2[loop_count];
read_values values3[loop_count];
read_values values4[loop_count];
read_values values5[loop_count];

void increment(std::atomic<int>* var_to_inc,read_values* values)
{
    while(!go)
        std::this_thread::yield();
    for(unsigned i=0;i<loop_count;++i)
    {
        values[i].x=x.load(std::memory_order_relaxed);
        values[i].y=y.load(std::memory_order_relaxed);
        values[i].z=z.load(std::memory_order_relaxed);
        var_to_inc->store(i+1,std::memory_order_relaxed);
        std::this_thread::yield();
    }
}

void read_vals(read_values* values)
{
    while(!go)
        std::this_thread::yield();
    for(unsigned i=0;i<loop_count;++i)
    {
        values[i].x=x.load(std::memory_order_relaxed);
        values[i].y=y.load(std::memory_order_relaxed);
        values[i].z=z.load(std::memory_order_relaxed);
        std::this_thread::yield();
    }
}

void print(read_values* v)
{
    for(unsigned i=0;i<loop_count;++i)
    {
        if(i)
            std::cout<<",";
        std::cout<<"("<<v[i].x<<","<<v[i].y<<","<<v[i].z<<")";
    }
    std::cout<<std::endl;
}

int main()
{
    std::thread t1(increment,&x,values1);
    std::thread t2(increment,&y,values2);
    std::thread t3(increment,&z,values3);
    std::thread t4(read_vals,values4);
    std::thread t5(read_vals,values5);
    go=true;
    t5.join();
    t4.join();
    t3.join();
    t2.join();
    t1.join();
    print(values1);
    print(values2);
    print(values3);
    print(values4);
    print(values5);
}

One possible output is:

(0,0,0),(1,0,0),(2,0,0),(3,0,0),(4,0,0),(5,7,0),(6,7,8),(7,9,8),(8,9,8),(9,9,10)
(0,0,0),(0,1,0),(0,2,0),(1,3,5),(8,4,5),(8,5,5),(8,6,6),(8,7,9),(10,8,9),(10,9,10)
(0,0,0),(0,0,1),(0,0,2),(0,0,3),(0,0,4),(0,0,5),(0,0,6),(0,0,7),(0,0,8),(0,0,9)
(1,3,0),(2,3,0),(2,4,1),(3,6,4),(3,9,5),(5,10,6),(5,10,8),(5,10,10),(9,10,10),(10,10,10)
(0,0,0),(0,0,0),(0,0,0),(6,3,7),(6,5,7),(7,7,7),(7,8,7),(8,8,7),(8,8,9),(8,8,9)

Several patterns emerge:

  • Even under std::memory_order_relaxed, the order of operations within a single thread is fixed: in every row of output, each of x, y, and z increases monotonically.
  • Because of var_to_inc, the first three rows (produced by increment) show x, y, and z respectively counting up from 0 to 9, which again reflects the point above.
  • A thread only sees its own operations as contiguous; the values it observes from other threads are discontinuous, and different threads observe different results. For example, the third row shows no progress at all from the first two threads, while the two read_vals threads do see progress, but see different values.
  • The basic arrow of time is still respected: no value in any row ever decreases; at most it fails to advance or jumps ahead.
  • Any output satisfying these rules is legal.

UNDERSTANDING RELAXED ORDERING

In this subsection the author gives a very apt and vivid analogy for std::memory_order_relaxed, which I paste here directly.

imagine that each variable is a man in a cubicle with a notepad. On his notepad is a list of values. You can phone him and ask him to give you a value, or you can tell him to write down a new value. If you tell him to write down a new value, he writes it at the bottom of the list. If you ask him for a value, he reads you a number from the list.

The first time you talk to this man, if you ask him for a value, he may give you any value from the list he has on his pad at the time. If you then ask him for another value, he may give you the same one again or a value from farther down the list. He’ll never give you a value from farther up the list. If you tell him to write down a number and then subsequently ask him for a value, he’ll give you either the number you told him to write down or a number below that on the list.

Imagine for a moment that his list starts with the values 5, 10, 23, 3, 1, and 2. If you ask for a value, you could get any of those. If he gives you 10, then the next time you ask he could give you 10 again, or any of the later ones, but not 5. If you call him five times, he could say “10, 10, 1, 2, 2,” for example. If you tell him to write down 42, he’ll add it to the end of the list. If you ask him for a number again, he’ll keep telling you “42” until he has another number on his list and he feels like telling it to you. Now, imagine your friend Carl also has this man’s number. Carl can also phone him and either ask him to write down a number or ask for one, and he applies the same rules to Carl as he does to you. He has only one phone, so he can only deal with one of you at a time, so the list on his pad is a nice straightforward list. But just because you got him to write down a new number doesn’t mean he has to tell it to Carl, and vice versa. If Carl asked him for a number and was told “23,” then just because you asked the man to write down 42 doesn’t mean he’ll tell that to Carl next time. He may tell Carl any of the numbers 23, 3, 1, 2, 42, or even the 67 that Fred told him to write down after you called. He could very well tell Carl “23, 3, 3, 1, 67” without being inconsistent with what he told you. It’s like he keeps track of which number he told to whom with a little moveable sticky note for each person, like in figure 5.5.

Now imagine that there’s not just one man in a cubicle but a whole cubicle farm, with loads of men with phones and notepads. These are all our atomic variables. Each variable has its own modification order (the list of values on the pad), but there’s no relationship between them at all. If each caller (you, Carl, Anne, Dave, and Fred) is a thread, then this is what you get when every operation uses memory_order_relaxed. There are a few additional things you can tell the man in the cubicle, such as “Write down this number, and tell me what was at the bottom of the list” (exchange) and “Write down this number if the number on the bottom of the list is that; otherwise tell me what I should have guessed” (compare_exchange_strong), but that doesn’t affect the general principle.

ACQUIRE-RELEASE ORDERING

Used in pairs: Synchronization is pairwise between the thread that does the release and the thread that does the acquire. A release operation synchronizes-with an acquire operation that reads the value written.

#include <atomic>
#include <thread>
#include <assert.h>

std::atomic<bool> x,y;
std::atomic<int> z;

void write_x_then_y()
{
    x.store(true,std::memory_order_relaxed);
    y.store(true,std::memory_order_release);
}

void read_y_then_x()
{
    while(!y.load(std::memory_order_acquire));
    if(x.load(std::memory_order_relaxed))
        ++z;
}

int main()
{
    x=false;
    y=false;
    z=0;
    std::thread a(write_x_then_y);
    std::thread b(read_y_then_x);
    a.join();
    b.join();
    assert(z.load()!=0);
}

In this example the assert can never fire, because acquire-release semantics are used.

Note the role of the while loop in read_y_then_x: without it, the assert could fire, because the load of y might read false before the store and then nothing would synchronize the load of x.

The author extends the men-with-notepads-in-cubicles analogy with the notion of a batch to explain acquire-release semantics.

Acquire-release semantics are also transitive, so a chain of synchronization across several threads can be built: it’s transitive: if A inter-thread happens before B and B inter-thread happens before C, then A inter-thread happens before C.

TRANSITIVE SYNCHRONIZATION WITH ACQUIRE-RELEASE ORDERING

In this subsection the author introduces the sequenced-before relation, which can be understood simply as the ordering of operations (statements) within a single thread.

#include <atomic>
#include <thread>
#include <assert.h>

std::atomic<int> data[5];
std::atomic<bool> sync1(false),sync2(false);

void thread_1()
{
    data[0].store(42,std::memory_order_relaxed);
    data[1].store(97,std::memory_order_relaxed);
    data[2].store(17,std::memory_order_relaxed);
    data[3].store(-141,std::memory_order_relaxed);
    data[4].store(2003,std::memory_order_relaxed);
    sync1.store(true,std::memory_order_release);
}

void thread_2()
{
    while(!sync1.load(std::memory_order_acquire));
    sync2.store(true,std::memory_order_release);
}

void thread_3()
{
    while(!sync2.load(std::memory_order_acquire));
    assert(data[0].load(std::memory_order_relaxed)==42);
    assert(data[1].load(std::memory_order_relaxed)==97);
    assert(data[2].load(std::memory_order_relaxed)==17);
    assert(data[3].load(std::memory_order_relaxed)==-141);
    assert(data[4].load(std::memory_order_relaxed)==2003);
}

int main()
{
    std::thread t1(thread_1);
    std::thread t2(thread_2);
    std::thread t3(thread_3);
    t1.join();
    t2.join();
    t3.join();
}

Thanks to the transitivity of acquire-release semantics, a chain can be built: the data stores happen before the sync1 store, which happens before the sync1 load, which happens before the sync2 store, which happens before the sync2 load, which happens before the data loads.

sync1 and sync2 can also be combined into a single variable by using std::memory_order_acq_rel (acquire then release) together with a read-modify-write operation that performs both the check and the synchronization. The improved program:

std::atomic<int> sync(0);

void thread_1()
{
    // ...
    sync.store(1,std::memory_order_release);
}

void thread_2()
{
    int expected=1;
    while(!sync.compare_exchange_strong(expected,2,std::memory_order_acq_rel))
        expected=1;
}

void thread_3()
{
    while(sync.load(std::memory_order_acquire)<2);
    // ...
}

Other atomic operations can also be combined with the remaining memory-order options, but whether a combination is meaningful, and how to use it, requires careful judgment by the programmer. For example: A fetch_sub operation with memory_order_acquire semantics doesn’t synchronize with anything, even though it stores a value, because it isn’t a release operation.

How acquire-release semantics combine with the other options:

  • If you mix acquire-release operations with sequentially consistent operations,
    • the sequentially consistent loads behave like loads with acquire semantics, and
    • sequentially consistent stores behave like stores with release semantics.
    • Sequentially consistent read-modify-write operations behave as both acquire and release operations.
  • Relaxed operations are still relaxed but are bound by the additional synchronizes-with and consequent happens-before relationships introduced through the use of acquire-release semantics.

A mutex itself also embodies acquire-release semantics: locking a mutex is an acquire operation, and unlocking the mutex is a release operation.

When to use acquire-release semantics: if you use acquire and release orderings on atomic variables to build a simple lock, then from the point of view of code that uses the lock, the behavior will appear sequentially consistent, even though the internal operations are not. (This is where it differs from sequentially consistent ordering, so it can be used, where appropriate, to squeeze out performance.)

DATA DEPENDENCY WITH ACQUIRE-RELEASE ORDERING AND MEMORY_ORDER_CONSUME

The C++17 standard explicitly recommends that you do not use memory_order_consume.

It’s all about data dependencies. There are two new relations that deal with data dependencies: dependency-ordered-before and carries-a-dependency-to.

  • carries-a-dependency-to applies strictly within a single thread and models the data dependency between operations. This relation is also transitive.
  • dependency-ordered-before relationship can apply between threads. It’s introduced by using atomic load operations tagged with memory_order_consume.
    • This is a special case of memory_order_acquire that limits the synchronized data to direct dependencies; a store operation (A) tagged with memory_order_release, memory_order_acq_rel, or memory_order_seq_cst is dependency-ordered-before a load operation (B) tagged with memory_order_consume if the consume reads the value stored.
    • Also transitive.
    • Use case: where the atomic operation loads a pointer to some data. By using memory_order_consume on the load and memory_order_release on the prior store, you ensure that the pointed-to data is correctly synchronized, without imposing any synchronization requirements on any other nondependent data.
#include <string>
#include <thread>
#include <atomic>
#include <chrono>
#include <assert.h>

struct X
{
    int i;
    std::string s;
};

std::atomic<X*> p;
std::atomic<int> a;

void create_x()
{
    X* x=new X;
    x->i=42;
    x->s="hello";
    a.store(99,std::memory_order_relaxed);
    p.store(x,std::memory_order_release);
}

void use_x()
{
    X* x;
    while(!(x=p.load(std::memory_order_consume)))
        std::this_thread::sleep_for(std::chrono::microseconds(1));
    assert(x->i==42);
    assert(x->s=="hello");
    assert(a.load(std::memory_order_relaxed)==99);
}

int main()
{
    std::thread t1(create_x);
    std::thread t2(use_x);
    t1.join();
    t2.join();
}

In the example above, assert(x->i==42); and assert(x->s=="hello"); can never fire, but assert(a.load(std::memory_order_relaxed)==99); may fire, because a carries no dependency from p and the operations on it may be reordered.

How to break the dependency chain (carries-a-dependency-to):

  1. std::kill_dependency(): a simple function template that copies the supplied argument to the return value but breaks the dependency chain in doing so.
  2. In real code, you should always use memory_order_acquire where you might be tempted to use memory_order_consume, and std::kill_dependency is unnecessary.

5.3.4 Release sequences and synchronizes-with

the chain of operations constitutes a release sequence and the initial store synchronizes with (for memory_order_acquire or memory_order_seq_cst) or is dependency-ordered-before (for memory_order_consume) the final load.

#include <atomic>
#include <thread>
#include <vector>

std::vector<int> queue_data;
std::atomic<int> count;

void wait_for_more_items() {}
void process(int data) {}

void populate_queue()
{
    unsigned const number_of_items=20;
    queue_data.clear();
    for(unsigned i=0;i<number_of_items;++i)
    {
        queue_data.push_back(i);
    }
    count.store(number_of_items,std::memory_order_release);
}

void consume_queue_items()
{
    while(true)
    {
        int item_index;
        if((item_index=count.fetch_sub(1,std::memory_order_acquire))<=0)
        {
            wait_for_more_items(); // real code would wait and retry;
            return;                // here we return so the example terminates
        }
        process(queue_data[item_index-1]);
    }
}

int main()
{
    std::thread a(populate_queue);
    std::thread b(consume_queue_items);
    std::thread c(consume_queue_items);
    a.join();
    b.join();
    c.join();
}

In the book's figure, the solid lines are happens-before relationships and the dashed lines are the release sequence.

The point of the release sequence is that both consumer threads synchronize with the count.store() in thread a rather than racing on it, even when the value a consumer reads was written by the other consumer's fetch_sub instead of by the store itself; the RMW operations together with acquire-release semantics guarantee that the release sequence holds.

PS. The micro-level basis here is probably that RMW operations are built on something like compare-and-swap (CAS), which is how the absence of a data race can be achieved. Alternatively, the implementation could use a semaphore to avoid starvation.

5.3.5 Fences

Fences are also commonly called memory barriers, and they get their name because they put a line in the code that certain operations can’t cross.

Practical significance: relaxed operations on separate variables can usually be freely reordered by the compiler or the hardware. Fences restrict this freedom and introduce happens-before and synchronizes-with relationships that weren’t present before.

#include <atomic>
#include <thread>
#include <assert.h>

std::atomic<bool> x,y;
std::atomic<int> z;

void write_x_then_y()
{
    x.store(true,std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    y.store(true,std::memory_order_relaxed);
}

void read_y_then_x()
{
    while(!y.load(std::memory_order_relaxed));
    std::atomic_thread_fence(std::memory_order_acquire);
    if(x.load(std::memory_order_relaxed))
        ++z;
}

int main()
{
    x=false;
    y=false;
    z=0;
    std::thread a(write_x_then_y);
    std::thread b(read_y_then_x);
    a.join();
    b.join();
    assert(z.load()!=0);
}

The added std::atomic_thread_fence(std::memory_order_release) and std::atomic_thread_fence(std::memory_order_acquire) constrain ordering just as acquire-release operations on the atomic variables themselves would.

it’s important to note that the synchronization point is the fence itself. Pay attention to where the fence is placed!

5.3.6 Ordering non-atomic operations with atomics

#include <atomic>
#include <thread>
#include <assert.h>

bool x=false; // not atomic
std::atomic<bool> y;
std::atomic<int> z;

void write_x_then_y()
{
    x=true; // not atomic
    std::atomic_thread_fence(std::memory_order_release);
    y.store(true,std::memory_order_relaxed);
}

void read_y_then_x()
{
    while(!y.load(std::memory_order_relaxed));
    std::atomic_thread_fence(std::memory_order_acquire);
    if(x)
        ++z;
}

int main()
{
    x=false;
    y=false;
    z=0;
    std::thread a(write_x_then_y);
    std::thread b(read_y_then_x);
    a.join();
    b.join();
    assert(z.load()!=0);
}

With the fences in place, ordering can be imposed even on non-atomic data.

5.3.7 Ordering non-atomic operations

Ordering when atomic and non-atomic statements are mixed: If a non-atomic operation is sequenced before an atomic operation, and that atomic operation happens before an operation in another thread, the non-atomic operation also happens before that operation in the other thread.

This connects the mid-level synchronization machinery to the logic of the top-level user-facing interfaces: Each of the synchronization mechanisms described in chapters 2, 3, and 4 will provide ordering guarantees in terms of the synchronizes-with relationship.

The list below may be a bit dry, but it clears up many ordering questions we take for granted yet could scrutinize (and doubles as a summary review).

  • std::thread

    • The completion of the std::thread constructor synchronizes with the invocation of the supplied function or callable object on the new thread.
    • The completion of a thread synchronizes with the return from a successful call to join on the std::thread object that owns that thread.
  • std::mutex, std::timed_mutex, std::recursive_mutex, std::recursive_timed_mutex

    • All calls to lock and unlock, and successful calls to try_lock, try_lock_for, or try_lock_until, on a given mutex object form a single total order: the lock order of the mutex.
    • A call to unlock on a given mutex object synchronizes with a subsequent call to lock, or a subsequent successful call to try_lock, try_lock_for, or try_lock_until, on that object in the lock order of the mutex.
    • Failed calls to try_lock, try_lock_for, or try_lock_until do not participate in any synchronization relationships.
  • std::shared_mutex, std::shared_timed_mutex

    • All calls to lock, unlock, lock_shared, and unlock_shared, and successful calls to try_lock, try_lock_for, try_lock_until, try_lock_shared, try_lock_shared_for, or try_lock_shared_until, on a given mutex object form a single total order: the lock order of the mutex.
    • A call to unlock on a given mutex object synchronizes with a subsequent call to lock or lock_shared, or a successful call to try_lock, try_lock_for, try_lock_until, try_lock_shared, try_lock_shared_for, or try_lock_shared_until, on that object in the lock order of the mutex.
    • Failed calls to try_lock, try_lock_for, try_lock_until, try_lock_shared, try_lock_shared_for, or try_lock_shared_until do not participate in any synchronization relationships.
  • std::promise, std::future AND std::shared_future

    • The successful completion of a call to set_value or set_exception on a given std::promise object synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the promise.
    • The destructor of a given std::promise object that stores an std::future_error exception in the shared asynchronous state associated with the promise synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the promise.
  • std::packaged_task, std::future AND std::shared_future

    • The successful completion of a call to the function call operator of a given std::packaged_task object synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the packaged task.
    • The destructor of a given std::packaged_task object that stores an std::future_error exception in the shared asynchronous state associated with the packaged task synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the packaged task.
  • std::async, std::future AND std::shared_future

    • The completion of the thread running a task launched via a call to std::async with a policy of std::launch::async synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the spawned task.
    • The completion of a task launched via a call to std::async with a policy of std::launch::deferred synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the promise.
  • std::experimental::future, std::experimental::shared_future AND CONTINUATIONS

    • The event that causes an asynchronous shared state to become ready synchronizes with the invocation of a continuation function scheduled on that shared state.
    • The completion of a continuation function synchronizes with a successful return from a call to wait or get, or a call to wait_for or wait_until that returns std::future_status::ready on a future that shares the same asynchronous state as the future returned from the call to then that scheduled the continuation, or the invocation of any continuation scheduled on that future.
  • std::experimental::latch

    • The invocation of each call to count_down or count_down_and_wait on a given instance of std::experimental::latch synchronizes with the completion of each successful call to wait or count_down_and_wait on that latch.
  • std::experimental::barrier

    • The invocation of each call to arrive_and_wait or arrive_and_drop on a given instance of std::experimental::barrier synchronizes with the completion of each subsequent successful call to arrive_and_wait on that barrier.
  • std::experimental::flex_barrier

    • The invocation of each call to arrive_and_wait or arrive_and_drop on a given instance of std::experimental::flex_barrier synchronizes with the completion of each subsequent successful call to arrive_and_wait on that barrier.
    • The invocation of each call to arrive_and_wait or arrive_and_drop on a given instance of std::experimental::flex_barrier synchronizes with the subsequent invocation of the completion function on that barrier.
    • The return from the completion function on a given instance of std::experimental::flex_barrier synchronizes with the completion of each call to arrive_and_wait on that barrier that was blocked waiting for that barrier when the completion function was invoked.
  • std::condition_variable AND std::condition_variable_any

    • Condition variables do not provide any synchronization relationships. They are optimizations over busy-wait loops, and all the synchronization is provided by the operations on the associated mutex.
Author: cx

Published: 2022-10-01

Updated: 2022-10-25