Saturday, June 25, 2022

User space Atomic Operations

An atomic operation is one whose execution cannot be interleaved with other instructions in a way that produces an incorrect result.

Can the Linux kernel's atomic.h be used in user space?

GCC atomic builtins are available in GCC v4.1 and later.

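  • They are compatible with section 7.4 of the Intel Itanium ABI, so they do not carry the usual GCC "__builtin_" prefix. They are also overloaded to work on objects of 1, 2, 4, or 8 bytes.
  • Not every operation is supported on every target processor. When one is not, GCC emits a warning and generates a call to an external function whose name has an extra "_n" suffix, where n is the size of the data type.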
  • In most cases these builtins are considered a "full barrier", that is, no memory operand will be moved across the operation, either forward or backward. Further, instructions will be issued as necessary to prevent the processor from speculating loads across the operation and from queuing stores after the operation.
  • All of the routines are described in the Intel documentation to take "an optional list of variables protected by the memory barrier". It's not clear what is meant by that; it could mean that only the following variables are protected, or it could mean that these variables should in addition be protected. At present GCC ignores this list and protects all variables which are globally accessible. If in the future we make some use of this list, an empty list will continue to mean all globally accessible variables.
  • Apply add/sub/or/and/xor/nand to the value and return the value it held before the operation
    type __sync_fetch_and_add (type *ptr, type value, ...)
    type __sync_fetch_and_sub (type *ptr, type value, ...)
    type __sync_fetch_and_or (type *ptr, type value, ...)
    type __sync_fetch_and_and (type *ptr, type value, ...)
    type __sync_fetch_and_xor (type *ptr, type value, ...)
    type __sync_fetch_and_nand (type *ptr, type value, ...)
    Equivalent to
    { tmp = *ptr; *ptr op= value; return tmp; }
    { tmp = *ptr; *ptr = ~(tmp & value); return tmp; } // nand, GCC 4.4 and later; earlier versions computed ~tmp & value
  • Apply add/sub/or/and/xor/nand to the value and return the new value
    type __sync_add_and_fetch (type *ptr, type value, ...)
    type __sync_sub_and_fetch (type *ptr, type value, ...)
    type __sync_or_and_fetch (type *ptr, type value, ...)
    type __sync_and_and_fetch (type *ptr, type value, ...)
    type __sync_xor_and_fetch (type *ptr, type value, ...)
    type __sync_nand_and_fetch (type *ptr, type value, ...)
    Equivalent to
    { *ptr op= value; return *ptr; }
    { *ptr = ~(*ptr & value); return *ptr; } // nand, GCC 4.4 and later; earlier versions computed ~*ptr & value
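  • Compare and swap: if *ptr equals oldval, write newval to *ptr
    bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
    type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)
    The "bool" version returns whether the comparison succeeded and newval was written. The "val" version returns the contents of *ptr before the operation.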
  • full memory barrier
    __sync_synchronize (...)
  • Not a traditional test-and-set but an atomic exchange: writes value to *ptr and returns the previous contents of *ptr.
    type __sync_lock_test_and_set (type *ptr, type value, ...)
    Many targets have only partial support for this operation. In this case, a target may support reduced functionality here by which the only valid value to store is the immediate constant 1. The exact value actually stored in *ptr is implementation defined.
    This builtin is not a full barrier but an acquire barrier. This means that references after the builtin cannot move to (or be speculated to) before the builtin, but previous memory stores may not be globally visible yet, and previous memory loads may not yet be satisfied.
  • Releases the lock acquired by __sync_lock_test_and_set; normally this means writing 0 to *ptr
    void __sync_lock_release (type *ptr, ...)
    This builtin is not a full barrier but a release barrier. This means that all previous memory stores are globally visible, and all previous memory loads have been satisfied, but following memory reads are not prevented from being speculated to before the barrier.
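
The builtins listed above can be exercised with a small sketch like the following. It assumes GCC or a compatible compiler; the function names (demo_fetch_and_add, store_max, spin_lock, and so on) are made up for illustration and are not part of GCC. The nand check assumes the GCC 4.4 and later semantics, and the spinlock assumes a target where storing 1 through __sync_lock_test_and_set is supported.

```c
#include <assert.h>
#include <stdbool.h>

/* fetch-then-op: returns the value held before the addition. */
int demo_fetch_and_add(void) {
    int x = 10;
    int old = __sync_fetch_and_add(&x, 5);
    assert(x == 15);           /* the update itself did happen */
    return old;                /* the old value, 10 */
}

/* op-then-fetch: returns the value held after the addition. */
int demo_add_and_fetch(void) {
    int x = 10;
    return __sync_add_and_fetch(&x, 5);
}

/* nand as implemented since GCC 4.4: ~(*ptr & value). */
unsigned demo_nand_and_fetch(void) {
    unsigned x = 0x0Fu;
    return __sync_nand_and_fetch(&x, 0x3Cu);   /* ~(0x0F & 0x3C) == ~0x0C */
}

/* CAS retry loop: raise *ptr to candidate if candidate is larger.
   Returns true if this call performed the store. */
bool store_max(int *ptr, int candidate) {
    int seen = *ptr;           /* plain read is enough for a first guess */
    while (candidate > seen) {
        int prev = __sync_val_compare_and_swap(ptr, seen, candidate);
        if (prev == seen)
            return true;       /* *ptr still held seen, so our swap won */
        seen = prev;           /* another writer got in first; re-check */
    }
    return false;
}

/* Minimal spinlock: test_and_set is the acquire side. */
void spin_lock(volatile int *lock) {
    while (__sync_lock_test_and_set(lock, 1))
        ;                      /* old value was nonzero: lock held, keep spinning */
}

/* __sync_lock_release is the release side; it writes 0 to *lock. */
void spin_unlock(volatile int *lock) {
    __sync_lock_release(lock);
}
```

The difference between the two families shows up only in the return value: demo_fetch_and_add hands back 10 and demo_add_and_fetch hands back 15, although both leave 15 in memory.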

https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

For simple types, the atomic_xxx types (for example atomic_int) can be used.

C++11 and C11 turned the low-level hardware instructions into standard functions, such as atomic_fetch_add.
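
As a rough illustration of the C11 interface, the sketch below has several threads increment a shared atomic_int with atomic_fetch_add. The thread count, iteration count, and the names worker and run_counter_demo are assumptions chosen for the example; it also assumes a POSIX system with pthreads (older toolchains may need -pthread when linking).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define THREADS 4
#define ITERS   100000

static atomic_int counter;     /* static storage, so it starts at 0 */

/* Each worker adds 1 to the shared counter ITERS times. */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < ITERS; i++)
        atomic_fetch_add(&counter, 1);
    return NULL;
}

/* Starts THREADS workers, waits for them, and returns the final count. */
int run_counter_demo(void) {
    pthread_t tid[THREADS];

    atomic_store(&counter, 0);
    for (int i = 0; i < THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < THREADS; i++)
        pthread_join(tid[i], NULL);

    return atomic_load(&counter);
}
```

Because every increment is atomic, run_counter_demo returns THREADS * ITERS no matter how the threads interleave; with a plain int and counter++ the result could come out lower.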

The article "Comparing the performance of atomic, spinlock and mutex" compares the time taken to run ten million iterations: direct access 0.070, Atomic 0.457, LOCK 0.481, Spinlock 0.541, Mutex 22.667.
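
A comparison of this kind might be timed along the following lines. This is only a sketch and not the cited article's code: it runs single-threaded and uncontended, measures CPU time with clock(), and covers just three of the variants, so its numbers will not match the figures above. The function names and the run_comparison helper are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define ITERATIONS 10000000L   /* ten million, as in the comparison above */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

/* CPU seconds elapsed since start. */
static double elapsed(clock_t start) {
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Direct access; volatile only keeps the compiler from folding the loop. */
double time_plain(volatile long *counter) {
    clock_t start = clock();
    for (long i = 0; i < ITERATIONS; i++)
        (*counter)++;
    return elapsed(start);
}

/* Atomic increment through the GCC builtin. */
double time_atomic(long *counter) {
    clock_t start = clock();
    for (long i = 0; i < ITERATIONS; i++)
        __sync_fetch_and_add(counter, 1);
    return elapsed(start);
}

/* Increment protected by a pthread mutex. */
double time_mutex(long *counter) {
    clock_t start = clock();
    for (long i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&mutex);
        (*counter)++;
        pthread_mutex_unlock(&mutex);
    }
    return elapsed(start);
}

/* Runs all three variants and stores their times in out[0..2]. */
void run_comparison(double out[3]) {
    volatile long a = 0;
    long b = 0, c = 0;

    out[0] = time_plain(&a);
    out[1] = time_atomic(&b);
    out[2] = time_mutex(&c);

    /* Sanity check: every variant really did all the increments. */
    assert(a == ITERATIONS && b == ITERATIONS && c == ITERATIONS);
}
```

Calling run_comparison from main and printing the three values gives a feel for the relative cost on one machine. A measurement with several contending threads would be needed to see the gap between atomics and mutexes under load.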

References:

  1. https://the-linux-channel.the-toffee-project.org/index.php?page=6-tutorials-linux-user-space-atomic-operations&lang=en
  2. https://wirelessr.gitbooks.io/working-life/content/atomic_variable_in_user_space.html

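Concurrency and Parallelism. Concurrency means several independent tasks are in progress over the same period. Parallelism means assigning those concurrent tasks to different hardware units so that they really do run at the same time. Concurrency can be achieved by time-sharing a single hardware unit, or it can be realized through parallelism.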
