For example, the list_for_each() macro in <linux/list.h>:
#define list_for_each(pos, head) \
	for (pos = (head)->next; prefetch(pos->next), pos != (head); \
		pos = pos->next)
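Measurements showed that this actually performs worse (short lists, prefetches of a NULL pointer); prefetching done by the hardware on its own is no worse. Other notes: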
- reorder structures so that fields commonly accessed together fall in the same cache line
- linked-list => cache-unfriendly
- singly-linked hlist hash table list
- likely()
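
The points above can be sketched in user-space C. This is only an illustration under stated assumptions, not kernel code: `struct list_node`, `list_for_each_noprefetch`, `struct item`, `item_of`, and `item_find` are hypothetical names made up for this example, and `likely()`/`unlikely()` are defined here via the GCC/Clang builtin `__builtin_expect`.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hints in the spirit of likely()/unlikely(); assumes GCC or Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Circular doubly-linked list node, modeled loosely on struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Same traversal as list_for_each(), minus the explicit prefetch(). */
#define list_for_each_noprefetch(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* Hypothetical record: fields touched together during a lookup are placed
 * next to each other at the front; rarely used data goes last. */
struct item {
    struct list_node link;   /* hot: followed on every step */
    int key;                 /* hot: compared on every step */
    int hits;                /* hot: updated on a match */
    char description[256];   /* cold: not touched by the lookup */
};

/* Recover the enclosing item from a pointer to its embedded link. */
#define item_of(ptr) \
    ((struct item *)((char *)(ptr) - offsetof(struct item, link)))

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Return the first item whose key matches, or NULL if there is none. */
static struct item *item_find(struct list_node *head, int key)
{
    struct list_node *pos;

    assert(head != NULL);
    list_for_each_noprefetch(pos, head) {
        struct item *it = item_of(pos);
        /* Assumption for this sketch: most nodes do not match. */
        if (unlikely(it->key == key)) {
            it->hits++;
            return it;
        }
    }
    return NULL;
}
```

To use it, call list_init() on a head, append items with list_add_tail(&item.link, &head), then call item_find(&head, key). Whether the field grouping or the branch hint helps at all depends on the workload and the CPU, so measure before relying on either.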
The problem with prefetch