C是面向过程的语言，但是linux内核却用C实现了一套面向对象的设计模式，linux内核中处处体现着面向对象的思想。

1. 内核链表和list_entry

1.1 普通链表实现

我们在语法书上学到的链表都是在原数据结构的基础上增加指针域next(或者prev)，从而使各个节点能链接在一起。
比如

typedef struct node
{    int data;/*...*/struct node    *prev;struct node   *next;
} Node;

1.2 linux内核链表实现

传统的链表有个最大的缺点就是不好通用化，linux内核的实现可以说是独树一帜，它实现了一种更通用的链表。

struct list_head
{struct list_head    *prev;struct list_head    *next;
};
typedef struct node
{int data;/*...*/struct list_head    list;
}Node;

在数据结构中塞入list_head，那么当组成链表的时候，所有的Node节点的list域串联在一起组成链表，我们一次遍历其实就是遍历所有的list_head域。

1.3 list_entry通过成员获取整个对象

这里有一个小技巧：当我们知道了一个结构体对象Node中某个成员（比如list_head成员list）的地址，通过偏移就可以计算出整个结构体对象起始位置的地址。

linux内核提供了list_entry的宏，来通过成员对象指针来获取到整个对象的指针。

这到底是怎么实现的呢？

#define list_entry(ptr, type, member) /container_of(ptr, type, member)#define container_of(ptr, type, member)                 /
({                                                        /const typeof( ((type *)0)->member ) *__mptr = (ptr);/(type *)( (char *)__mptr - offsetof(type,member) ); /
})#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

我们将宏完全展开，就得到如下的代码：

#define list_entry(ptr, type, member) /((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))

ptr是指向list_head类型链表的指针
type为一个结构
而member为结构type中的一个域，类型为list_head

这个宏返回指向type结构的指针。

(char *)(ptr)使得指针的加减操作步长为一字节
(unsigned long)(&((type *)0)->member)等于ptr指向的member到该member所在结构体基地址的偏移字节数。
二者一减便得出该结构体的地址。

2. 内核队列-kfifo

生产者和消费者模型中，生产者创造数据，而消费者读取消息和处理包，或者以其他方式消费这些数据。

最简单的实现方法是队列，生产者将数据推进队列，消费者从队列中读取数据。消费者获取数据的顺序和生产者推入队列的顺序一致。

队列，即FIFO（First in first out，先进先出）。Linux内核通用队列实现称为kfifo，在kernel/kfifo.c文件中实现，声明在<linux/kfifo.h>文件中。

2.1 kfifo实现

首先看一下kfifo的数据结构：

struct kfifo {unsigned char *buffer;     /* the buffer holding the data */unsigned int size;         /* the size of the allocated buffer */unsigned int in;           /* data is added at offset (in % size) */unsigned int out;          /* data is extracted from off. (out % size) */spinlock_t *lock;          /* protects concurrent modifications */
};

定义自旋锁的目的为了防止多进程/线程并发使用kfifo。因为in和out在每次get和out时，发生改变。

2.2 ring buffer

kfifo的巧妙之处在于in和out定义为无符号类型，在put和get时，in和out都是增加，当达到最大值时，产生溢出，使得从0开始，进行循环使用。

初始化
put

3. get

put超出末尾，移到前面

代码如下：

static inline unsigned int kfifo_put(struct kfifo *fifo,const unsigned char *buffer, unsigned int len)
{unsigned long flags;unsigned int ret;spin_lock_irqsave(fifo->lock, flags);ret = __kfifo_put(fifo, buffer, len);spin_unlock_irqrestore(fifo->lock, flags);return ret;
}static inline unsigned int kfifo_get(struct kfifo *fifo,unsigned char *buffer, unsigned int len)
{unsigned long flags;unsigned int ret;spin_lock_irqsave(fifo->lock, flags);ret = __kfifo_get(fifo, buffer, len);//当fifo->in == fifo->out时，buufer为空if (fifo->in == fifo->out)fifo->in = fifo->out = 0;spin_unlock_irqrestore(fifo->lock, flags);return ret;
}unsigned int __kfifo_put(struct kfifo *fifo,const unsigned char *buffer, unsigned int len)
{unsigned int l;//buffer中空的长度len = min(len, fifo->size - fifo->in + fifo->out);/** Ensure that we sample the fifo->out index -before- we* start putting bytes into the kfifo.*/smp_mb();/* first put the data starting from fifo->in to buffer end */l = min(len, fifo->size - (fifo->in & (fifo->size - 1)));memcpy(fifo->buffer + (fifo->in & (fifo->size - 1)), buffer, l);/* then put the rest (if any) at the beginning of the buffer */memcpy(fifo->buffer, buffer + l, len - l);/** Ensure that we add the bytes to the kfifo -before-* we update the fifo->in index.*/smp_wmb();fifo->in += len;  //每次累加，到达最大值后溢出，自动转为0return len;
}unsigned int __kfifo_get(struct kfifo *fifo,unsigned char *buffer, unsigned int len)
{unsigned int l;//有数据的缓冲区的长度len = min(len, fifo->in - fifo->out);/** Ensure that we sample the fifo->in index -before- we* start removing bytes from the kfifo.*/smp_rmb();/* first get the data from fifo->out until the end of the buffer */l = min(len, fifo->size - (fifo->out & (fifo->size - 1)));memcpy(buffer, fifo->buffer + (fifo->out & (fifo->size - 1)), l);/* then get the rest (if any) from the beginning of the buffer */memcpy(buffer + l, fifo->buffer, len - l);/** Ensure that we remove the bytes from the kfifo -before-* we update the fifo->out index.*/smp_mb();fifo->out += len; //每次累加，到达最大值后溢出，自动转为0return len;
}

3. 内核哈希

hash 最重要的是选择适当的hash函数，从而平均的分配关键字在桶中的位置，从而优化查找插入和删除所用的时间。

然而任何hash函数都会出现冲突问题。内核采用的解决哈希冲突的方法是：拉链法，拉链法解决冲突的做法是：将所有关键字为同义词的结点链接在同一个链表中。

3.1 linux 内核哈希实现

内核哈希数据结构：

struct hlist_head {struct hlist_node *first;
};struct hlist_node {struct hlist_node *next, **pprev;
};

hlist_head表示哈希表的头结点。哈希表中每一个entry(list_entry)所对应的都是一个链表（hlist）.hlist_head结构体只有一个域，即first。First指针指向该hlist链表的第一个结点。
hlist_node结构体有两个域，next和pprev。(1)next指向下个hlist_node结点，倘若改结点是链表的最后一个节点，next则指向NULL。(2)pprev是一个二级指针，它指向前一个节点的next指针。

3.2 为什么需要一个专门的哈希表头？

因为哈希链表并不需要双向循环的技能，它一般适用于单向散列的场景。

所以，为了减少开销，并没有用struct hlist_node{}来代表哈希表头，而是重新设计struct hlist_head{}这个数据结构。此时，一个哈希表头就只需要4Byte了，相比于struct hlist_node{}来说，存储空间已经减少了一半。

这样一来，在需要大量用到哈希链表的场景，其存储空间的节约是非常明显的，特别是在嵌入式设备领域。

3.3 使用pprev二级指针的意义？

在hlist中，表头中没有prev，只有一个first，为了能统一地修改表头的first指针hlist就设计了pprev。

node节点里的pprev其实指向的是其前一个节点里的第一个指针元素的地址。对于hlist_head来说，它里面只有一个指针元素，就是first指针；而对于hlist_node来说，第一个指针元素就是next。

所以，当在代码中看到类似与 *(hlist_node->pprev) 这样的代码时，表示此时正在哈希表里操作当前节点前一个节点里的第一个指针元素所指向的内存地址。

3.4 hash函数选择

教课书上的hash函数一般都是对模数取余,模数一般就是hash表的长度(桶的个数),通常为了较好的散列性,还把模数调整为一个质数.

那么内核中的hash函数是什么样呢?

其实不同的使用场景，需要用到的hash函数是不同的。

比如linux netfilter中对会话的hash 函数

static u_int32_t __hash_conntrack(const struct nf_conntrack_tuple *tuple,unsigned int size, unsigned int rnd){unsigned int a, b;a = jhash((void *)tuple->src.u3.all, sizeof(tuple->src.u3.all),((tuple->src.l3num) << 16) | tuple->dst.protonum);b = jhash((void *)tuple->dst.u3.all, sizeof(tuple->dst.u3.all),(tuple->src.u.all << 16) | tuple->dst.u.all);return jhash_2words(a, b, rnd) % size;
}static inline u_int32_t hash_conntrack(const struct nf_conntrack_tuple *tuple)
{return __hash_conntrack(tuple, nf_conntrack_htable_size,nf_conntrack_hash_rnd);
}

linux内核数据结构实现--链表、队列和哈希相关推荐

linux内核数据结构之链表
1.前言最近写代码需用到链表结构,正好公共库有关于链表的.第一眼看时,觉得有点新鲜,和我之前见到的链表结构不一样,只有前驱和后继指针,而没有数据域.后来看代码注释发现该代码来自linux内核,在li ...
Linux内核数据结构——链表
目录目录简介单向链表双向链表环形链表 Linux内核中的链表实现 offsetof container_of container_of 第一部分 container_of 第二部分链表初始 ...
go移植linux内核书名叫啥,Go语言移植Linux内核数据结构hlist
hlist(哈希链表)可以通过相应的Hash算法,迅速找到相关的链表Head及节点. 在有些应用场景,比Go标准库提供的list(一种双向链表)更合适. 依照list.h中的源码,我实现了一个Go语言 ...
如何放出Linux内核中的链表大招
前言上回,我们说到Linux内核中max()宏的终极奥义,Linux内核链表也不甘示弱,那么接下来,让我们看看Linux内核中的链表大招. 如何放出Linux内核中的链表大招前言一.链表简介 ( ...
【Linux】Linux内核数据结构：IDR（redix树）
1. 引言最近在系统里遇到了IDR结构体,后来看了一下,是内核的一个基础结构. 这个是怎么引入的,引入是为了什么呢? 最早的时候,我们的结构体是一个类似于大结构体套小结构体. struct A {i ...
FIFO - linux内核数据结构
Linux kernal 鬼斧神工,博大精深,让人叹为观止,拍手叫绝.然匠心独运的设计并非扑朔迷离.盘根错节,真正的匠心独运乃辞简理博.化繁为简,在简洁中昭显优雅和智慧,kfifo就是这样一种数据结构 ...
linux内核 list.h,Linux内核list.h链表分析
Linux Kernel中,常常需要使用双向链表.在~/include/linux/list.h中,就定义了双向链表和常用的function. 链表头如下: struct list_head { st ...
Linux 内核数据结构解析--无锁链表
无锁链表定义在include/linux/llist.h.无锁链表是一种单向链表定义: struct llist_head {struct llist_node *first; };struct l ...
linux 内核数据结构 file_operations、file、inode
文件操作结构将驱动程序操作连接到设备编号,结构定义在<linux/fs.h>,其中包含一组函数指针,每个打开的文件(在内部由一个file结构表示)和一组函数关联(通过包含指向一个file ...

linux内核数据结构实现--链表、队列和哈希