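A binary search tree offers logarithmic average-time performance, but only under the assumption that the input data is sufficiently random.
A hashtable offers constant average-time performance for insertion, deletion, and search. This performance is statistical in nature and does not depend on the randomness of the input elements.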

Overview of hashtable

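A hashtable supports access to and deletion of any named item.
Because the items it operates on are named, a hashtable can serve as a dictionary structure.
A function that maps an element to an "index of acceptable size" is called a hash function.
Because the number of possible elements can exceed the capacity of the array, different elements may be mapped to the same position. This is called a collision.
There are several ways to resolve collisions, including linear probing, quadratic probing, and separate chaining.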

Linear probing

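Load factor: the number of elements divided by the table size. The load factor always lies between 0 and 1, unless separate chaining is used.
With linear probing, if the position computed by the hash function is already occupied, the search moves forward one slot at a time. On reaching the end of the array it wraps around to the beginning and continues until a free slot is found.
Searching for an element works the same way.
Deletion is lazy: the element is only marked as deleted, and the actual removal is deferred until the table is reorganized.

The analysis relies on two assumptions: (1) the table is large enough; (2) the elements are sufficiently independent. If the hash function keeps mapping elements to the same position, the average insertion cost grows far faster than the load factor.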
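The probing, wrap-around, and lazy-deletion rules described above can be sketched as follows. The class name `LinearProbeTable`, the `key % capacity` hash, and the three-state slot marker are assumptions made for illustration only; they do not come from SGI STL, which uses separate chaining instead. The sketch also assumes a capacity greater than 0 and non-negative keys.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity table of ints that resolves collisions by linear probing.
class LinearProbeTable {
public:
    explicit LinearProbeTable(std::size_t capacity) : slots_(capacity), size_(0) {}

    // Returns false if the key is already present or no usable slot exists.
    bool insert(int key) {
        const std::size_t n = slots_.size();
        const std::size_t start = home(key);
        std::size_t target = n;                        // n means "no reusable slot seen yet"
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t pos = (start + i) % n;   // wrap around at the end of the array
            if (slots_[pos].state == Occupied) {
                if (slots_[pos].key == key) return false;   // duplicate
            } else {
                if (target == n) target = pos;         // remember the first empty or deleted slot
                if (slots_[pos].state == Empty) break; // the key cannot lie beyond an empty slot
            }
        }
        if (target == n) return false;                 // table is full
        slots_[target].key = key;
        slots_[target].state = Occupied;
        ++size_;
        return true;
    }

    // Index of the slot holding key, or the capacity if the key is absent.
    std::size_t locate(int key) const {
        const std::size_t n = slots_.size();
        const std::size_t start = home(key);
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t pos = (start + i) % n;
            if (slots_[pos].state == Empty) break;     // deleted slots do not stop the search
            if (slots_[pos].state == Occupied && slots_[pos].key == key) return pos;
        }
        return n;
    }

    bool contains(int key) const { return locate(key) != slots_.size(); }

    // Lazy deletion: the slot is only marked, never physically emptied.
    bool erase(int key) {
        const std::size_t pos = locate(key);
        if (pos == slots_.size()) return false;
        slots_[pos].state = Deleted;
        --size_;
        return true;
    }

    double load_factor() const { return static_cast<double>(size_) / slots_.size(); }

private:
    enum State { Empty, Occupied, Deleted };
    struct Slot {
        int key;
        State state;
        Slot() : key(0), state(Empty) {}
    };

    std::size_t home(int key) const { return static_cast<std::size_t>(key) % slots_.size(); }

    std::vector<Slot> slots_;
    std::size_t size_;
};
```

With a capacity of 7, inserting 3, 10, and 17 (which all hash to slot 3) places them in slots 3, 4, and 5. After `erase(10)`, slot 4 is only marked as deleted, so 17 stays reachable and a later `insert(24)` reuses slot 4.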

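Quadratic probing

F(i) = i^2. If the position H computed for a new element is already occupied, the probe tries H+1^2, H+2^2, H+3^2, and so on, rather than H+1, H+2, and so on.

If the table size is a prime and the load factor is kept below 0.5, inserting a new element takes no more than 2 probes.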
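To make the probe order concrete, here is a small sketch that lists the slots quadratic probing would visit. The function name `quadratic_probe_sequence` is hypothetical, and wrapping each position with `% table_size` is an assumption about how an out-of-range H + i^2 would be folded back into the table.

```cpp
#include <cstddef>
#include <vector>

// Returns the first `count` slots visited when probing from home slot h
// with F(i) = i^2, wrapping around a table of `table_size` slots.
std::vector<std::size_t> quadratic_probe_sequence(std::size_t h,
                                                  std::size_t table_size,
                                                  std::size_t count) {
    std::vector<std::size_t> seq;
    for (std::size_t i = 0; i < count; ++i) {
        seq.push_back((h + i * i) % table_size);  // i = 0 is the home slot itself
    }
    return seq;
}
```

For H = 3 and a table of size 11, the first five slots visited are 3, 4, 7, 1, and 8.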

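Separate chaining

Each table slot maintains a list, and insertion, deletion, and other operations are performed on that list.
The SGI STL hashtable uses separate chaining.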
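A minimal sketch of the idea, assuming int keys and a `key % bucket_count` hash. The class `ChainedSet` is hypothetical and uses `std::forward_list` for brevity, whereas the SGI hashtable maintains its own node type. It also shows why the load factor can exceed 1 under separate chaining.

```cpp
#include <algorithm>
#include <cstddef>
#include <forward_list>
#include <vector>

// Hypothetical set of non-negative ints: a vector of buckets, each bucket a singly linked list.
class ChainedSet {
public:
    explicit ChainedSet(std::size_t bucket_count) : buckets_(bucket_count), size_(0) {}

    bool insert(int key) {
        if (contains(key)) return false;          // keep keys unique
        buckets_[index(key)].push_front(key);     // new keys go to the head of the chain
        ++size_;
        return true;
    }

    bool contains(int key) const {
        const std::forward_list<int>& chain = buckets_[index(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }

    bool erase(int key) {
        if (!contains(key)) return false;
        buckets_[index(key)].remove(key);
        --size_;
        return true;
    }

    // Elements per bucket; not bounded by 1 because each bucket can hold many nodes.
    double load_factor() const { return static_cast<double>(size_) / buckets_.size(); }

private:
    std::size_t index(int key) const { return static_cast<std::size_t>(key) % buckets_.size(); }

    std::vector<std::forward_list<int> > buckets_;
    std::size_t size_;
};
```

Inserting the keys 0 through 6 into a `ChainedSet` with 3 buckets gives a load factor of 7/3, something an open-addressing table cannot reach.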

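hashtable buckets and nodes

The slots of the hashtable's table are called buckets. The name suggests that each slot may hold not just a single node but a whole "bucketful" of nodes.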

template <class Value>
struct __hashtable_node {
    __hashtable_node* next;
    Value val;
};

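The linked list of a bucket does not use the STL's own list or slist; it is maintained directly through the hash table node shown above.
The collection of buckets is implemented with a vector, which gives it the ability to grow.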

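hashtable iterator

The hashtable iterator keeps a link to the whole buckets vector and records the node it currently points to.
Advancing moves one position forward from the current node. Since nodes live in a list, the next pointer is used to advance.
If the current node is the tail of its list, the iterator jumps to the next non-empty bucket, that is, to the head of the next list.
The hashtable iterator has no decrement operation, and hashtable defines no reverse iterator.
See also: "一篇足矣，带你吃透STL源码中hash table(哈希表)与关联式容器hash_set、hash_map" (董哥的黑板报, CSDN blog).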
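The advance rule can be sketched in isolation as below. The `Node` struct, the free function `next_node`, and the `val % buckets.size()` hash are simplifications assumed for illustration; the real iterator reaches the table through its `ht` pointer and calls `bkt_num()`.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    Node* next;
    int val;
};

// Hypothetical stand-in for the iterator's operator++: returns the node that
// follows cur in iteration order, or nullptr when cur is the last node.
// Assumes non-negative values hashed with val % buckets.size().
const Node* next_node(const std::vector<Node*>& buckets, const Node* cur) {
    if (cur->next) return cur->next;               // still inside the same bucket list
    // Tail of the list: recompute the current bucket, then scan forward
    // for the next non-empty bucket.
    std::size_t bucket = static_cast<std::size_t>(cur->val) % buckets.size();
    const Node* next = nullptr;
    while (!next && ++bucket < buckets.size()) {
        next = buckets[bucket];
    }
    return next;
}
```

Because a node only stores a `next` pointer, there is no cheap way to step backwards, which matches the absence of a decrement operation.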

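hashtable data structure

The buckets collection is implemented as a vector to make dynamic growth easy.
<stl_hash_fun.h> defines several ready-made hash functions, all of them function objects (functors). The hash function determines the position of an element, that is, the bucket it belongs to. The function actually called is bkt_num(), which calls the hash function to obtain a value on which the modulus operation can be performed.
The size of the vector is always a prime. 28 primes are prepared in advance, together with a function that looks up the smallest of them that is not less than a given number.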
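The prime lookup can be tried on its own with the sketch below. The names `kNumPrimes`, `kPrimeList`, and `next_prime` are illustrative stand-ins; the values are the 28 primes listed in the complete code of this article, and the sketch assumes `unsigned long` has at least 32 bits.

```cpp
#include <algorithm>
#include <cstddef>

static const int kNumPrimes = 28;
static const unsigned long kPrimeList[kNumPrimes] = {
    53,         97,         193,        389,        769,
    1543,       3079,       6151,       12289,      24593,
    49157,      98317,      196613,     393241,     786433,
    1572869,    3145739,    6291469,    12582917,   25165843,
    50331653,   100663319,  201326611,  402653189,  805306457,
    1610612741, 3221225473ul, 4294967291ul
};

// Returns the smallest listed prime that is not less than n.
// If n is larger than every listed prime, the largest one is returned.
inline unsigned long next_prime(unsigned long n) {
    const unsigned long* first = kPrimeList;
    const unsigned long* last = kPrimeList + kNumPrimes;
    // lower_bound needs a sorted range; the list is in ascending order.
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}
```

For example, `next_prime(50)` and `next_prime(53)` both give 53, while `next_prime(54)` gives 97. Each listed prime is roughly twice the previous one, so the table about doubles on every rebuild.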

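hashtable construction and memory management

See also: "vector的reserve的使用（避免内存重新分配以及内存分配的方式）" (Zero's Zone, CSDN blog).

Deciding which bucket an element falls into is the job of the hash function. SGI STL wraps this step: the task is handed to bkt_num(), which in turn calls the hash function to obtain a value on which the modulus operation can be performed.
The reason for this indirection is that some element types, such as strings, cannot be used directly in a modulus operation.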

    // Version 1: takes a value and the number of buckets
    size_type bkt_num(const value_type& obj, size_t n) const {
        return bkt_num_key(get_key(obj), n);  // calls version 4
    }

    // Version 2: takes only a value
    size_type bkt_num(const value_type& obj) const {
        return bkt_num_key(get_key(obj));  // calls version 3
    }

    // Version 3: takes only a key
    size_type bkt_num_key(const key_type& key) const {
        return bkt_num_key(key, buckets.size());  // calls version 4
    }

    // Version 4: takes a key and the number of buckets
    size_type bkt_num_key(const key_type& key, size_t n) const {
        return hash(key) % n;  // SGI's built-in hash() functors are covered in the hash functions section
    }
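The layering (value, then key, then hash value, then bucket index) can be exercised outside the hashtable with the sketch below. `StringHash`, `SelectName`, and `BucketLocator` are hypothetical names introduced for illustration; the multiply-by-5 string hash is the same scheme as the `__stl_hash_string` function quoted in the hash functions section, applied here to `std::string`.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical hasher: turns a string into a number that supports the modulus operation.
struct StringHash {
    std::size_t operator()(const std::string& s) const {
        unsigned long h = 0;
        for (std::size_t i = 0; i < s.size(); ++i) {
            h = 5 * h + static_cast<unsigned char>(s[i]);
        }
        return static_cast<std::size_t>(h);
    }
};

// Hypothetical key extractor for (name, score) records: the key is the name.
struct SelectName {
    const std::string& operator()(const std::pair<std::string, int>& rec) const {
        return rec.first;
    }
};

// Reduced stand-in for the bkt_num() family: extract the key, hash it, take it modulo n.
template <class Value, class Key, class HashFcn, class ExtractKey>
struct BucketLocator {
    HashFcn hash;
    ExtractKey get_key;

    std::size_t bkt_num(const Value& obj, std::size_t n) const {
        return bkt_num_key(get_key(obj), n);
    }
    std::size_t bkt_num_key(const Key& key, std::size_t n) const {
        return hash(key) % n;
    }
};
```

With 53 buckets, the record ("Hello", 90) hashes to 60976 and lands in bucket 60976 % 53 = 26.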

Copying and clearing

A hash table is composed of a vector and linked lists, so copying it and clearing it as a whole both require care with releasing memory.

    void clear() {
        // Process every bucket
        for (size_type i = 0; i < buckets.size(); ++i) {
            node* cur = buckets[i];
            // Delete every node in this bucket's list
            while (cur != 0) {
                node* next = cur->next;
                delete_node(cur);
                cur = next;
            }
            buckets[i] = 0;  // reset the bucket to null
        }
        num_elements = 0;  // the total number of nodes is now 0
        // Note: the buckets vector itself is not released; it keeps its previous size
    }

    void copy_from(const hashtable& ht) {
        // Empty our own buckets vector first; vector::clear() removes all of its elements
        buckets.clear();
        // Reserve room in our buckets vector to match the other table.
        // If we already have more capacity nothing changes; otherwise the capacity grows.
        buckets.reserve(ht.buckets.size());
        // Insert n null pointers at the end of our buckets vector.
        // The vector is empty at this point, so the end is also the beginning.
        buckets.insert(buckets.end(), ht.buckets.size(), (node*)0);
        __STL_TRY {
            // Process every bucket of the source table
            for (size_type i = 0; i < ht.buckets.size(); ++i) {
                // Each vector element is a pointer to a hashtable node
                if (const node* cur = ht.buckets[i]) {
                    node* copy = new_node(cur->val);
                    buckets[i] = copy;
                    // Copy every remaining node of the same bucket list
                    for (node* next = cur->next; next; cur = next, next = cur->next) {
                        copy->next = new_node(next->val);
                        copy = copy->next;
                    }
                }
            }
            // Record the number of nodes (the size of the hashtable)
            num_elements = ht.num_elements;
        }
        __STL_UNWIND(clear());
    }
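The per-bucket work inside clear() and copy_from() boils down to freeing and deep-copying one singly linked chain. The sketch below shows just that step; the `Node` struct and the functions `copy_chain` and `free_chain` are hypothetical, and plain `new`/`delete` stand in for the node allocator.

```cpp
#include <cstddef>

struct Node {
    Node* next;
    int val;
};

// Deep-copies a chain, preserving node order; returns the head of the copy.
Node* copy_chain(const Node* src) {
    if (!src) return nullptr;
    Node* head = new Node{nullptr, src->val};
    Node* tail = head;
    for (const Node* cur = src->next; cur; cur = cur->next) {
        tail->next = new Node{nullptr, cur->val};  // append a copy of the current source node
        tail = tail->next;
    }
    return head;
}

// Frees every node of a chain and returns how many nodes were released.
std::size_t free_chain(Node* head) {
    std::size_t released = 0;
    while (head) {
        Node* next = head->next;  // save the link before the node is destroyed
        delete head;
        head = next;
        ++released;
    }
    return released;
}
```

Copying node by node is what makes the two tables independent: copying only the bucket pointers would leave both tables sharing, and later double-freeing, the same nodes.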

Complete code

#include <algorithm>  // std::lower_bound
#include <climits>    // UINT_MAX
#include <cstddef>    // std::size_t, std::ptrdiff_t
#include <cstdlib>    // std::exit
#include <iostream>
#include <new>        // placement new, std::set_new_handler
#include <utility>    // std::pair
#include <vector>

#ifdef __STL_USE_EXCEPTIONS
#define __STL_TRY   try
#define __STL_UNWIND(action)   catch(...) { action; throw; }
#else
#define __STL_TRY
#define __STL_UNWIND(action)
#endif

template <class T, class Alloc>
class simple_alloc {
public:
    static T* allocate(std::size_t n) {
        return 0 == n ? 0 : (T*)Alloc::allocate(n * sizeof(T));
    }
    static T* allocate(void) {
        return (T*)Alloc::allocate(sizeof(T));
    }
    static void deallocate(T* p, std::size_t n) {
        if (n != 0) {
            Alloc::deallocate(p, n * sizeof(T));
        }
    }
    static void deallocate(T* p) {
        Alloc::deallocate(p, sizeof(T));
    }
};

namespace Chy {
    template <class T>
    inline T* _allocate(std::ptrdiff_t size, T*) {
        std::set_new_handler(0);
        T* tmp = (T*)(::operator new((std::size_t)(size * sizeof(T))));
        if (tmp == 0) {
            std::cerr << "out of memory" << std::endl;
            std::exit(1);
        }
        return tmp;
    }

    template <class T>
    inline void _deallocate(T* buffer) {
        ::operator delete(buffer);
    }

    template <class T1, class T2>
    inline void _construct(T1* p, const T2& value) {
        new (p) T1(value);  // placement new: constructs a T1 from value at address p
    }

    template <class T>
    inline void _destroy(T* ptr) {
        ptr->~T();
    }

    template <class T>
    class allocator {
    public:
        typedef T              value_type;
        typedef T*             pointer;
        typedef const T*       const_pointer;
        typedef T&             reference;
        typedef const T&       const_reference;
        typedef std::size_t    size_type;
        typedef std::ptrdiff_t difference_type;

        template <class U>
        struct rebind {
            typedef allocator<U> other;
        };

        pointer allocate(size_type n, const void* hint = 0) {
            return _allocate((difference_type)n, (pointer)0);
        }
        void deallocate(pointer p, size_type n) { _deallocate(p); }
        void construct(pointer p, const T& value) { _construct(p, value); }
        void destroy(pointer p) { _destroy(p); }
        pointer address(reference x) { return (pointer)&x; }
        const_pointer const_address(const_reference x) { return (const_pointer)&x; }
        size_type max_size() const { return size_type(UINT_MAX / sizeof(T)); }
    };
}

template <class Value>
struct __hashtable_node {
    __hashtable_node* next;
    Value val;
};

// Assumes long has at least 32 bits
static const int __stl_num_primes = 28;
static const unsigned long __stl_prime_list[__stl_num_primes] = {
    53,         97,         193,        389,        769,
    1543,       3079,       6151,       12289,      24593,
    49157,      98317,      196613,     393241,     786433,
    1572869,    3145739,    6291469,    12582917,   25165843,
    50331653,   100663319,  201326611,  402653189,  805306457,
    1610612741, 3221225473ul, 4294967291ul
};

// Find, among the 28 primes above, the smallest one that is not less than n
inline unsigned long __stl_next_prime(unsigned long n) {
    const unsigned long* first = __stl_prime_list;
    const unsigned long* last = __stl_prime_list + __stl_num_primes;
    // lower_bound() requires a sorted range
    const unsigned long* pos = std::lower_bound(first, last, n);
    return pos == last ? *(last - 1) : *pos;
}

template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc>
struct __hashtable_iterator;

/*
 * Value:       value type stored in a node
 * Key:         key type of a node
 * HashFcn:     type of the hash function
 * ExtractKey:  how to extract the key from a node (function or functor)
 * EqualKey:    how to test two keys for equality (function or functor)
 * Alloc:       the allocator; std::alloc by default
 */
template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc>
class hashtable {
public:
    typedef Key key_type;
    typedef Value value_type;
    typedef HashFcn hasher;       // another name for the template type parameter
    typedef EqualKey key_equal;   // another name for the template type parameter
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;

private:
    // The following three are function objects.
    // <stl_hash_fun.h> defines hashers for several standard types (int, C-style strings, etc.)
    hasher hash;         // hash function
    key_equal equals;    // tests whether two keys are equal
    ExtractKey get_key;  // extracts the key from a node

    typedef __hashtable_node<Value> node;
    // Dedicated node allocator
    typedef simple_alloc<node, Alloc> node_allocator;

    // Allocates and constructs a node
    node* new_node(const value_type& obj) {
        node* n = node_allocator::allocate();
        n->next = 0;
        __STL_TRY {
            Chy::_construct(&n->val, obj);
            return n;
        }
        __STL_UNWIND(node_allocator::deallocate(n));
    }

    // Destroys and releases a node
    void delete_node(node* n) {
        Chy::_destroy(&n->val);
        node_allocator::deallocate(n);
    }

public:
    // The collection of buckets is a vector whose elements are node*.
    // std::vector's default allocator is used here, because Alloc is an SGI-style raw-memory allocator.
    std::vector<node*> buckets;
    size_type num_elements;  // number of nodes

public:
    // Number of buckets, that is, the size of the buckets vector
    size_type bucket_count() const { return buckets.size(); }

    // Maximum possible number of buckets
    size_type max_bucket_count() const {
        // The value is 4294967291
        return __stl_prime_list[__stl_num_primes - 1];
    }

    // Constructor
    hashtable(size_type n, const HashFcn& hf, const EqualKey& eql)
        : hash(hf), equals(eql), get_key(ExtractKey()), num_elements(0) {
        initialize_buckets(n);
    }

    // Initialization
    void initialize_buckets(size_type n) {
        // Example: passing 50 yields 53.
        // Space for 53 elements is reserved and every element is set to 0.
        const size_type n_buckets = next_size(n);
        buckets.reserve(n_buckets);
        // Set the initial value of every bucket to a null node*
        buckets.insert(buckets.begin(), n_buckets, (node*)0);
    }

public:
    // Version 1: takes a value and the number of buckets
    size_type bkt_num(const value_type& obj, std::size_t n) const {
        return bkt_num_key(get_key(obj), n);  // calls version 4
    }
    // Version 2: takes only a value
    size_type bkt_num(const value_type& obj) const {
        return bkt_num_key(get_key(obj));  // calls version 3
    }
    // Version 3: takes only a key
    size_type bkt_num_key(const key_type& key) const {
        return bkt_num_key(key, buckets.size());  // calls version 4
    }
    // Version 4: takes a key and the number of buckets
    size_type bkt_num_key(const key_type& key, std::size_t n) const {
        return hash(key) % n;  // SGI's built-in hash() functors are covered in the hash functions section
    }

public:
    // next_size() returns the listed prime closest to and not less than n
    size_type next_size(size_type n) const { return __stl_next_prime(n); }

    typedef __hashtable_iterator<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> iterator;

    // Insertion and table rebuilding
    // Insert an element; duplicates are not allowed
    std::pair<iterator, bool> insert_unique(const value_type& obj) {
        // Decide whether the table must be rebuilt; grow it if so
        resize(num_elements + 1);
        return insert_unique_noresize(obj);
    }

    // Decides whether the table must be rebuilt. Returns at once if not; rebuilds it otherwise.
    void resize(size_type num_elements_hint) {
        // Rule: compare the element count (including the new element) with the size of the
        // buckets vector. If the former is larger, the table is rebuilt.
        // Hence the maximum length of each bucket list equals the size of the buckets vector.
        const size_type old_n = buckets.size();
        if (old_n < num_elements_hint) {  // a new allocation is needed
            // Compute the next prime
            const size_type n = next_size(num_elements_hint);
            if (n > old_n) {
                std::vector<node*> tmp(n, (node*)0);
                // Process every old bucket
                for (size_type bucket = 0; bucket < old_n; bucket++) {
                    // Head node of the list belonging to this bucket
                    node* first = buckets[bucket];
                    // Process every node of the old bucket's list
                    while (first) {
                        // Find the new bucket this node falls into
                        size_type new_bucket = bkt_num(first->val, n);
                        // (1) Make the old bucket point to the next node of its list
                        buckets[bucket] = first->next;
                        // (2)(3) Insert the current node at the head of the new bucket's list
                        first->next = tmp[new_bucket];
                        tmp[new_bucket] = first;
                        // (4) Return to the rest of the old list for the next node
                        first = buckets[bucket];
                    }
                }
                // Swap the old and new buckets; tmp's memory is released on leaving this scope
                buckets.swap(tmp);
            }
        }
    }

    // Insert a new node without rebuilding the table; duplicate keys are not allowed
    std::pair<iterator, bool> insert_unique_noresize(const value_type& obj) {
        const size_type n = bkt_num(obj);  // obj belongs in bucket n
        node* first = buckets[n];          // first points to the head of the bucket's list
        // If buckets[n] is occupied, first is not 0 and the loop walks the whole list
        for (node* cur = first; cur; cur = cur->next) {
            if (equals(get_key(cur->val), get_key(obj))) {
                // A node with the same key already exists: do not insert, return at once
                return std::pair<iterator, bool>(iterator(cur, this), false);
            }
        }
        // After the loop (or if it was never entered) first still points to the list head
        node* tmp = new_node(obj);  // create the new node
        tmp->next = first;
        buckets[n] = tmp;           // the new node becomes the first node of the list
        ++num_elements;             // one more node
        return std::pair<iterator, bool>(iterator(tmp, this), true);
    }

    // The client may request the other kind of insertion (insert_equal instead of insert_unique)
    // Insert an element; duplicates are allowed
    iterator insert_equal(const value_type& obj) {
        // Decide whether the table must be rebuilt; grow it if so
        resize(num_elements + 1);
        return insert_equal_noresize(obj);
    }

    // Insert a new node without rebuilding the table; duplicate keys are allowed
    iterator insert_equal_noresize(const value_type& obj) {
        const size_type n = bkt_num(obj);  // obj belongs in bucket n
        node* first = buckets[n];          // first points to the head of the bucket's list
        // If buckets[n] is occupied, first is not 0 and the loop walks the whole list
        for (node* cur = first; cur; cur = cur->next) {
            if (equals(get_key(cur->val), get_key(obj))) {
                // A node with the same key was found: insert right after it and return
                node* tmp = new_node(obj);
                tmp->next = cur->next;
                cur->next = tmp;
                ++num_elements;
                return iterator(tmp, this);  // iterator to the new node
            }
        }
        // Reaching this point means no equal key was found
        node* tmp = new_node(obj);
        tmp->next = first;
        buckets[n] = tmp;
        ++num_elements;
        return iterator(tmp, this);
    }

    void clear() {
        // Process every bucket
        for (size_type i = 0; i < buckets.size(); ++i) {
            node* cur = buckets[i];
            // Delete every node in this bucket's list
            while (cur != 0) {
                node* next = cur->next;
                delete_node(cur);
                cur = next;
            }
            buckets[i] = 0;  // reset the bucket to null
        }
        num_elements = 0;  // the total number of nodes is now 0
        // Note: the buckets vector itself is not released; it keeps its previous size
    }

    void copy_from(const hashtable& ht) {
        // Empty our own buckets vector first; vector::clear() removes all of its elements
        buckets.clear();
        // Reserve room to match the other table; the capacity only ever grows
        buckets.reserve(ht.buckets.size());
        // Insert n null pointers at the end; the vector is empty, so the end is the beginning
        buckets.insert(buckets.end(), ht.buckets.size(), (node*)0);
        __STL_TRY {
            for (size_type i = 0; i < ht.buckets.size(); ++i) {
                // Each vector element is a pointer to a hashtable node
                if (const node* cur = ht.buckets[i]) {
                    node* copy = new_node(cur->val);
                    buckets[i] = copy;
                    // Copy every remaining node of the same bucket list
                    for (node* next = cur->next; next; cur = next, next = cur->next) {
                        copy->next = new_node(next->val);
                        copy = copy->next;
                    }
                }
            }
            // Record the number of nodes (the size of the hashtable)
            num_elements = ht.num_elements;
        }
        __STL_UNWIND(clear());
    }
};

template <class Value, class Key, class HashFcn, class ExtractKey, class EqualKey, class Alloc>
struct __hashtable_iterator {
    // Named hashtable_type so that the typedef does not shadow the class template name
    typedef hashtable<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> hashtable_type;
    typedef __hashtable_iterator<Value, Key, HashFcn, ExtractKey, EqualKey, Alloc> iterator;
    // The const iterator typedef is omitted here
    typedef __hashtable_node<Value> node;
    typedef std::forward_iterator_tag iterator_category;
    typedef Value value_type;
    typedef std::ptrdiff_t difference_type;
    typedef std::size_t size_type;
    typedef Value& reference;
    typedef Value* pointer;

    node* cur;           // the node the iterator currently points to
    hashtable_type* ht;  // link back to the container (needed to jump from bucket to bucket)

    __hashtable_iterator(node* n, hashtable_type* tab) : cur(n), ht(tab) {}
    __hashtable_iterator() {}
    reference operator*() const { return cur->val; }
    pointer operator->() const { return &(operator*()); }
    iterator& operator++();
    iterator operator++(int);
    bool operator==(const iterator& it) const { return cur == it.cur; }
    bool operator!=(const iterator& it) const { return cur != it.cur; }
};

template <class V, class K, class HF, class ExK, class EqK, class A>
__hashtable_iterator<V, K, HF, ExK, EqK, A>&
__hashtable_iterator<V, K, HF, ExK, EqK, A>::operator++() {
    const node* old = cur;
    cur = cur->next;  // if a next node exists, that is the answer; otherwise enter the if below
    if (!cur) {
        // Locate the bucket of the old element; the head of the next non-empty bucket is the target
        size_type bucket = ht->bkt_num(old->val);
        while (!cur && ++bucket < ht->buckets.size()) {
            cur = ht->buckets[bucket];
        }
    }
    return *this;
}

template <class V, class K, class HF, class ExK, class EqK, class A>
__hashtable_iterator<V, K, HF, ExK, EqK, A>
__hashtable_iterator<V, K, HF, ExK, EqK, A>::operator++(int) {
    iterator tmp = *this;
    ++*this;  // calls the prefix operator++
    return tmp;
}

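Questions

hashtable is not meant to be used directly: it is an internal implementation type of SGI STL and is not exposed to client code.
Clients can use <hash_set.h> and <hash_map.h> instead.

When the number of elements exceeds the size of the buckets vector, the table is rebuilt.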
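The relinking step of such a rebuild can be sketched as a free function. The `Node` struct, the name `rehash`, and the `val % new_size` hash are simplifications assumed for illustration; in the hashtable itself this work happens inside resize() and uses bkt_num(). No node is allocated or copied: existing nodes are moved to the head of their new bucket's list.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    Node* next;
    int val;
};

// Moves every node of `buckets` into a fresh table with new_size buckets.
// Assumes new_size > 0 and non-negative values.
void rehash(std::vector<Node*>& buckets, std::size_t new_size) {
    std::vector<Node*> tmp(new_size);  // every bucket starts out null
    for (std::size_t b = 0; b < buckets.size(); ++b) {
        while (Node* first = buckets[b]) {
            const std::size_t nb = static_cast<std::size_t>(first->val) % new_size;
            buckets[b] = first->next;  // unlink the node from the old list
            first->next = tmp[nb];     // push it onto the front of the new list
            tmp[nb] = first;
        }
    }
    buckets.swap(tmp);  // the old, now empty, vector is destroyed with tmp
}
```

Three nodes holding 1, 4, and 7 share bucket 1 in a table of size 3; after `rehash(buckets, 5)` they sit alone in buckets 1, 4, and 2.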

    // Find an element
    iterator find(const key_type& key) {
        size_type n = bkt_num_key(key);  // first determine which bucket the key falls into
        node* first;
        // Starting at the head of the bucket list, compare each element's key; stop on a match
        for (first = buckets[n];
             first && !equals(get_key(first->val), key);
             first = first->next) {
        }
        return iterator(first, this);
    }

    // Count the elements with a given key
    size_type count(const key_type& key) const {
        const size_type n = bkt_num_key(key);  // first determine which bucket the key falls into
        size_type result = 0;
        // Walk the bucket list from its head, comparing each element's key; add 1 for every match
        for (const node* cur = buckets[n]; cur; cur = cur->next) {
            if (equals(get_key(cur->val), key)) {
                ++result;
            }
        }
        return result;
    }

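hash functions

The hash functions are function objects (functors).
bkt_num() calls the hash function defined here to obtain a value on which the hashtable can perform the modulus operation.
For integer types such as char, int, and long, the hash function does nothing and returns the value itself; for string types such as const char*, a conversion function has to be designed.

<stl_hash_fun.h> provides hash functions only for the character, integer, and C-style string types. The SGI hashtable therefore cannot handle elements of any other type, such as string, double, or float; to use these types you must define a hash function yourself.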
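A user-defined hash function for such types might look like the sketch below. Everything here is an assumption made for illustration: the namespace `demo`, the choice to express the hashers as specializations of a primary `hash` template, and the byte-wise hashing of `double`. None of it is part of SGI STL.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

namespace demo {

// Primary template: empty, so using an unsupported key type fails to compile.
template <class Key>
struct hash {};

// Integer types need no conversion: the value itself is the hash.
template <>
struct hash<int> {
    std::size_t operator()(int x) const { return static_cast<std::size_t>(x); }
};

// std::string: the same multiply-by-5 scheme used for C-style strings.
template <>
struct hash<std::string> {
    std::size_t operator()(const std::string& s) const {
        unsigned long h = 0;
        for (std::string::size_type i = 0; i < s.size(); ++i) {
            h = 5 * h + static_cast<unsigned char>(s[i]);
        }
        return static_cast<std::size_t>(h);
    }
};

// double: hash the object representation byte by byte.
template <>
struct hash<double> {
    std::size_t operator()(double d) const {
        if (d == 0.0) return 0;  // +0.0 and -0.0 compare equal, so they must hash equally
        unsigned char bytes[sizeof(double)];
        std::memcpy(bytes, &d, sizeof(double));
        unsigned long h = 0;
        for (std::size_t i = 0; i < sizeof(double); ++i) {
            h = 5 * h + bytes[i];
        }
        return static_cast<std::size_t>(h);
    }
};

}  // namespace demo
```

The only hard requirement on such a functor is that keys which compare equal produce the same hash value; how well it spreads keys over the buckets affects speed, not correctness.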

#include <cstddef>
#include <iostream>

template <class Key>
struct hash {};

inline std::size_t __stl_hash_string(const char* s) {
    unsigned long h = 0;
    for (; *s; ++s) {
        h = 5 * h + *s;
    }
    return std::size_t(h);
}

// In <stl_config.h>, every __STL_TEMPLATE_NULL is defined as template<>

int main() {
    const char* input_string("Hello");
    std::cout << input_string << std::endl;
    std::cout << __stl_hash_string(input_string) << std::endl;
}
