redis源码：内部数据结构

server 对象

struct redisServer {
    ...

    int hz;                     /* serverCron() calls frequency in hertz */

    redisDb *db;
    dict *commands;             /* Command table */
    dict *orig_commands;        /* Command table before command renaming. */

    aeEventLoop *el;

    ...
};

client 对象

typedef struct client {
    uint64_t id;            /* Client incremental unique ID. */
    connection *conn;
    int resp;               /* RESP protocol version. Can be 2 or 3. */
    redisDb *db;            /* Pointer to currently SELECTed DB. */
    robj *name;             /* As set by CLIENT SETNAME. */
    sds querybuf;           /* Buffer we use to accumulate client queries. */

    ....

} client;

sds

长度为0的数组

char buf[];

Arrays of Length Zero，GNU/GCC的一个扩展，长度为0的数组本身不占空间，它只是一个偏移量。

__attribute__ ((__packed__))

Specifying Attributes of Types

attribute 可以给 struct, union添加一些属性，有aligned, packed, transparent_union, unused, deprecated , may_alias 这几种。

packed：This attribute, attached to struct or union type definition, specifies that each member of the structure or union is placed to minimize the memory required. When attached to an enum definition, it indicates that the smallest integral type should be used.

list

dict

c语言本身不支持dict/map类型，redis需要在底层实现一个dict类型。

dict是为了解决查找的问题：一类是基于各种平衡树，一类是基于哈希表。

dict是一个key-value的映射集合，每个key都是唯一的，这样每个（key，value）对也是唯一的。

hash算法将key转换成另外一个index，存放在数组中。通过key查找到对应的value的过程就是先计算出index，然后返回数组下标为index的元素。

计算index的过程，不同的key可能会产生相同的index，成为”冲突”，解决冲突的方式一般是链地址法，就是在index处维护一个链表，将相同index的（key,value)通过链表保存起来。

当链表过长的时候，查找过程也会变慢，需要rehash，减小链表长度。

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
    int (*expandAllowed)(size_t moreMem, double usedRatio);
} dictType;

/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
} dict;

/* If safe is set to 1 this is a safe iterator, that means, you can call
 * dictAdd, dictFind, and other functions against the dictionary even while
 * iterating. Otherwise it is a non safe iterator, and only dictNext()
 * should be called while iterating. */
typedef struct dictIterator {
    dict *d;
    long index;
    int table, safe;
    dictEntry *entry, *nextEntry;
    /* unsafe iterator fingerprint for misuse detection. */
    long long fingerprint;
} dictIterator;

dictht 就是 dict hash table的意思。

dictEntry 就是一个（key，value）健值对。

dictType 中的 hashFunction 就是将key转换成index的哈希算法。

redis 目前使用两种不同的哈希算法：

MurmurHash2 是种32 bit 算法：这种算法的分布率和速度都非常好；Murmur哈希算法最大的特点是碰撞率低，计算速度快。具体信息请参考 MurmurHash 的主页： http://code.google.com/p/smhasher/
基于 djb 算法实现的一个大小写无关散列算法：具体信息请参考 http://www.cse.yorku.ca/~oz/hash.html

当 dict->ht[0]快要装满的时候（具体参数可以配置），就需要rehash，申请一个更大的dict->ht[1]，然后将dict->ht[0] 复制到 dict->ht[1] 中。

redis是单线程，需要分多次rehash，否则会阻塞客户端请求。

rehash会产生的问题

rehash的过程，会使用两个哈希表，创建了一个更大空间的ht[1]，此时会造成内存陡增；
rehash的过程，可能涉及大量KV键值对dictEntry的搬运，耗时较长；

如果这个 rehash 过程必须将所有键值对迁移完毕之后才将结果返回给用户，这样的处理方式将不满足Redis高效响应的特性。

rehash会产生的问题，主要层面就是内存占用陡增、和处理耗时长的问题，基于这两点，还会带来其他影响。

首页

redis源码：内部数据结构

server 对象

client 对象

sds

长度为0的数组

list

dict

ziplist

quicklist

intset

目录