In the C++ STL, why does a vector double its capacity each time it runs out of memory, rather than tripling it or using some other factor?

Thanks for all the answers! Here is a summary: 1. Why exponential growth? See: C++ Made Easier: How Vectors Grow 2. As for whether a factor of 2…
folly/FBVector.md at master · facebook/folly · GitHub
It is well known that std::vector grows exponentially (at a constant factor) in order to avoid quadratic growth performance. The trick is choosing a good factor (any factor greater than 1 ensures O(1) amortized append complexity towards infinity). A factor that's too small causes frequent vector reallocation; one that's too large forces the vector to consume much more memory than needed. The initial HP implementation by Stepanov used a growth factor of 2, i.e. whenever you'd push_back into a vector without there being room, it would double the current capacity.

With time, other compilers reduced the growth factor to 1.5, but gcc has staunchly used a growth factor of 2. In fact it can be mathematically proven that a growth factor of 2 is rigorously the worst possible because it never allows the vector to reuse any of its previously-allocated memory. That makes the vector cache-unfriendly and memory manager unfriendly.
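
To make the quoted point about avoiding quadratic cost concrete, here is a minimal sketch that counts how many element copies reallocation causes under geometric growth versus growth by a fixed increment. The function names `copies_geometric` and `copies_linear` are hypothetical, and the model is a simplification: it assumes every reallocation copies all existing elements and ignores allocator behavior.

```cpp
#include <cassert>
#include <cstdint>

// Total element copies caused by reallocation while appending n elements,
// when capacity grows by the factor num/den (rounded up), starting from 1.
std::uint64_t copies_geometric(std::uint64_t n, std::uint64_t num, std::uint64_t den) {
    assert(den > 0 && num > den);  // the factor must be greater than 1
    std::uint64_t capacity = 1, size = 0, copies = 0;
    while (size < n) {
        if (size == capacity) {
            copies += size;  // every existing element moves to the new block
            capacity = (capacity * num + den - 1) / den;
        }
        ++size;
    }
    return copies;
}

// Same count when capacity grows by a fixed number of slots each time.
std::uint64_t copies_linear(std::uint64_t n, std::uint64_t step) {
    assert(step > 0);
    std::uint64_t capacity = step, size = 0, copies = 0;
    while (size < n) {
        if (size == capacity) {
            copies += size;
            capacity += step;
        }
        ++size;
    }
    return copies;
}
```

Under these assumptions, appending 10000 elements costs 16383 copies with a factor of 2 but 4995000 copies with a fixed increment of 10: the first total stays proportional to n, while the second grows quadratically.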

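The problem with a growth factor of k = 2 is that, starting from an initial capacity c, each new allocation is always just larger than the sum of all previous allocations: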

c \sum_{i=0}^n 2^i = c(2^{n+1} - 1) < c2^{n + 1}

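In other words, the memory released by earlier allocations can never be reused, which is unfriendly to the cache. It is better to choose a growth factor with 1 < k < 2. Folly, for example, uses 1.5, and RapidJSON follows suit with 1.5: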

GenericValue& PushBack(GenericValue& value, Allocator& allocator) {
    RAPIDJSON_ASSERT(IsArray());
    if (data_.a.size >= data_.a.capacity)
        Reserve(data_.a.capacity == 0 ? kDefaultArrayCapacity : (data_.a.capacity + (data_.a.capacity + 1) / 2), allocator);
    data_.a.elements[data_.a.size++].RawAssign(value);
    return *this;
}

Compare the resulting memory layouts:

k = 2, c = 4
0123
    01234567
            0123456789ABCDEF
                            0123456789ABCDEF0123456789ABCDEF
                                                            012345...

k = 1.5, c = 4
0123
    012345
          012345678
                   0123456789ABCD
                                 0123456789ABCDEF0123
0123456789ABCDEF0123456789ABCD
                              0123456789ABCDEF0123456789ABCDEF...

As you can see, with k = 1.5 the vector can reuse previously allocated memory after a few expansions.
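
The diagrams above can be checked with a small simulation. This is only a sketch under the same idealized assumptions as the diagrams: blocks are laid out one after another, all freed blocks form one contiguous region, and the current block must stay alive while its contents are copied. The function name `first_reuse_step` is hypothetical; the factor is passed as a fraction `num/den` and rounded up, which for 3/2 matches the `capacity + (capacity + 1) / 2` formula in the RapidJSON code.

```cpp
#include <cassert>
#include <cstdint>

// Returns the 1-based index of the first expansion whose new block fits into
// the space freed by all earlier blocks, or 0 if that never happens within
// max_steps expansions.
std::uint64_t first_reuse_step(std::uint64_t initial, std::uint64_t num,
                               std::uint64_t den, std::uint64_t max_steps) {
    assert(initial > 0 && den > 0 && num > den);
    std::uint64_t freed = 0;          // total size of blocks already released
    std::uint64_t current = initial;  // block currently holding the elements
    for (std::uint64_t step = 1; step <= max_steps; ++step) {
        std::uint64_t next = (current * num + den - 1) / den;  // rounded up
        if (next <= freed) {
            return step;  // the new block fits in previously freed space
        }
        freed += current;  // the old block is released after the copy
        current = next;
    }
    return 0;
}
```

With an initial capacity of 4, `first_reuse_step(4, 2, 1, 40)` returns 0 (no reuse within 40 expansions), while `first_reuse_step(4, 3, 2, 40)` returns 5, the same fifth expansion at which the k = 1.5 diagram wraps back to the start of memory. A real allocator may behave differently, so treat this as an illustration of the arithmetic only.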

In fact, the C++ standard does not specify which growth factor vector::push_back() must use. That is left to the standard library implementers.
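
Since the factor is implementation-defined, one way to see what a given standard library does is to record each capacity change while appending. The helper below is a minimal sketch; `observed_capacities` is a hypothetical name, and the values it returns depend on the compiler and library in use.

```cpp
#include <cstddef>
#include <vector>

// Appends n elements to an empty std::vector<int> and returns every distinct
// capacity value observed along the way, in order.
std::vector<std::size_t> observed_capacities(std::size_t n) {
    std::vector<int> v;
    std::vector<std::size_t> caps;
    std::size_t last = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(0);
        if (v.capacity() != last) {  // a reallocation just happened
            last = v.capacity();
            caps.push_back(last);
        }
    }
    return caps;
}
```

Dividing each entry by the one before it gives the effective growth factor. According to the Folly document quoted above, this should come out close to 2 with gcc and close to 1.5 with some other implementations.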