More Effective C++, Items 27/28: Study Notes 2

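This post summarizes Items 27 and 28 of More Effective C++. Item 27 covers techniques for controlling whether objects are created on the heap; Item 28 covers the finer points of smart pointers.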

Item 27. Requiring or prohibiting heap-based objects

A typical use case: you are working on an embedded system, where memory leaks are especially troublesome and heap space is at a premium.

Requiring objects to live on the heap (created only with new)

Idea: this is easy to do. Non-heap objects are automatically constructed at their point of definition and automatically destructed at the end of their lifetime, so it suffices to make these implicit constructions and destructions illegal.

  1. Approach 1: declare both the constructors and the destructor private

Overkill, so skip it.

  1. Approach 2: declare only the destructor private and keep the constructors public; clients destroy objects through a pseudo-destructor

class UPNumber
{
public:
    UPNumber();
    UPNumber(int initValue);
    UPNumber(double initValue);
    UPNumber(const UPNumber &rhs);
    // pseudo-destructor (a const member function, because
    // even const objects may be destroyed)
    void destroy() const { delete this; }
    ...
private:
    ~UPNumber();
};

UPNumber n;                 // error! (legal here, but illegal when n's dtor is later implicitly invoked)
UPNumber *p = new UPNumber; // fine
...
delete p;                   // error! attempt to call private destructor
p->destroy();               // fine
  1. Approach 3: declare the constructors private (objects are then created through a pseudo-constructor, i.e. a static factory function) and keep the destructor public

Drawback:

  • The author must remember to declare every constructor private.
    That includes the copy constructor, and possibly a default constructor too, if these functions would otherwise be generated by compilers; compiler-generated functions are always public.

It's easier to declare only the destructor private, because a class can have only one of those.
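Approach 3 has no code in these notes, so here is a minimal sketch of it. It is an illustration, not code from the book: the class name HeapOnlyNumber, the factory name create, and the use of std::unique_ptr and "= delete" (both C++11) are assumptions made for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical heap-only class: every constructor is private or deleted,
// so the only way to obtain an object is the static factory below.
class HeapOnlyNumber
{
public:
    // Pseudo-constructor: always allocates with new.
    static std::unique_ptr<HeapOnlyNumber> create(int initValue)
    {
        return std::unique_ptr<HeapOnlyNumber>(new HeapOnlyNumber(initValue));
    }

    // Copying must be blocked too; a compiler-generated copy
    // constructor would be public and would allow stack copies.
    HeapOnlyNumber(const HeapOnlyNumber &) = delete;
    HeapOnlyNumber &operator=(const HeapOnlyNumber &) = delete;

    int value() const { return value_; }

private:
    explicit HeapOnlyNumber(int initValue) : value_(initValue) {}

    int value_;
};

// HeapOnlyNumber n(1);   // would not compile: the constructor is private

void factoryDemo()
{
    std::unique_ptr<HeapOnlyNumber> p = HeapOnlyNumber::create(42);
    assert(p->value() == 42);
}   // p goes out of scope here and deletes the heap object
```

The drawback named above is visible in the sketch: each constructor, including the copy constructor, needs its own treatment, whereas the private-destructor approach needs a single declaration.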

  1. Approach 4: make the destructor protected (keeping the constructors public) to allow inheritance, and hold a pointer to the object to allow containment

Approaches 1-3 prevent both inheritance and containment, hence Approach 4.

class UPNumber // declares dtor protected
{
    ...
};

class NonNegativeUPNumber : public UPNumber
{
    ...
};
// now okay; derived classes have access to protected members

class Asset
{
public:
    Asset(int initValue);
    ~Asset();
    ...
private:
    UPNumber *value;
};

Asset::Asset(int initValue)
    : value(new UPNumber(initValue)) { ... } // fine

Asset::~Asset()
{
    value->destroy(); // also fine
}

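Determining whether an object is on the heap

Conclusion: there is no good, reliable way to tell.

The analysis follows.

Making UPNumber heap-only does not make classes derived from it heap-only, so the UPNumber part of a derived object can still end up off the heap:

NonNegativeUPNumber n; // fine

The key question is how an object can detect whether it is on the heap. Several methods are proposed below; each has problems, and none achieves reliable detection.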

  1. Overload operator new and set a static flag.

class UPNumber
{
public:
    // exception thrown if a non-heap object is created
    class HeapConstraintViolation
    {
    };
    static void *operator new(size_t size);
    UPNumber();
    ...
private:
    static bool onTheHeap; // static flag: is the object being constructed on the heap?
    ...
};

// mandatory definition of the static member
bool UPNumber::onTheHeap = false;

void *UPNumber::operator new(size_t size)
{
    onTheHeap = true;
    return ::operator new(size);
}

UPNumber::UPNumber()
{
    if (!onTheHeap)
    {
        throw HeapConstraintViolation();
    }
    // proceed with normal construction here
    onTheHeap = false; // reset the flag for the next object
}

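Problems with this approach

    1. It cannot handle arrays, even if operator new[] is overloaded as well. The scheme assumes that every allocation is followed by exactly one constructor call, which then resets the static flag. For an array, memory is allocated once (one call to operator new[]) but the constructor runs once per element, so constructing the second element throws.

UPNumber *numberArray = new UPNumber[100];

    1. Setting the flag can fail even without arrays.
      The code below constructs one heap object, uses it to initialize a second heap object, and points pn at the second one.

UPNumber *pn = new UPNumber(*new UPNumber); // ignore the resource leak for now

The order of operations we expect is:
1. Call operator new for the first object.
2. Call the constructor for the first object.
3. Call operator new for the second object.
4. Call the constructor for the second object.

But compilers are not required to generate this order, and may produce the following instead, which makes the check fail:

1. Call operator new for the first object.
2. Call operator new for the second object.
3. Call the constructor for the first object.
4. Call the constructor for the second object.

Problem: the flag set in steps 1 and 2 is cleared in step 3, so the object constructed in step 4 thinks it is not on the heap, even though it is.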
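The array problem described above can be reproduced with a small experiment. The following is a sketch under assumptions: the class name Flagged, the counter constructed, and the helper arrayAllocationFails are invented for the illustration; only the flag scheme itself comes from the notes.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class that applies the static-flag scheme to operator new[].
class Flagged
{
public:
    class HeapConstraintViolation {};

    static void *operator new[](std::size_t size)
    {
        onTheHeap = true;                 // set once per allocation
        return ::operator new[](size);
    }
    static void operator delete[](void *ptr) { ::operator delete[](ptr); }

    Flagged()
    {
        if (!onTheHeap)
            throw HeapConstraintViolation();
        ++constructed;
        onTheHeap = false;                // reset for the next object
    }

    static int constructed;               // constructors that completed

private:
    static bool onTheHeap;
};

bool Flagged::onTheHeap = false;
int Flagged::constructed = 0;

// Returns true if new Flagged[n] threw HeapConstraintViolation.
bool arrayAllocationFails(std::size_t n)
{
    Flagged::constructed = 0;
    try
    {
        Flagged *arr = new Flagged[n];
        delete[] arr;
        return false;
    }
    catch (const Flagged::HeapConstraintViolation &)
    {
        return true;                      // memory was released by the runtime
    }
}
```

With one element the allocation succeeds; with more than one, the first constructor resets the flag and the second one throws, even though every element lives on the heap.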

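  1. Exploit the memory layout of stack and heap: compare addresses to decide whether a variable is on the heap (unportable).

As the figure below shows, the stack grows from high addresses toward low addresses, and the heap grows from low toward high. onTheStack is a local variable, so it sits at the current top of the stack. If the argument address refers to a stack object, it must be greater than the address of onTheStack; if it is smaller, the object is assumed to be on the heap.

memory_structure_1.PNG

bool onHeap(const void *address)
{
    char onTheStack; // local stack variable
    return address < &onTheStack;
}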
  • Problem with this approach: it ignores static objects (including those at global and namespace scope). Where static objects live depends on the system: they may sit below the heap or above it. Ignoring this breaks the test above, which fails to distinguish between heap objects and static objects:

void allocateSomeObjects()
{
    char *pc = new char; // heap object: onHeap(pc) will return true
    char c;              // stack object: onHeap(&c) will return false
    static char sc;      // static object: onHeap(&sc) will return true
    ...
}

The most important problem, of course, is that the technique is not portable.

memory_structure_2.PNG

  1. Use system calls

If you absolutely, positively have to tell whether an address is on the heap, you’re going to have to turn to unportable, implementation-dependent system calls, and that’s that.

Determining whether it is safe to use delete

Often the real reason for asking whether an object is on the heap is to find out whether it is safe to delete a pointer to it.

  • First, this is not the same question, and it has no simple direct answer

In the example below, *pa (including its member value) is on the heap, yet it is not safe to delete a pointer to pa->value, because that pointer was never returned by new.

class Asset
{
private:
    UPNumber value;
    ...
};
Asset *pa = new Asset;

  • Idea: replace operator new and operator delete so that they record every address handed out by new

// Pseudocode: the lines written in plain English stand for code that is not shown.
void *operator new(size_t size)
{
    void *p = getMemory(size); // allocate the memory (getMemory is not shown)
    add p to the collection of allocated addresses;
    return p;
}

void operator delete(void *ptr)
{
    releaseMemory(ptr); // return the memory to the free store
    remove ptr from the collection of allocated addresses;
}

// tells whether a pointer was returned by operator new
bool isSafeToDelete(const void *address)
{
    return whether address is in collection of allocated addresses;
}

This approach has three problems:

  1. It replaces the global ::operator new and ::operator delete.
  2. It adds overhead: heap memory that nobody needs to track is recorded as well.
  3. A general-purpose isSafeToDelete is hard to implement, because with multiple or virtual base classes an object can have more than one address (there's no guarantee that the address passed to isSafeToDelete is the same as the one returned from operator new).
  • Improvement: an abstract mixin base class

Move the bookkeeping into an abstract base class, and let every class that needs heap tracking inherit from it.

#include <algorithm> // std::find
#include <cstddef>   // std::size_t
#include <list>

class HeapTracked
{
public:
    class MissingAddress // exception class
    {
    };
    virtual ~HeapTracked() = 0;
    static void *operator new(std::size_t size);
    static void operator delete(void *ptr);
    bool isOnHeap() const;

private:
    typedef const void *RawAddress;
    static std::list<RawAddress> addresses; // keeps track of all pointers returned from operator new
};

// mandatory definition of the static class member (initially empty)
std::list<HeapTracked::RawAddress> HeapTracked::addresses;

// HeapTracked's destructor is pure virtual to make the
// class abstract. The destructor must still be
// defined, however, so we provide this empty definition.
HeapTracked::~HeapTracked() {}

void *HeapTracked::operator new(std::size_t size)
{
    void *memPtr = ::operator new(size);
    addresses.push_front(memPtr); // put its address at the front of the list
    return memPtr;
}

void HeapTracked::operator delete(void *ptr)
{
    // gracefully handle null pointers
    if (ptr == 0)
        return;
    std::list<RawAddress>::iterator it =
        std::find(addresses.begin(), addresses.end(), ptr);
    if (it != addresses.end())
    {
        addresses.erase(it);
        ::operator delete(ptr);
    }
    else
    {
        // Note: since C++11 a deallocation function is noexcept unless declared
        // otherwise, so in modern C++ this throw ends in std::terminate.
        throw MissingAddress();
    }
}

bool HeapTracked::isOnHeap() const
{
    // get a pointer to the beginning of the memory occupied by *this
    const void *rawAddress = dynamic_cast<const void *>(this);
    // look up the pointer in the list of addresses returned by operator new
    std::list<RawAddress>::iterator it =
        std::find(addresses.begin(), addresses.end(), rawAddress);
    return it != addresses.end();
}

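Characteristics:

  • The addresses are managed with the STL container std::list.
  • It builds on the global ::operator new and ::operator delete without changing their global semantics.
  • dynamic_cast<const void*>(this) yields a pointer to the beginning of the memory occupied by the complete object, which is the address ::operator new returned. This solves the multiple-address problem under multiple or virtual inheritance, and it is portable. It only works for types with at least one virtual function, which HeapTracked guarantees through its virtual destructor.

Usage example:

class Asset : public HeapTracked
{
private:
    UPNumber value;
    ...
};

void inventoryAsset(const Asset *ap)
{
    if (ap->isOnHeap())
    {
        // ap is a heap-based asset: inventory it as such
    }
    else
    {
        // ap is a non-heap-based asset: record it that way
    }
}

  • Limitation: built-in types are not supported, because types like int and char cannot inherit from anything. In practice you'll never want to call isOnHeap on a built-in type anyway, because such types have no this pointer.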
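To see the mixin idea work end to end, here is a compact, self-contained variant. It is a sketch, not the book's class: the names Tracked and TrackedAsset are invented, the address collection is a std::set held in a function-local static instead of a std::list member, and the base class is left non-abstract to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <set>

// Simplified stand-in for the HeapTracked mixin (names are hypothetical).
class Tracked
{
public:
    virtual ~Tracked() {}

    static void *operator new(std::size_t size)
    {
        void *mem = ::operator new(size);
        addresses().insert(mem);          // remember what new handed out
        return mem;
    }
    static void operator delete(void *ptr)
    {
        if (ptr == 0)
            return;
        addresses().erase(ptr);
        ::operator delete(ptr);
    }

    bool isOnHeap() const
    {
        // Address of the complete object, i.e. what operator new returned.
        const void *raw = dynamic_cast<const void *>(this);
        return addresses().count(raw) != 0;
    }

private:
    static std::set<const void *> &addresses()
    {
        static std::set<const void *> instance;
        return instance;
    }
};

class TrackedAsset : public Tracked {};

bool heapObjectReportsHeap()
{
    TrackedAsset *pa = new TrackedAsset;  // uses Tracked::operator new
    bool result = pa->isOnHeap();
    delete pa;                            // uses Tracked::operator delete
    return result;
}

bool stackObjectReportsHeap()
{
    TrackedAsset a;                       // never passes through operator new
    return a.isOnHeap();
}
```

A std::set makes the lookup logarithmic rather than linear; whether that matters depends on how many tracked objects are alive at once.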

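Prohibiting heap-based objects

An object can come into existence in three ways; each is examined in turn.

  1. The object is instantiated directly

Simply declare operator new and operator delete private. There is no point agonizing over whether to make only one of them private; there is no reason not to make both private. To also forbid heap-allocated arrays, make operator new[] and operator delete[] private as well.

class UPNumber
{
private:
    static void *operator new(size_t size);
    static void operator delete(void *ptr);
    ...
};

  1. The object is instantiated as the base class part of a derived class object

Declaring operator new and operator delete private in the base class also keeps derived classes off the heap, but only on one condition: the derived class does not declare an operator new of its own.

  1. The object is embedded in an object of another class

Below, Asset contains a UPNumber member. new Asset uses the global ::operator new, or an operator new declared by Asset, to obtain memory for the whole object including the UPNumber member, so the heap prohibition is bypassed.

class Asset
{
public:
    Asset(int initValue);
    ...
private:
    UPNumber value;
};
Asset *pa = new Asset(100); // fine, calls Asset::operator new or ::operator new, not UPNumber::operator new

Cases 2 and 3 would be solved if UPNumber could determine whether it is off the heap, but that is the same problem as determining whether it is on the heap, and, as discussed above, there is no good answer to it.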
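The three cases can be checked at compile time. The sketch below is an illustration under assumptions: the trait CanHeapAllocate and the class names are invented here, and "= delete" (C++11) is used in place of the private declarations from the notes; the effect on new-expressions is the same.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>

// Hypothetical trait: true if the expression "new T" is well-formed.
template <class T, class = void>
struct CanHeapAllocate : std::false_type {};

template <class T>
struct CanHeapAllocate<T, decltype(static_cast<void>(new T))> : std::true_type {};

// Case 1: direct instantiation on the heap is blocked.
class StackOnly
{
public:
    static void *operator new(std::size_t) = delete;
    static void operator delete(void *) = delete;
};

// Case 2: a derived class inherits the restriction...
class DerivedPlain : public StackOnly {};

// ...unless it declares its own operator new and operator delete.
class DerivedWithOwnNew : public StackOnly
{
public:
    static void *operator new(std::size_t size) { return ::operator new(size); }
    static void operator delete(void *ptr) { ::operator delete(ptr); }
};

// Case 3: containment is not restricted at all.
class Holder
{
    StackOnly value;
};

static_assert(!CanHeapAllocate<StackOnly>::value, "blocked");
static_assert(!CanHeapAllocate<DerivedPlain>::value, "still blocked");
static_assert(CanHeapAllocate<DerivedWithOwnNew>::value, "restriction bypassed");
static_assert(CanHeapAllocate<Holder>::value, "restriction bypassed");

// Objects with automatic storage are unaffected.
void stackUseStillWorks()
{
    StackOnly s;
    Holder h;
    (void)s;
    (void)h;
}
```

The last two static_asserts are the point of the exercise: the restriction on StackOnly does not propagate to a derived class with its own operator new, nor to a class that merely contains a StackOnly member.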

Item 28. Smart Pointer

Definition: Smart pointers are objects that are designed to look, act, and feel like built-in pointers, but to offer greater functionality.

Uses: resource management, the automation of repetitive coding tasks.

Replacing built-in pointers with smart pointers gives you control over the following aspects of pointer behavior:

  • Construction and destruction: used for resource management, for example reference counting.
  • Copying and assignment: for example choosing between deep and shallow copy, or transferring ownership.
  • Dereferencing: for example implementing lazy fetching.

A typical template for smart pointers:

template <class T>
class SmartPtr
{
public:
    SmartPtr(T *realPtr = 0);
    SmartPtr(const SmartPtr &rhs);
    ~SmartPtr();
    SmartPtr &operator=(const SmartPtr &rhs);
    T *operator->() const;
    T &operator*() const;

private:
    T *pointee;
};

In summary: using a smart pointer isn't much different from using the dumb pointer it replaces. That's testimony to the effectiveness of encapsulation. Clients of smart pointers are supposed to be able to treat them as dumb pointers.
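To make the template above concrete, here is one possible way to fill it in. This is a sketch under an assumption the notes do not make: the pointer is the sole owner of its pointee, so copying is disabled. The Point struct and the function sumThroughSmartPtr are invented for the example.

```cpp
#include <cassert>

template <class T>
class SmartPtr
{
public:
    SmartPtr(T *realPtr = 0) : pointee(realPtr) {}
    ~SmartPtr() { delete pointee; }       // the "smart" part: automatic cleanup

    // Assumption: sole ownership, so copying is disabled (C++11 syntax).
    SmartPtr(const SmartPtr &) = delete;
    SmartPtr &operator=(const SmartPtr &) = delete;

    T *operator->() const { return pointee; }
    T &operator*() const { return *pointee; }

private:
    T *pointee;
};

struct Point
{
    int x;
    int y;
};

int sumThroughSmartPtr()
{
    SmartPtr<Point> p(new Point{3, 4});
    (*p).x += 1;            // operator* yields a reference to the pointee
    return p->x + p->y;     // operator-> forwards to the dumb pointer
}                           // p's destructor deletes the Point here
```

The call sites read exactly as they would with a Point*, which is the encapsulation argument made above.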

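Construction, assignment, and destruction of smart pointers

Passing auto_ptrs by reference-to-const avoids the hazards arising from pass-by-value, where the copy takes ownership away from the caller's object. (std::auto_ptr was deprecated in C++11 and removed in C++17; its replacement, std::unique_ptr, cannot be copied at all, only moved.)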
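The difference between the two ways of passing can be shown with std::unique_ptr. This is a small sketch; the function names peek, consume, and ownershipDemo are invented for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// By reference-to-const: read-only access, the caller keeps ownership.
int peek(const std::unique_ptr<int> &p) { return *p; }

// By value: the parameter takes ownership and destroys the int on return.
int consume(std::unique_ptr<int> p) { return *p; }

void ownershipDemo()
{
    std::unique_ptr<int> p(new int(7));

    assert(peek(p) == 7);
    assert(p != nullptr);                 // p still owns the int

    // consume(p);                        // would not compile: unique_ptr is not copyable
    assert(consume(std::move(p)) == 7);   // the transfer has to be spelled out
    assert(p == nullptr);                 // ownership has moved away
}
```

With auto_ptr the commented-out call compiled and silently emptied p; with unique_ptr the same mistake is a compile error.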

Dereferencing

There are two dereferencing operators: operator* and operator->.

For operator*:

template <class T>
T &SmartPtr<T>::operator*() const
{
    // perform "smart pointer" processing here
    return *pointee;
}

The return type is a reference rather than a value for two reasons: 1. performance; 2. it avoids slicing.

For operator->:

void editTuple(DBPtr<Tuple> &pt)
{
    LogEntry<Tuple> entry(*pt);
    do
    {
        pt->displayEditDialog();
        // equivalent to:
        // (pt.operator->())->displayEditDialog();
    } while (pt->isValid() == false);
}

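operator-> must return either a dumb pointer or another smart pointer object, because the compiler applies -> again to whatever it returns (see the comment in the code above).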
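The case where operator-> returns another smart pointer can be sketched as follows. All names here (TuplePtr, CountingPtr, chainedArrowDemo) are invented for the illustration, and the Tuple struct is a stand-in with a single member function.

```cpp
#include <cassert>

struct Tuple
{
    bool valid;
    bool isValid() const { return valid; }
};

// Inner smart pointer: operator-> returns a dumb pointer.
class TuplePtr
{
public:
    explicit TuplePtr(Tuple *t) : pointee(t) {}
    Tuple *operator->() const { return pointee; }

private:
    Tuple *pointee;
};

// Outer wrapper: operator-> returns another smart pointer and counts accesses.
class CountingPtr
{
public:
    explicit CountingPtr(const TuplePtr &p) : inner(p), accesses(0) {}
    TuplePtr operator->()
    {
        ++accesses;
        return inner;
    }
    int accessCount() const { return accesses; }

private:
    TuplePtr inner;
    int accesses;
};

bool chainedArrowDemo()
{
    Tuple t = {true};
    TuplePtr inner(&t);
    CountingPtr p(inner);
    // The compiler keeps applying -> until it reaches a dumb pointer:
    // p.operator->().operator->()->isValid()
    bool ok = p->isValid();
    return ok && p.accessCount() == 1;
}
```

Returning anything other than a pointer or an object with its own operator-> would leave the compiler with nothing to apply the member access to.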

Testing smart pointers for nullness

With the template shown so far, none of the usual null tests compiles:

SmartPtr<TreeNode> ptn;
...
if (ptn == 0) ... // error!
if (ptn) ...      // error!
if (!ptn) ...     // error!

  • Solution 1: provide an implicit conversion operator to void*:

template <class T>
class SmartPtr
{
public:
    ...
    operator void *(); // returns 0 if the smart pointer is null, nonzero otherwise
    ...
};

SmartPtr<TreeNode> ptn;
...
if (ptn == 0) ... // now fine
if (ptn) ...      // also fine
if (!ptn) ...     // fine

Problem: the implicit conversion kicks in where nobody expects it and produces surprising behavior, such as allowing mixed-type comparisons:

SmartPtr<Apple> pa;
SmartPtr<Orange> po;
...
if (pa == po) ... // this compiles!

Both smart pointers can be implicitly converted into void* pointers, and there is a built-in comparison function for built-in pointers.

  • Solution 2: overload operator!

template <class T>
class SmartPtr
{
public:
    ...
    bool operator!() const; // returns true if and only if the smart pointer is null
    ...
};

SmartPtr<TreeNode> ptn;
...
if (!ptn)
{ // fine
    ...
}
else
{
    ...
}

Problem: this does not reproduce every null test that built-in pointers allow, and the following cases remain unsolved:

if (ptn == 0) ... // still an error
if (ptn) ...      // also an error
// workaround: write if (!!ptn)

SmartPtr<Apple> pa;
SmartPtr<Orange> po;
...
if (!pa == !po) ... // alas, this compiles
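Since C++11 there is a third option that the book, written earlier, could not use: an explicit conversion function to bool. The sketch below is an illustration, not part of the original notes; the non-owning SmartPtr and the function classify are assumptions made to keep it short.

```cpp
#include <cassert>
#include <type_traits>

template <class T>
class SmartPtr
{
public:
    SmartPtr(T *realPtr = 0) : pointee(realPtr) {}

    // Applied only where the language asks for a bool "contextually":
    // if, while, !, && and ||. It is not used for == or arithmetic.
    explicit operator bool() const { return pointee != 0; }

private:
    T *pointee;   // non-owning in this sketch
};

struct Apple {};
struct Orange {};

// An explicit conversion function does not provide an implicit conversion.
static_assert(!std::is_convertible<SmartPtr<Apple>, bool>::value,
              "no implicit conversion to bool");

int classify(const SmartPtr<Apple> &pa)
{
    if (!pa)
        return 0;   // null
    if (pa)
        return 1;   // non-null
    return -1;      // unreachable
}

// SmartPtr<Apple> pa; SmartPtr<Orange> po;
// if (pa == po) ...   // does not compile: no operator== and no implicit conversion
```

Both if (ptn) and if (!ptn) work, while the mixed-type comparison is rejected. A test of the form ptn == 0 still needs a separately written operator==.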

Converting smart pointers to dumb pointers

Sometimes a smart pointer has to be passed to code that expects a built-in pointer, for example:

class Tuple { ... };
void normalize(Tuple *pt); // takes a built-in pointer
DBPtr<Tuple> pt;
...
normalize(pt); // error! there is no conversion from DBPtr<Tuple> to Tuple*

  • Solution 1:

normalize(&*pt); // ugly, but it works

  • Solution 2: add an implicit conversion operator to a dumb pointer-to-T:

template <class T> // as before
class DBPtr
{
public:
    ...
    operator T *() { return pointee; }
    ...
};
DBPtr<Tuple> pt;
...
normalize(pt); // this now works

// the null tests now work as well
if (pt == 0) ...
if (pt) ...
if (!pt) ...
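  1. Problem 1: clients can extract the dumb pointer and modify the object behind the smart pointer's back.

void processTuple(DBPtr<Tuple> &pt)
{
    Tuple *rawTuplePtr = pt; // converts DBPtr<Tuple> to Tuple*, bypassing the smart pointer
    // use rawTuplePtr to modify the tuple
}

  1. Problem 2: conversions cannot be chained, because compilers may apply at most one user-defined conversion per implicit conversion.

The call with smart pointers below would need two user-defined conversions: from DBPtr<Tuple> to Tuple*, and from Tuple* to TupleAccessors. The conversion from a smart pointer to a dumb pointer is a user-defined conversion, and compilers are forbidden from applying more than one such conversion at a time.

class TupleAccessors : public Tuple
{
public:
    TupleAccessors(const Tuple *pt);
    ...
};
TupleAccessors merge(const TupleAccessors &ta1,
                     const TupleAccessors &ta2);

Tuple *pt1, *pt2;
...
merge(pt1, pt2); // fine with built-in pointers: both are implicitly converted to TupleAccessors

DBPtr<Tuple> pt1, pt2;
...
merge(pt1, pt2); // error with smart pointers: no way to convert pt1 and pt2 to TupleAccessors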
  1. Problem 3: delete looks for an implicit conversion from the smart pointer object to a pointer, so the pointee can end up being destroyed more than once.

DBPtr<Tuple> pt = new Tuple;
...
delete pt; // compiles because of the implicit conversion to Tuple*, but will almost certainly break your program

If pt owns the object it points to, that object is now deleted twice, once at the point where delete is called, a second time when pt's destructor is invoked.

If, however, the owner of the object pointed to by pt is not the person who deleted pt, we can expect the rightful owner to delete that object again later.

Conclusion: the bottom line is simple: don't provide implicit conversion operators to dumb pointers unless there is a compelling reason to do so.
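A common alternative to the implicit conversion is a named accessor, in the style of std::unique_ptr::get. The sketch below is an assumption-laden illustration: this DBPtr is a toy owning pointer, not the book's database pointer, and Tuple and normalize are reduced to the bare minimum needed to run.

```cpp
#include <cassert>

struct Tuple
{
    int normalized;
};

// Stand-in for a function that only accepts a dumb pointer.
void normalize(Tuple *pt) { pt->normalized = 1; }

template <class T>
class DBPtr
{
public:
    explicit DBPtr(T *realPtr = 0) : pointee(realPtr) {}
    ~DBPtr() { delete pointee; }
    DBPtr(const DBPtr &) = delete;
    DBPtr &operator=(const DBPtr &) = delete;

    T *operator->() const { return pointee; }
    T &operator*() const { return *pointee; }

    // Named accessor instead of operator T*(): the escape hatch is
    // visible at every call site and easy to search for.
    T *get() const { return pointee; }

private:
    T *pointee;
};

int normalizeThroughGet()
{
    DBPtr<Tuple> pt(new Tuple());   // value-initialized: normalized == 0
    normalize(pt.get());            // explicit conversion
    // normalize(pt);               // does not compile: no implicit conversion
    // delete pt;                   // does not compile either
    return pt->normalized;
}
```

Problem 1 remains possible, since get still hands out the raw pointer, but it can no longer happen by accident, and the accidental delete of Problem 3 is rejected by the compiler.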

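Smart pointers and inheritance-based type conversions

Consider the following inheritance hierarchy:

Item28_1.PNG

As far as compilers are concerned, SmartPtr<Cassette> and SmartPtr<MusicProduct> are completely unrelated classes, so a SmartPtr<Cassette> cannot be used where a SmartPtr<MusicProduct> is expected, and the polymorphism that dumb pointers offer is lost.

  • The obvious fix is to add implicit type conversion operators:

template <>
class SmartPtr<Cassette>
{
public:
    operator SmartPtr<MusicProduct>()
    {
        return SmartPtr<MusicProduct>(pointee);
    }
    ...
private:
    Cassette *pointee;
};

template <>
class SmartPtr<CD>
{
public:
    // implicit conversion
    operator SmartPtr<MusicProduct>()
    {
        return SmartPtr<MusicProduct>(pointee);
    }
    ...
private:
    CD *pointee;
};

Problems:

  1. Every smart pointer class has to be given its conversion operators by hand, which defeats the purpose of templates.
  2. You must provide a conversion operator for each base class from which that object directly or indirectly inherits.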
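  • A better solution: use a nonvirtual member function template.

template <class T>
class SmartPtr
{
public:
    SmartPtr(T *realPtr = 0);
    T *operator->() const;
    T &operator*() const;
    template <class newType>     // member template for
    operator SmartPtr<newType>() // implicit conversion operators
    {
        return SmartPtr<newType>(pointee);
    }
    ...
};

As the book describes it, when a SmartPtr<Cassette> is supplied where a SmartPtr<MusicProduct> is needed, compilers look for a conversion in this order:

a single-argument constructor of the target class that accepts the source type -> an implicit type conversion operator in the source class -> a member function template that can be instantiated to perform the conversion

Precondition for a conversion: if you've got a dumb pointer type T1* and another dumb pointer type T2*, you can implicitly convert a smart pointer-to-T1 to a smart pointer-to-T2 if and only if you can implicitly convert a T1* to a T2*.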
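The member template technique above can be tried out with a small hierarchy. This is a sketch under assumptions: the kind member function, the free function kindOf, and the non-owning SmartPtr are invented so that the conversion can be observed; only the conversion operator itself follows the notes.

```cpp
#include <cassert>

class MusicProduct
{
public:
    virtual ~MusicProduct() {}
    virtual int kind() const { return 0; }
};

class Cassette : public MusicProduct
{
public:
    int kind() const override { return 1; }
};

class CD : public MusicProduct
{
public:
    int kind() const override { return 2; }
};

template <class T>
class SmartPtr
{
public:
    SmartPtr(T *realPtr = 0) : pointee(realPtr) {}
    T *operator->() const { return pointee; }
    T &operator*() const { return *pointee; }

    // Member template: the body compiles for a given newType only if
    // T* converts implicitly to newType*.
    template <class newType>
    operator SmartPtr<newType>() const
    {
        return SmartPtr<newType>(pointee);
    }

private:
    T *pointee;   // non-owning in this sketch
};

int kindOf(const SmartPtr<MusicProduct> &p) { return p->kind(); }

int conversionDemo()
{
    Cassette tape;
    CD disc;
    SmartPtr<Cassette> pc(&tape);
    SmartPtr<CD> pd(&disc);
    // Each argument is converted to SmartPtr<MusicProduct> by the template.
    return kindOf(pc) * 10 + kindOf(pd);
}
```

Virtual dispatch still works after the conversion, because the converted smart pointer holds a MusicProduct* to the same object. Note that the operator is unconstrained: an invalid conversion, say from SmartPtr<Cassette> to SmartPtr<CD>, is reported as an error inside the operator body rather than at the call site.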

Problems:

  1. Nothing tells the compiler which base class level to convert to, so calls can be ambiguous.

All calls to conversion functions are equally good.

Add a new class CasSingle that derives from Cassette.

template <class T>
class SmartPtr
{
    ...
};
void displayAndPlay(const SmartPtr<MusicProduct> &pmp, int howMany);
void displayAndPlay(const SmartPtr<Cassette> &pc, int howMany);
SmartPtr<CasSingle> dumbMusic(new CasSingle("Achy Breaky Heart"));
displayAndPlay(dumbMusic, 1); // error! ambiguous: convert to SmartPtr<MusicProduct> or to SmartPtr<Cassette>?

  1. The technique is relatively complex. When the book was written, compilers might not support member templates, and maintainers may not know how to maintain such code.
  • Conclusion

The question was how we can make smart pointer classes behave just like dumb pointers for purposes of inheritance-based type conversions. The answer is simple: we can't.

Smart pointers are smart, but they're not pointers.

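Smart pointers and const

Just as with built-in pointers, either the pointer or the object it points to can be const, which gives four combinations:

SmartPtr<CD> p;                       // non-const object, non-const pointer
SmartPtr<const CD> p;                 // const object, non-const pointer
const SmartPtr<CD> p = &goodCD;       // non-const object, const pointer
const SmartPtr<const CD> p = &goodCD; // const object, const pointer

Unlike built-in pointers, however, where a pointer to non-const converts automatically to a pointer to const, smart pointers offer no such conversion unless you provide a conversion function:

CD *pCD = new CD("Famous Movie Themes");
const CD *pConstCD = pCD; // fine
SmartPtr<CD> pCD = new CD("Famous Movie Themes");
SmartPtr<const CD> pConstCD = pCD; // error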

Conversions involving const are a one-way street: it’s safe to go from non-const to const, but it’s not safe to go from const to non-const.

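Because conversions between const and non-const go one way only, the relationship resembles public inheritance, so inheritance can be used to give smart pointers the same conversion that dumb pointers have. SmartPtr<T> derives from SmartPtrToConst<T>, so a smart pointer to non-const can be used wherever a smart pointer to const is expected, but not the other way around:

Item28_2.PNG

An implementation:

template <class T> // smart pointers to const objects
class SmartPtrToConst
{
    ... // the usual smart pointer member functions
protected:
    union
    {
        const T *constPointee; // for SmartPtrToConst access
        T *pointee;            // for SmartPtr access
    };
};

template <class T> // smart pointers to non-const objects
class SmartPtr : public SmartPtrToConst<T>
{
    ... // no data members
};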

The trick with the protected union: without the union, every smart pointer to non-const would carry two dumb pointers, a dumb pointer-to-const-T in the base class and a dumb pointer-to-non-const-T in the derived class. With the union, we get the advantages of two different pointers without having to allocate space for more than one.

Evaluation

Smart pointers cannot fully reproduce the behavior of dumb pointers, but the extra "smart" functionality is still worth using where it fits.

Try as you may, you will never succeed in designing a general-purpose smart pointer that can seamlessly replace its dumb pointer counterpart.

https://www.chuxin911.com/more_effective+C++_2_20211226/

Author: cx

Published: 2021-12-26

Updated: 2022-11-23