More Effective C++, Items 1-26: Study Notes, Part 1
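These notes cover the first half of "More Effective C++", Items 1-26.
Basics
Item 1. Distinguish carefully between pointers and references
There is no such thing as a null reference.
- A loophole does exist, as shown below, but the results are undefined, so don't do this.
char *pc = 0; // set pointer to null
- This is also an advantage of references: a function never has to check whether a reference argument is null, whereas a pointer argument must always be checked.
void printDouble(const double &rd)
A reference must always refer to an object, and it cannot be made to refer to a different object later.
string s1("Nancy");
A reference must be initialized when it is declared.
For the same reason, a reference member of a class, like a const member, must be initialized explicitly in the member initialization list; it cannot be initialized in the constructor body.
Some situations call for a reference
Some syntactic requirements favor references for readability, for example when implementing certain operators such as operator[].
vector<int> v(10);
There is no such thing as a reference to a reference, but there are pointers to pointers; the two differ in how they nest.
PS. This last point is my own addition.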
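The points above can be sketched in a few lines. This is a minimal illustration, not code from the book; the function names `twiceByRef`, `twiceByPtr`, and `assignThroughReference` are made up for the example.

```cpp
#include <string>

// Takes a reference: the caller must supply a real object, so no null check is needed.
double twiceByRef(const double &rd) { return rd * 2.0; }

// Takes a pointer: a null pointer is a legal argument and has to be tested.
double twiceByPtr(const double *pd) { return pd ? *pd * 2.0 : 0.0; }

// Assigning through a reference changes the referent; it never reseats the reference.
std::string assignThroughReference() {
    std::string s1("Nancy");
    std::string s2("Clancy");
    std::string &rs = s1; // rs refers to s1 for its whole lifetime
    rs = s2;              // s1 now holds "Clancy"; rs still refers to s1
    return s1;
}
```

The pointer version has to decide what a null argument means (here it arbitrarily returns 0.0), which is exactly the extra burden the reference version avoids.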
Item 2. Prefer C++-style casts
Problems with C-style casts in C++:
They make no distinction between conversions of very different kinds.
There is a great difference, for example, between a cast that changes a pointer-to-const-object into a pointer-to-non-const-object (i.e., a cast that changes only the constness of an object) and a cast that changes a pointer-to-base-class-object into a pointer-to-derived-class-object (i.e., a cast that completely changes an object's type).
They are hard to find.
Even grep has trouble locating them.
C++ introduces four explicit cast operators
static_cast
- has basically the same power and meaning as the general-purpose C-style cast.
- also has the same kinds of restrictions as C-style casts.
- can't remove constness from an expression (a C-style cast can).
const_cast
- casts away the constness or volatileness of an expression.
dynamic_cast
- performs safe casts down or across an inheritance hierarchy: it casts pointers or references to base class objects into pointers or references to derived or sibling base class objects.
- reports failure through its result: failed casts are indicated by a null pointer (when casting pointers) or an exception (when casting references).
- cannot be applied to types lacking virtual functions.
- cannot cast away constness.
reinterpret_cast
- performs type conversions whose result is nearly always implementation-defined, so it is rarely portable.
- The most common use is to cast between function pointer types. Don't use it unless you have no alternative.
typedef void (*FuncPtr)();
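Note how precise the author is: what gets cast is an expression, not a variable.
If your compiler does not yet support the new casts, macros can stand in for them, which makes the code easy to migrate once support arrives.
Note on using a macro for dynamic_cast: there is no way to tell if the cast fails.
Summary
C++ casts are less concise than C-style casts, but the author gives these reasons for using them: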
- precision of meaning and easy recognizability.
- easier to parse (both for humans and for tools), and they allow compilers to diagnose casting errors that would otherwise go undetected.
- perhaps making casts ugly and hard to type is a good thing; in other words, programmers are discouraged from casting.
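A short sketch of three of the casts in action may help. The types `Base`, `Derived`, `Other` and the helper functions are hypothetical names introduced only for this illustration.

```cpp
struct Base { virtual ~Base() {} };          // a virtual function makes dynamic_cast usable
struct Derived : Base { int value = 42; };
struct Other : Base {};

// static_cast: an ordinary value conversion, here int -> double before dividing.
double ratio(int a, int b) { return static_cast<double>(a) / b; }

// const_cast: removes constness only. Writing through the result is defined here
// because the caller is expected to pass an object that is not itself const.
void bump(const int &r) { ++const_cast<int &>(r); }

// dynamic_cast on a pointer: yields null when the object is not a Derived.
int valueOrMinusOne(Base *b) {
    Derived *d = dynamic_cast<Derived *>(b);
    return d ? d->value : -1;
}
```

Each cast names the kind of conversion being requested, so a search for `const_cast` finds every place constness is removed, which a C-style cast would hide.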
Extension: a closer look at reinterpret_cast
- Converting between pointers and void*: should you choose static_cast or reinterpret_cast?
static_casting a pointer to and from void* preserves the address. That is, in the following, a, b and c all point to the same address:
int* a = new int();
reinterpret_cast only guarantees (as far as the standard is concerned) that if you cast a pointer to a different type and then reinterpret_cast it back to the original type, you get the original value. a and c contain the same value, but the value of b is unspecified.
int* a = new int();
Conclusion: for casting to and from void*, static_cast should be preferred.
- A typical use case: a pointer-based interface between a library and its users
One case where reinterpret_cast is necessary is when interfacing with opaque data types. This occurs frequently in vendor APIs over which the programmer has no control.
Here's a contrived example where a vendor provides an API for storing and retrieving arbitrary global data:
// vendor.hpp
To use this API, the programmer must cast their data to VendorGlobalUserData and back again. static_cast won't work; one must use reinterpret_cast:
// main.cpp
Below is a contrived implementation of the sample API:
// vendor.cpp
Q: why doesn't the vendor API use void* for that? Because then they would lose (some) type checking at compile time.
- An example of why reinterpret_cast is not portable
Byte order (endianness). Imagine you have to read a binary 32-bit number from a file, and you know it is big endian. Your code has to be generic and work properly on both big-endian (e.g. some ARM) and little-endian (e.g. x86) systems, so you have to check the byte order.
You can write a function to achieve this:
/*constexpr*/ bool is_little_endian() {
Explanation: the binary representation of x in memory could be 0000'0000'0000'0001 (big endian) or 0000'0001'0000'0000 (little endian). After reinterpret_casting, the byte under the pointer p could be 0000'0000 or 0000'0001 respectively. If you use static_cast, it will always be 0000'0001, no matter what endianness is being used.
But this is often surprisingly the best reason to use it.
How it works:
For static_cast:
When you convert, for example, int(12) to float (12.0f), your processor needs to perform some calculations, because the two numbers have different bit representations. This is what static_cast stands for.
For reinterpret_cast, by contrast:
When you call reinterpret_cast, the CPU does not perform any calculations. It just treats a set of bits in memory as if it had another type. So when you convert int* to float* with this keyword, the new value (after pointer dereferencing) has nothing to do with the old value in a mathematical sense (ignoring the fact that it is undefined behavior to read this value).
Be aware that reading or modifying values after reinterpret_casting is very often undefined behavior. In most cases, you should use a pointer or reference to std::byte (available from C++17) if you want to inspect the bit representation of some data; that is almost always a legal operation. Other "safe" types are char and unsigned char, but I would say they shouldn't be used for that purpose in modern C++, as std::byte has better semantics.
Compiled from: https://stackoverflow.com/questions/573294/when-to-use-reinterpret-cast
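As a minimal sketch of the endianness discussion, the check can be written with `unsigned char`, one of the byte types named above. The function names `isLittleEndian` and `readBigEndian32` are hypothetical; the second function shows the portable alternative of assembling the value byte by byte, so the host byte order never matters.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when the least significant byte of a multi-byte integer is stored first.
bool isLittleEndian() {
    const std::uint16_t x = 0x0001;
    // Inspecting an object's bytes through unsigned char* is permitted by the aliasing rules.
    const unsigned char *p = reinterpret_cast<const unsigned char *>(&x);
    return p[0] == 0x01;
}

// Reassembles a big-endian 32-bit value byte by byte; the result is the same on any host.
std::uint32_t readBigEndian32(const unsigned char bytes[4]) {
    return (std::uint32_t(bytes[0]) << 24) | (std::uint32_t(bytes[1]) << 16) |
           (std::uint32_t(bytes[2]) << 8) | std::uint32_t(bytes[3]);
}
```

When the file format fixes the byte order, the shift-based reader is usually the simpler choice, because it needs no endianness check at all.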
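Item 3. Never treat arrays polymorphically
As one form of polymorphism, C++ also allows you to manipulate arrays of derived class objects through base class pointers and references. This relies on just one property of pointers, namely that a base class pointer can stand in for a derived object. The other inherent property of pointers, pointer arithmetic, comes along with it and causes trouble for arrays, whose elements are laid out contiguously in memory.
- Access: an array locates element i at offset i*sizeof(element type). When a base class pointer is used to access an array of derived class objects, the compiler uses the size of the base class, but derived classes are usually larger, so the result of indexing is undefined.
- Destruction: perhaps surprisingly, delete [] also relies on pointer arithmetic. Note that elements are destroyed in the reverse of the order in which they were constructed.
delete [] array;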
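Conclusion
Polymorphism and pointer arithmetic simply don't mix. Array operations almost always involve pointer arithmetic, so arrays and polymorphism don't mix.
Designing your software so that concrete classes never inherit from one another has many benefits, because it prevents slicing problems between base and derived objects at the source.
Extension
If you really need polymorphic access to a sequence of objects, I can think of two approaches:
- Store pointers to the objects (perhaps std::shared_ptr), rather than the objects themselves, in a random-access data structure.
- Use a non-random-access structure such as a queue or a tree. Pointer arithmetic is then unavailable, and you need the more cumbersome two-level dereference, for example (ptr->next)->fun().
Neither approach, however, is as efficient as a random-access structure that holds the objects directly.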
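The first approach can be sketched as follows. This is an illustration under assumptions: it uses `std::unique_ptr` rather than `std::shared_ptr` for simplicity, and the `Shape` hierarchy is invented for the example.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Shape {
    virtual ~Shape() {}
    virtual int sides() const = 0;
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};
struct Square : Shape {
    int sides() const override { return 4; }
    int padding[8] = {}; // makes Square larger than Triangle on purpose
};

// Indexing steps over pointers, which all have the same size, so it stays well
// defined no matter how large each derived object is.
int totalSides(const std::vector<std::unique_ptr<Shape>> &shapes) {
    int total = 0;
    for (std::size_t i = 0; i < shapes.size(); ++i) total += shapes[i]->sides();
    return total;
}

std::vector<std::unique_ptr<Shape>> makeShapes() {
    std::vector<std::unique_ptr<Shape>> v;
    v.push_back(std::unique_ptr<Shape>(new Triangle));
    v.push_back(std::unique_ptr<Shape>(new Square));
    return v;
}
```

The price is one extra indirection per access and one heap allocation per element, which is the efficiency loss mentioned above.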
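Item 4. Avoid gratuitous default constructors
The ideal: in a perfect world,
- classes in which objects could reasonably be created from nothing would contain default constructors and
- classes in which information was required for object construction would not.
Reality is less tidy: some situations impose an implicit requirement for a default constructor (the to-provide-a-default-constructor-or-not-to-provide-a-default-constructor dilemma).
Where a default constructor is required
- Arrays of class objects require the element type to have a default constructor
- Workaround 1: supply the constructor arguments for every element explicitly. Drawback: this does not work for arrays on the heap.
int ID1, ID2, ID3, ..., ID10; // constructor arguments
- Workaround 2: use an array of pointers. Drawbacks: (1) you must manage the resources by hand; (2) the total amount of memory you need increases, because each element now also needs a pointer.
typedef EquipmentPiece* PEP;
- Workaround 3: allocate raw memory first, then construct the objects with placement new. To clean up, you must first destroy the objects (in reverse order) and then release the memory with operator delete[]; these steps must be followed strictly.
// allocate enough raw memory for an array of 10
Note, however, that these workarounds don't show you how to bypass required constructor arguments (you still have to construct the elements one by one in a for loop).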
- Template-based container classes.
In essence, inside the template an array of the template parameter type is being created. Expressed in code:
template <class T>
In most cases, careful template design can eliminate the need for a default constructor; std::vector is an example.
Unfortunately, many templates are designed in a manner that is anything but careful, so classes without default constructors will be incompatible with many templates.
You therefore cannot assume that a given template supports types that lack a default constructor.
- Virtual base classes: if a virtual base class has no default constructor, every derived class, however deep in the hierarchy, must supply the base class's constructor arguments, which quickly becomes painful.
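As a sketch of the std::vector remark above: a class without a default constructor can still be stored in a vector, because the vector separates allocating capacity from constructing elements. The `EquipmentPiece` below is a hypothetical reduction of the class named in these notes, and `makePieces` is a made-up helper.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reduction of EquipmentPiece: an ID is required, so there is no default constructor.
class EquipmentPiece {
public:
    explicit EquipmentPiece(int id) : id_(id) {}
    int id() const { return id_; }
private:
    int id_;
};

std::vector<EquipmentPiece> makePieces(int count) {
    std::vector<EquipmentPiece> pieces;
    pieces.reserve(static_cast<std::size_t>(count)); // capacity only; no objects are constructed yet
    for (int i = 0; i < count; ++i)
        pieces.emplace_back(i);                      // each element is built in place from its ID
    return pieces;
}
```

This is essentially workaround 3 (raw memory plus placement new) packaged inside the container, so the caller never handles raw memory directly.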
The other extreme: give every class a default constructor
Use a magic ID to mark a default-constructed object.
class EquipmentPiece
Problems:
- there is no longer any guarantee that the fields of an EquipmentPiece object have been meaningfully initialized.
- most member functions must check to see if the ID is present.
If member functions have to test to see if fields have truly been initialized, clients of those functions have to pay for the time those tests take.
Advice: it's best to avoid default constructors in classes where they make no sense.
What member functions usually do when they find an object that was default-constructed without a real ID:
They throw an exception, or they call a function that terminates the program.
Operators
Item 5. Be wary of user-defined conversion functions
User-defined conversions give you control over the implicit type conversions the compiler may perform.
Two kinds of functions allow compilers to perform such conversions: single-argument constructors and implicit type conversion operators.
Note that for the latter, you aren't allowed to specify a type for the function's return value, because the type of the return value is basically just the name of the function.
The consequence of overusing conversion functions:
Their presence can lead to the wrong function being called (i.e., one other than the one intended); in other words, they affect overload resolution.
An example involving single-argument constructors:
template <class T>
The intuitive patch:
for (int i = 0; i < 10; ++i)
Why this is bad: it is tremendously inefficient, because each time through the loop we both create and destroy a temporary Array<int> object.
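- Solutions:
- For implicit type conversion operators: replace them with a member function that must be called explicitly. (This is also why std::string in the C++ standard library does not convert implicitly to a C-style char* and offers the explicit c_str() instead.)
operator aDouble() const;
- For single-argument constructors: use the explicit keyword.
- For single-argument constructors when the compiler doesn't support explicit, or when you have further requirements: use a nested class as the single parameter (proxy classes).
template<class T>
The principle at work here: no sequence of conversions is allowed to contain more than one user-defined conversion. An example:
bool operator==(const Array<int> &lhs,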
How the compiler handles if (a == b[i]):
- Compilers need an object of type Array<int> on the right-hand side of the == in order to call operator== for Array<int> objects, but there is no single-argument constructor taking an int argument.
- Compilers cannot consider converting the int into a temporary ArraySize object and then creating the necessary Array<int> object from this temporary, because that would call for two user-defined conversions, one from int to ArraySize and one from ArraySize to Array<int>.
The bottom line: don't provide conversion functions unless you're sure you want them.
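A minimal sketch of the explicit-constructor solution follows. The `Array` template here is a hypothetical reduction of the one in these notes; storing the elements in a `std::vector` is a simplification chosen for the example, not the book's implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Array; std::vector is used for storage to keep the sketch short.
template <class T>
class Array {
public:
    explicit Array(std::size_t size) : data_(size) {} // explicit: an int never silently becomes an Array
    T &operator[](std::size_t i) { return data_[i]; }
    const T &operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return data_.size(); }
    friend bool operator==(const Array &lhs, const Array &rhs) { return lhs.data_ == rhs.data_; }
private:
    std::vector<T> data_;
};

// Counts positions where the two arrays agree. Writing `a == b[i]` by mistake would
// now fail to compile instead of building a temporary Array<int> on every iteration.
int countEqual(const Array<int> &a, const Array<int> &b) {
    int n = 0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i)
        if (a[i] == b[i]) ++n;
    return n;
}
```

With `explicit` in place, the typo described in this item becomes a compile-time error rather than a silent, slow, and wrong comparison.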
Item 6. Distinguish between prefix and postfix forms of increment and decrement operators
Overload resolution has no rule based on where the operand appears, so how does the language tell the prefix form from the postfix form? By adding a parameter that is never used!
Postfix forms take an int argument, and compilers silently pass 0 as that int when those functions are called.
class UPInt // "unlimited precision int"
The two signatures differ, return type included: prefix forms return a reference, postfix forms return a const object.
The implementation looks like this:
class UPInt {// "unlimited precision int"
- Q: why does the postfix form return a const object?
The reasoning: when in doubt, do as the ints do.
What is wrong with the code below? If it compiled, i would be incremented only once. This is counterintuitive and confusing, so it's best prohibited.
Returning a const object makes such usage a compile-time error.
int i;
This is an excellent example: if you've ever wondered if it makes sense to have functions return const objects, now you know: sometimes it does, and postfix increment and decrement are examples.
- Efficiency
The postfix form creates an explicit temporary object (oldValue) that has to be constructed and destructed.
The advice therefore: prefer prefix increment to postfix increment unless you really need the behavior of postfix increment.
PS. Modern compilers can often optimize this difference away, but the principle is still worth knowing.
- Consistency
The prefix and postfix forms must change the value in exactly the same way. How can that be guaranteed?
The principle is that postfix increment and decrement should be implemented in terms of their prefix counterparts.
==> You then need only maintain the prefix versions. ==> The postfix versions automatically behave in a consistent fashion.
Note: the int parameter of the postfix form has no name because many compilers issue warnings if you fail to use named parameters in the body of the function to which they apply, and this can be annoying.
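The rules of this item fit in a few lines. The class below is a minimal stand-in for UPInt: it stores a plain `long`, which is an assumption made to keep the sketch short, not how an unlimited-precision integer would be stored.

```cpp
// Minimal stand-in for UPInt; a plain long holds the value to keep the sketch short.
class UPInt {
public:
    explicit UPInt(long v = 0) : value_(v) {}

    UPInt &operator++() {             // prefix: increment, then return *this by reference
        ++value_;
        return *this;
    }
    const UPInt operator++(int) {     // postfix: the unnamed int only selects this overload
        const UPInt oldValue = *this; // remember the value to hand back
        ++(*this);                    // reuse the prefix form so both stay consistent
        return oldValue;
    }
    long value() const { return value_; }
private:
    long value_;
};
```

Because the postfix form returns a const object, an expression such as `i++++` does not compile, which matches the behavior of built-in ints.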
Item 7. Never overload &&, ||, or the comma operator
- Overloading && or || destroys short-circuit evaluation,
because you are replacing short-circuit semantics with function call semantics.
Short-circuit evaluation has two properties:
- once the truth or falsehood of an expression has been determined, evaluation of the expression ceases, even if some parts of the expression haven't yet been examined.
- it always evaluates its arguments in left-to-right order.
The effect of overloading is as follows:
if (expression1 && expression2) ...
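- An overloaded operator, can hardly keep the semantics of the built-in comma operator
The comma operator is used to form expressions.
Rules:
- evaluate from left to right
- yield the value of the right-hand operand
An expression containing a comma is evaluated by first evaluating the part of the expression to the left of the comma, then evaluating the expression to the right of the comma; the result of the overall comma expression is the value of the expression on the right.
Whether it is written as a non-member or a member function, the order in which its operands are evaluated cannot be relied on (at least before C++17), so an overloaded operator, cannot reproduce the semantics of the built-in comma operator.
Besides the comma operator, && and || discussed above, which you should not overload, the following operators cannot be overloaded at all:
. .* :: ?: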
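The loss of short-circuit evaluation can be observed directly. In this sketch the `Flag` type and the `tracked` helper are hypothetical, introduced only to count how many operands get evaluated.

```cpp
// Hypothetical wrapper type, used only to show that an overloaded && is an ordinary function call.
struct Flag {
    bool on;
};

bool operator&&(const Flag &lhs, const Flag &rhs) { return lhs.on && rhs.on; }

// Returns a Flag and counts how many times it was called.
Flag tracked(bool value, int &calls) {
    ++calls;
    return Flag{value};
}

// With the overload, both operands are evaluated even though the left one is already false.
int callsWithOverload() {
    int calls = 0;
    bool r = tracked(false, calls) && tracked(true, calls);
    (void)r;
    return calls;
}

// With built-in bool operands, the right operand is skipped once the left is false.
int callsWithBuiltin() {
    int calls = 0;
    bool r = tracked(false, calls).on && tracked(true, calls).on;
    (void)r;
    return calls;
}
```

Code that relies on the right operand being skipped, such as `p && p->ok()`, silently changes meaning once `&&` is overloaded for the types involved.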
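Item 8. Understand the different meanings of new and delete
The difference between the new operator and operator new
The conclusion first: the new operator calls operator new to carry out the first of its two tasks, and operator new can be customized by overloading.
The new operator is built into the language and, like sizeof, you can't change its meaning. The new operator performs two tasks in order (you can't change its behavior in any way):
- The new operator calls operator new to allocate enough memory to hold the object.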
The operator new function is usually declared like this:
void * operator new(size_t size); // returns a pointer to raw, uninitialized memory
- You can overload operator new by adding additional parameters, but the first parameter must always be of type size_t.
- operator new knows nothing about constructors.
- The new operator then calls a constructor to initialize the object in the allocated memory.
Note: you can't directly call the constructor necessary to initialize the object (including such crucial components as its vtbl).
The whole process in code:
string *ps = new string("Memory Management");
placement new: construct an object in memory you already have and get back a pointer to it.
The new operator described above does not let you call a constructor yourself, but sometimes you want to. The answer is placement new: you have some raw memory that's already been allocated, and you need to construct an object in the memory you have.
This requires including the <new> header (or <new.h> with older compilers).
class Widget {
The prototype of placement new:
void * operator new(size_t, void *location) // unused (but mandatory) size_t parameter has no name
Example uses: applications using shared memory or memory-mapped I/O, because objects in such applications must be placed at specific addresses or in memory allocated by special routines.
Deletion and memory deallocation
The counterparts of the new operator and operator new are the delete operator and operator delete.
What the delete operator does:
string *ps;
- If you want to deal only with raw, uninitialized memory (the C++ equivalent of calling malloc and free), you should bypass the new and delete operators entirely. Instead, call operator new to get the memory and operator delete to return it to the system:
void *buffer = operator new(50*sizeof(char));
- An object created with placement new must not be destroyed with the delete operator. Call its destructor explicitly first, then release the memory with the routine that matches how it was obtained. The reason: placement new just returned the pointer that was passed to it. Who knows where that pointer came from?
The correct handling looks like this:
void * mallocShared(size_t size);
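The two phases can also be carried out by hand, which makes the division of labor visible. This is a sketch under assumptions: the raw memory comes from plain `operator new` rather than a shared-memory routine, and the `Widget` class with its live-object counter is invented for the illustration.

```cpp
#include <new>

// Hypothetical Widget that counts live objects so construction and destruction can be observed.
class Widget {
public:
    explicit Widget(int size) : size_(size) { ++liveCount(); }
    ~Widget() { --liveCount(); }
    int size() const { return size_; }
    static int &liveCount() { static int n = 0; return n; }
private:
    int size_;
};

// Placement new: no allocation happens here, only construction in the given memory.
Widget *constructWidgetInBuffer(void *buffer, int size) {
    return new (buffer) Widget(size);
}

// Performs by hand the steps that the new and delete operators normally combine.
int sizeViaRawMemory(int size) {
    void *raw = operator new(sizeof(Widget));        // phase 1: raw, uninitialized memory
    Widget *pw = constructWidgetInBuffer(raw, size); // phase 2: run the constructor
    const int result = pw->size();
    pw->~Widget();                                   // destroy the object explicitly
    operator delete(raw);                            // then return the raw memory
    return result;
}
```

If the memory had come from a special routine such as the notes' mallocShared, the last step would call that routine's matching release function instead of operator delete.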
Arrays
Dynamically creating an array differs from creating a single object in two ways:
- The new operator calls operator new[] instead of operator new.
operator new[] can be overloaded as well.
- For arrays, a constructor must be called for each object in the array.
Exceptions
How to design with exceptions is not yet settled: there is as yet no agreement on a body of techniques that, when applied routinely, leads to software that behaves predictably and reliably when exceptions are thrown.
Why not keep using C-style error codes?
If a function signals an exceptional condition by setting a status variable or returning an error code, there is no way to guarantee the function's caller will check the variable or examine the code.
As a result, execution may continue long past the point where the condition was encountered. If the function signals the condition by throwing an exception, however, and that exception is not caught, program execution immediately ceases. An exception stops the program promptly at the point that needs attention, whereas an unchecked error code propagates onward with unpredictable results.
What about using setjmp and longjmp? longjmp fails to call destructors for local objects when it adjusts the stack.
Item 9. Use destructors to prevent resource leaks
- Don't release a resource with a standalone delete statement. If an exception is thrown before the delete is reached, control leaves the function, the delete never runs, and the resource leaks.
The principle to rely on: local objects are always destroyed when leaving a function, regardless of how that function is exited. (The only exception to this rule is when you call longjmp.)
- Approaches:
- Use a smart pointer (a pointer-like object); the delete is performed in the smart pointer's destructor. As for timing, it deletes what it points to when the pointer-like object goes out of scope.
Note that because it uses delete rather than delete [], it is not suitable for use with pointers to arrays of objects.
- Put the delete (or other release call) in an object's destructor, so the resource is released on destruction.
// class for acquiring and releasing a window handle
- Use try and catch to catch any exception and then perform the delete (not recommended).
A natural question
But what happens if an exception is thrown while you're in the process of acquiring a resource, e.g., while you're in the constructor of a resource-acquiring class? What happens if an exception is thrown during the automatic destruction of such resources? Don't constructors and destructors call for special techniques? They do; Items 10 and 11 have the answers.
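The destructor-based approach can be sketched with a small guard class. Everything here is hypothetical: `acquireHandle` and `releaseHandle` stand in for a C-style handle API, and the global counter exists only so that the release can be observed.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for a C-style handle API, so the sketch is self-contained.
static int g_openHandles = 0;
int acquireHandle() { return ++g_openHandles; }
void releaseHandle(int) { --g_openHandles; }

// Owns one handle; the destructor releases it on every exit path.
class HandleGuard {
public:
    HandleGuard() : handle_(acquireHandle()) {}
    ~HandleGuard() { releaseHandle(handle_); }
    HandleGuard(const HandleGuard &) = delete;            // a handle has exactly one owner
    HandleGuard &operator=(const HandleGuard &) = delete;
private:
    int handle_;
};

// May throw while a guard is alive; stack unwinding still runs ~HandleGuard.
void useHandle(bool fail) {
    HandleGuard guard;
    if (fail) throw std::runtime_error("display failed");
}

int openHandlesAfterFailure() {
    try { useHandle(true); } catch (const std::runtime_error &) {}
    return g_openHandles;
}
```

There is no release call anywhere in `useHandle`, yet the handle is returned on both the normal and the exceptional path.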
Item 10. Prevent resource leaks in constructors
- The phenomenon
C++ guarantees it's safe to delete null pointers.
C++ destroys only fully constructed objects, and an object isn't fully constructed until its constructor has run to completion.
An example of an object that is not yet fully constructed: one whose constructor is still in the middle of creating its sub-objects.
So if the enclosing object's constructor throws partway through, sub-objects it has already created on the heap and holds through raw pointer members are never destroyed, and the resources leak.
A smart pointer to the enclosing object does not solve the problem either: using a smart pointer class instead of a raw pointer won't do you any good, because the smart pointer is never assigned unless the new operation succeeds.
- Q: why does C++ refuse to call destructors for objects that haven't been fully constructed?
The destructor would have to know which sub-objects need to be destroyed and which were never constructed, and tracking that costs overhead.
If each object carried bits recording how far its constructor had run, the destructor could check the bits and (maybe) figure out what actions to take. Such bookkeeping would slow down constructors, and it would make each object larger, too.
- Q: How about non-pointer data members? By the same reasoning, do we need to worry about them?
No need to worry. They are automatically initialized before a class's constructor is called (they have already been fully constructed). As fully constructed objects, these data members will be automatically destroyed even if an exception arises in the constructor.
This is also why smart pointers to the sub-objects can solve the problem, as described below.
Solutions
- Solution 1: catch all exceptions inside the constructor, delete what was allocated, then rethrow the exception
BookEntry::BookEntry(const string& name,
The problem with this approach: const pointer members can only be initialized in the member initialization list, where try and catch cannot be used, because try and catch are statements, and member initialization lists allow only expressions.
class BookEntry {
- Solution 2: initialize each sub-object through its own private member function
Since the member initialization list cannot hold try and catch, one possibility is to put them inside private member functions that return the pointers with which the sub-objects should be initialized. The drawback of this approach: it is a maintenance headache.
class BookEntry
- Solution 3: smart pointers to the sub-objects
class BookEntry {
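Solution 3 can be sketched as follows. Two assumptions to flag: the sketch uses `std::unique_ptr` (C++11) as the smart pointer, which is a substitution chosen for this example, and the `Image`, `AudioClip`, and member names are hypothetical reductions of the BookEntry example.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

static int g_liveImages = 0; // counts Image objects currently alive (for demonstration)

struct Image {
    explicit Image(const std::string &) { ++g_liveImages; }
    ~Image() { --g_liveImages; }
};
struct AudioClip {
    explicit AudioClip(const std::string &file) {
        if (file == "bad") throw std::runtime_error("cannot load audio");
    }
};

// Sketch of a BookEntry-like class; the member names are hypothetical.
class BookEntry {
public:
    BookEntry(const std::string &imageFile, const std::string &audioFile)
        : image_(new Image(imageFile)),    // a fully constructed member once this completes
          audio_(new AudioClip(audioFile)) // if this throws, image_ is destroyed automatically
    {}
private:
    const std::unique_ptr<Image> image_;
    const std::unique_ptr<AudioClip> audio_;
};

bool constructionLeaks() {
    try { BookEntry entry("cover.png", "bad"); } catch (const std::runtime_error &) {}
    return g_liveImages != 0;
}
```

Because the smart pointer members are themselves fully constructed objects, they can be const, they live in the member initialization list, and no try or catch is needed anywhere in the class.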
Item 11. Prevent exceptions from leaving destructors
The problem
- Background: a destructor is called in two situations:
- normal destruction, for example when an object goes out of scope
- exception handling: the exception-handling mechanism destroys objects during the stack-unwinding part of exception propagation.
- There are two reasons to keep exceptions from leaving destructors:
- it prevents terminate from being called during the stack-unwinding part of exception propagation.
The mechanism: if control leaves a destructor due to an exception while another exception is active, C++ calls the terminate function, which ends the program immediately; not even local objects are destroyed.
The rule is a mouthful, so here is an example:
class Session
If ~Session() is being called because of an exception and logDestruction(this); throws as well, terminate is called at once, and whatever *this holds is never released.
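- it helps ensure that destructors always accomplish everything they are supposed to accomplish.
If the destructor has more work to do after the point where an exception is thrown, that work never happens. For example, if logDestruction(this); throws in the code below, the rest of the destructor does not run and the transaction started in the constructor is never ended.
Session::Session()
The solution
Session::~Session(){
Adding further operations inside the catch block is counterproductive, because anything there that can throw brings back the risk of terminate:
Session::~Session()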
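A minimal sketch of the solution: the destructor catches everything and does nothing in the handler, so cleanup continues and no exception escapes. The always-failing `logDestruction` and the counter are hypothetical stand-ins for the notes' Session example.

```cpp
#include <stdexcept>

static int g_endedTransactions = 0;

// Hypothetical logging routine that always fails, standing in for logDestruction.
void logDestruction() { throw std::runtime_error("log unavailable"); }

class Session {
public:
    ~Session() {
        try {
            logDestruction();  // may throw
        } catch (...) {
            // deliberately empty: nothing here can throw, so no exception leaves the destructor
        }
        ++g_endedTransactions; // the rest of the cleanup still runs
    }
};

// Destroys a Session while another exception is propagating.
bool survivesUnwinding() {
    try {
        Session s;
        throw std::logic_error("original failure");
    } catch (const std::logic_error &) {
        return true;           // reached only because ~Session kept its own exception inside
    }
    return false;
}
```

Without the try block in the destructor, the second exception would escape during stack unwinding and the program would call terminate instead of reaching the handler.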
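Item 12. Understand how throwing an exception differs from passing a parameter or calling a virtual function
Passing an argument to a function and passing an exception to a catch clause look the same:
class Widget
But there are differences, for example whether control returns:
When you call a function, control eventually returns to the call site (unless the function fails to return), but when you throw an exception, control does not return to the throw site.
C++ specifies that an object thrown as an exception is copied.
Because the exception propagates out of the scope of the thrown local object, a copy has to be made so that the catch clause still has a valid object after the local one is destroyed.
Copying leads to some differences:
- Changes a catch clause makes to the exception object do not affect the original object.
- Throwing an exception is much slower than passing a parameter to a function.
PS. As with the return value optimization, compilers may be able to optimize the copy away.
This copying occurs even if the object being thrown is not in danger of being destroyed, for example when it is a static variable.
// function to read the value of a Widget from a stream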
Detailed differences between exception handling and ordinary function calls:
- When an object is copied for use as an exception, the copying is performed by the object's copy constructor. This copy constructor is the one in the class corresponding to the object's static type, not its dynamic type.
class Widget { ... };
This behavior may not be what you want, but it's consistent with all other cases in which C++ copies objects. Copying is always based on an object's static type.
- No copy is made when the exception is rethrown with throw;.
catch (Widget &w)
Conclusion: in general, you'll want to use the throw; syntax to rethrow the current exception, for two reasons:
- there's no chance that that will change the type of the exception being propagated.
- it is more efficient.
- Passing a temporary object to a non-const reference parameter is not allowed for function calls, but it is for exceptions.
- Catching by value, as in catch (Widget w) ..., creates two copies of the thrown object.
- Throwing exceptions by pointer is equivalent to passing by pointer: either way, a copy of the pointer is passed.
Beware, therefore, that a local object will be destroyed when the exception leaves the local object's scope. The catch clause would then be initialized with a pointer to an object that had already been destroyed.
- In general, no implicit conversions are applied when matching an exception to a catch clause (the catch parameter may, however, be modified by const or volatile).
void f(int value)
The exceptions are as follows.
Two kinds of conversions are applied when matching exceptions to catch clauses:
- Inheritance-based conversions.
This inheritance-based exception-conversion rule applies to values, references, and pointers in the usual fashion.
- Conversion from a typed to an untyped pointer.
A catch clause taking a const void* pointer will catch an exception of any pointer type.
- catch clauses are always tried in the order of their appearance.
That is, exception matching uses first fit, whereas virtual function dispatch uses best fit. So list the handler for the derived class before the handler for the base class; otherwise the derived class handler is never reached.
Never put a catch clause for a base class before a catch clause for a derived class.
try
Summary
- exception objects are always copied; when caught by value, they are copied twice. Objects passed to function parameters need not be copied at all.
- objects thrown as exceptions are subject to fewer forms of type conversion than are objects passed to functions.
- catch clauses are examined in the order in which they appear in the source code, and the first one that can succeed is selected for execution. When an object is used to invoke a virtual function, the function selected is the one that provides the best match for the type of the object, even if it's not the first one listed in the source code.
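Two of the rules above, static-type copying on `throw w;` and first-fit matching of catch clauses, can be observed together. The `Widget` and `SpecialWidget` types and both function names are hypothetical, defined only for this sketch.

```cpp
#include <string>

struct Widget { virtual ~Widget() {} };
struct SpecialWidget : Widget {};

// Rethrows with `throw;`: the original SpecialWidget exception object keeps propagating.
std::string rethrowCurrent() {
    try {
        try { throw SpecialWidget(); }
        catch (Widget &) { throw; }
    }
    catch (SpecialWidget &) { return "SpecialWidget"; } // derived handler listed first
    catch (Widget &)        { return "Widget"; }
    return "none";
}

// Rethrows with `throw w;`: a new exception is copied from w's static type, Widget.
std::string rethrowCopy() {
    try {
        try { throw SpecialWidget(); }
        catch (Widget &w) { throw w; }
    }
    catch (SpecialWidget &) { return "SpecialWidget"; }
    catch (Widget &)        { return "Widget"; }
    return "none";
}
```

In `rethrowCopy` the derived handler is listed first and still does not match, because the exception that propagates is a freshly copied `Widget`.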
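Item 13. Catch exceptions by reference
Problems with catching exceptions by pointer
- Lifetime of a local exception object: the pointer dangles once the object goes out of scope. The workaround is to make the object static or global, but that is hard to manage.
class exception // from the standard C++ library exception hierarchy
- Passing a pointer to an object created with new on the heap: the catch site cannot tell whether the pointer it receives refers to a heap object or to a non-heap object, so it cannot decide whether to delete it. This can lead to a resource leak.
void someFunction()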
- The four standard exceptions, bad_alloc (thrown when operator new can't satisfy a memory request), bad_cast (thrown when a dynamic_cast to a reference fails), bad_typeid (thrown when typeid is applied to a dereferenced null pointer), and bad_exception (available for unexpected exceptions), are all objects, not pointers to objects, so catching by pointer cannot handle them.
Problems with catching exceptions by value
- It is inefficient: the exception object is copied twice.
- Slicing: the derived part of the object is cut off. Derived class exception objects caught as base class exceptions have their derivedness "sliced off."
class exception
Catch-by-reference is the best choice
exception objects are copied only once.
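The slicing problem is easy to demonstrate. `ValidationError` is a hypothetical exception class invented for this sketch; note that some compilers warn about the by-value handler, which is rather the point.

```cpp
#include <exception>
#include <string>

// Hypothetical derived exception, used only for this illustration.
class ValidationError : public std::exception {
public:
    const char *what() const noexcept override { return "validation failed"; }
};

// Catch by value: the handler's copy is a plain std::exception, so the override is lost.
bool caughtByValueKeepsMessage() {
    try { throw ValidationError(); }
    catch (std::exception e) { return std::string(e.what()) == "validation failed"; }
    return false;
}

// Catch by reference: the handler sees the original object and its override.
bool caughtByReferenceKeepsMessage() {
    try { throw ValidationError(); }
    catch (const std::exception &e) { return std::string(e.what()) == "validation failed"; }
    return false;
}
```

The by-value handler gets whatever message the base class provides, which is implementation-defined but is not the derived class's text.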
Item 14. Use exception specifications judiciously
Dynamic exception specifications were deprecated in C++11 and removed in C++17.
This item is therefore skipped.
Item 15. Understand the costs of exception handling
- Handling exceptions at runtime requires a good deal of bookkeeping, and throwing objects has a cost as well.
Even if your own code never uses exceptions, you can hardly guarantee that the libraries you use don't.
If any part of a program uses exceptions, the rest of the program must support them, too. Otherwise it may not be possible to provide correct exception-handling behavior at runtime.
You cannot count on the compiler to optimize exception support away just because your code doesn't use exceptions.
Compiling without exception support may also be an attractive optimization for libraries that eschew exceptions, provided they can guarantee that exceptions thrown from client code never propagate into the library. This is a difficult guarantee to make, as it precludes client redefinitions of library-declared virtual functions; it also rules out client-defined callback functions.
- Using try blocks increases code size by roughly 5-10%.
Compared to a normal function return, returning from a function by throwing an exception may be as much as three orders of magnitude slower.
- Conclusion: use exceptions with care, measure whether they are a contributing factor to a slowdown, and if they are, rework the design.
Efficiency
This chapter covers efficiency from two angles:
- language-independent, focusing on things you can do in any programming language.
- C++ itself
- High-performance algorithms and data structures are great.
- sloppy implementation practices can reduce their effectiveness considerably, for example creating and destroying too many objects.
Item 16. Remember the 80-20 rule
On people who pronounce on bottlenecks without evidence: such assessments are generally delivered with a condescending sneer, and usually both the sneerers and their prognostications are flat-out wrong.
Finding the real bottleneck takes a program profiler (one that directly measures the resources you are interested in), not guesswork. Profile your software using as many data sets as possible, and ensure that each data set is representative.
Item 17. Consider using lazy evaluation
Lazy evaluation: record the work to be done, don't perform it until it is really needed, and then perform only the part needed at that moment. There are drawbacks: if the work has to be done eventually anyway, deferring it can pile up cost later, so apply the technique with judgment.
You write your classes in such a way that they defer computations until the results of those computations are required. If the results are never required, the computations are never performed.
Examples
- Example 1: reference counting
string s1="Hello";
The opposite concept is eager evaluation: making a copy of s1 and putting it into s2 just because the String copy constructor was called.
In detail, this typically entails allocating heap memory via the new operator and calling strcpy to copy the data in s1 into the memory allocated by s2.
With lazy evaluation, s2 can share s1's data, and the sharing ends only when one of them is modified.
The price: all we have to do is a little bookkeeping so we know who's sharing what, and in return we save the cost of a call to new and the expense of copying anything.
- Example 2: distinguishing reads from writes
string s="Hello";
By using lazy evaluation and proxy classes as described in Item 30, however, we can defer the decision on whether to take read actions or write actions until we can determine which is correct.
- Example 3: lazy fetching
A large data object is made up of many field values. There is no need to load them all up front: the object can store a pointer for each field and initialize a field only when it is first used.
The idea is to read no data from disk when a LargeObject object is created. Instead, only the "shell" of an object is created, and data is retrieved from the database only when that particular data is needed inside the object.
For const member functions, declare the field pointers mutable so that they can still be modified when a field has to be fetched.
class LargeObject
If your compiler doesn't support the mutable keyword, you can use a "fake this" pointer, whereby you create a pointer-to-non-const that points to the same object as this does. When you want to modify a data member, you access it through the "fake this" pointer:
const string &LargeObject::field1() const
It's tedious and error-prone to have to initialize all those pointers to null; smart pointers can be used instead.
- Example 4: lazy expression evaluation
Why deferring a computation is reasonable: no good programmer would deliberately compute a value that's not needed, but during maintenance, it's not uncommon for a programmer to modify the paths through a program in such a way that a formerly useful computation becomes unnecessary.
- Record the operands and the operation; if the result is never needed, the computation never happens. The lazy approach sets up a data structure inside m3 that indicates that m3's value is the sum of m1 and m2. Such a data structure might consist of nothing more than a pointer to each of m1 and m2, plus an enum indicating that the operation on them is addition.
template<class T>
- Sometimes only part of a computation is needed
cout << m3[4]; // only the fifth row has to be computed; this is how APL and similar numerical systems work
The cost: you have to maintain data structures that can store values, dependencies, or a combination of the two, and overload operators like assignment, copying, and addition; lazy evaluation in a numerical domain is a lot of work.
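The lazy fetching of Example 3 can be sketched as follows. This is an illustration under assumptions: `fetchFieldFromDatabase` is a hypothetical stand-in for an expensive database read, only one field is shown, and a `std::unique_ptr` replaces the raw pointer so that no manual null initialization or delete is needed.

```cpp
#include <memory>
#include <string>

static int g_fetches = 0; // counts simulated database reads

// Hypothetical stand-in for an expensive database read.
std::string fetchFieldFromDatabase(int objectId) {
    ++g_fetches;
    return "field1-of-" + std::to_string(objectId);
}

class LargeObject {
public:
    explicit LargeObject(int id) : id_(id) {} // builds only the "shell"; reads nothing

    const std::string &field1() const {
        if (!field1_)                         // first use: fetch and remember
            field1_.reset(new std::string(fetchFieldFromDatabase(id_)));
        return *field1_;
    }
private:
    int id_;
    mutable std::unique_ptr<std::string> field1_; // mutable: filled in from a const member
};
```

Creating a LargeObject costs nothing beyond storing the ID; the first call to `field1()` pays for the fetch, and later calls reuse the stored value.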
Summary
Indeed, if all your computations are essential, lazy evaluation may slow you down and increase your use of memory.
Lazy evaluation is only useful when there’s a reasonable chance your software will be asked to perform computations that can be avoided.
If your profiling investigations (see Item 16) show that class’s implementation is a performance bottleneck, you can replace its implementation with one based on lazy evaluation.
Item 18. Amortize the cost of expected computations
Over-eager evaluation: doing things before you're asked to do them.
- Strategy 1: compute in advance and store the results for lookup
If the min, max, and avg values of a DataCollection are requested often, they can be computed when the object is initialized; a lookup is much faster than a computation.
template <class NumericalType>
The mechanism: the idea behind over-eager evaluation is that if you expect a computation to be requested frequently, you can lower the average cost per request by designing your data structures to handle the requests especially efficiently.
- Strategy 2: use a local cache
One of the simplest ways to do this is by caching values that have already been computed and are likely to be needed again. std::map is a reasonable choice for the cache, since it makes lookups efficient. (The strategy is to use a local cache to replace comparatively expensive database queries with comparatively inexpensive lookups in an in-memory data structure.)
- Strategy 3: prefetching
You can think of prefetching as the computational equivalent of a discount for buying in bulk.
The mechanism: the locality of reference phenomenon.
An example application: dynamic arrays that allocate more memory than is currently requested, because calls to new are expensive (they typically result in calls to the underlying operating system, and system calls are generally slower than in-process function calls).
- When to use lazy evaluation and when to use over-eager evaluation
Lazy evaluation is a technique for improving the efficiency of programs when you must support operations whose results are not always needed.
Over-eager evaluation is a technique for improving the efficiency of programs when you must support operations whose results are almost always needed or whose results are often needed more than once.
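The local cache of strategy 2 can be sketched with a `std::map`. The names are assumptions made for the example: `queryCubicleNumber` is a hypothetical stand-in for an expensive database query, and its result is a dummy value derived from the name.

```cpp
#include <map>
#include <string>
#include <utility>

static int g_queries = 0; // counts simulated database queries

// Hypothetical expensive lookup standing in for a database query.
int queryCubicleNumber(const std::string &employee) {
    ++g_queries;
    return static_cast<int>(employee.size()) * 100; // dummy value for the sketch
}

// Returns the cubicle number, consulting a local cache before the "database".
int findCubicleNumber(const std::string &employee) {
    static std::map<std::string, int> cache;
    std::map<std::string, int>::iterator it = cache.find(employee);
    if (it == cache.end())
        it = cache.insert(std::make_pair(employee, queryCubicleNumber(employee))).first;
    return it->second;
}
```

The first request for a name pays for the query; every later request for the same name is an in-memory lookup. A real cache would also need a policy for stale or evicted entries, which this sketch leaves out.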
Item 19. Understand the origin of temporary objects
- What a temporary is
True temporary objects in C++ are invisible. They arise whenever a non-heap object is created but not named. In the code below, the local variable temp is not a temporary.
template <class T>
Situations in which temporaries arise
- when implicit type conversions are applied to make function calls succeed.
More specifically, these conversions occur only when passing objects by value or when passing to a reference-to-const parameter. They do not occur when passing an object to a reference-to-non-const parameter.
- when functions return objects.
How and why these temporary objects are created and destroyed
- The argument type does not match the parameter type
When the call to countChar finishes executing, the temporary object is automatically destroyed.
// returns the number of occurrences of ch in str
There are two general ways to eliminate it.
- redesign your code so conversions like these can’t take place. Item 5
- modify your software so that the conversions are unnecessary. Item 21
- Return values
Solutions:
- the return value optimization; see Item 20.
- Learn to recognize the pattern: anytime you see a function returning an object, a temporary will be created (and later destroyed). Learn to look for such constructs, and your insight into the cost of "behind the scenes" compiler actions will markedly improve.
Item 20. 协助完成返回值优化
return-by-value 替代方式有各自的问题.
- return-by-pointer: 资源管理问题.
- return-by-reference: reference to local object ==> 空悬
只能 return-by-value 下的思路
channel your efforts into finding a way to reduce the cost of returned objects, not to eliminate the objects themselves. 策略有 2.
- 策略 1: The 1st trick is to return constructor arguments instead of objects
1 | const Rational operator*(const Rational &lhs, |
- Strategy 2: declare the function inline:
1 | inline const Rational operator*(const Rational &lhs, |
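Both strategies can be sketched together as follows. The Rational class here is a simplified stand-in with assumed member names; it only shows the shape of returning a constructor call rather than a named local object.

```cpp
#include <cassert>

// Simplified stand-in for the Rational class (member names are assumptions).
class Rational {
public:
    Rational(int numerator = 0, int denominator = 1)
        : n_(numerator), d_(denominator) {}
    int numerator() const { return n_; }
    int denominator() const { return d_; }
private:
    int n_;
    int d_;
};

// Strategy 1: return constructor arguments, i.e. an unnamed object.
// Strategy 2: declare the function inline.
inline const Rational operator*(const Rational& lhs, const Rational& rhs) {
    return Rational(lhs.numerator() * rhs.numerator(),
                    lhs.denominator() * rhs.denominator());
}
```

Because the returned object has no name, compilers can construct it directly in the caller's storage; since C++17 this elision is required for a returned prvalue of the function's return type.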
Item 21. Overload to avoid implicit type conversions
1 | class UPInt |
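If we overload every combination of parameter types that would otherwise require an implicit conversion, we eliminate the temporaries those conversions create (declaring several functions, each with a different set of parameter types).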
1 | const UPInt operator+(const UPInt &lhs, const UPInt &rhs); |
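Note, however, that an operator cannot be overloaded with built-in types only: every overloaded operator must take at least one argument of a user-defined type.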
1 | const UPInt operator+(int lhs, int rhs); // error! |
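A small sketch of the overload set described above. UPInt is reduced to a wrapper around a single int, which is an assumption made purely to keep the example short; the real class would hold unlimited-precision storage.

```cpp
#include <cassert>

// Stand-in for an unlimited-precision integer (storage is an assumption).
class UPInt {
public:
    UPInt(int value = 0) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

// One overload per argument combination, so mixed calls need no
// implicit int -> UPInt conversion of an operand.
inline const UPInt operator+(const UPInt& lhs, const UPInt& rhs) {
    return UPInt(lhs.value() + rhs.value());
}
inline const UPInt operator+(const UPInt& lhs, int rhs) {
    return UPInt(lhs.value() + rhs);
}
inline const UPInt operator+(int lhs, const UPInt& rhs) {
    return UPInt(lhs + rhs.value());
}
// const UPInt operator+(int lhs, int rhs);  // error: no user-defined operand
```

With all three overloads in place, `upi + 10` and `10 + upi` each pick an exact match instead of converting the int operand into a temporary UPInt first.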
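Extension
Beyond the overloading strategy above, generic-programming techniques such as type traits can also be used to constrain parameter types.
Item 22. Consider using op= instead of stand-alone op
As far as C++ is concerned, there is no relationship between operator+, operator=, and operator+=.
- The stand-alone form can be implemented simply in terms of the compound (assignment) form ==> fewer operators to maintain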
1 | class Rational |
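- The compound form is more efficient: the stand-alone form needs a temporary object, whereas the compound form modifies its left-hand operand directly and needs none.
In general, assignment versions of operators are more efficient than stand-alone versions, because stand-alone versions must typically return a new object, and that costs us the construction and destruction of a temporary.
- Offering both forms adds flexibility and lets clients choose.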
By offering both options, you let clients develop and debug code using the easier-to-read stand-alone operators while still reserving the right to replace them with the more efficient assignment versions of the operators.
Furthermore, by implementing the stand-alones in terms of the assignment versions, you ensure that when clients switch from one to the other, the semantics of the operations remain constant.
1 | Rational a, b, c, d, result; |
- On the return value optimization
T(lhs) is a call to T’s copy constructor. It creates a temporary object whose value is the same as that of lhs.
Unnamed objects have historically been easier to eliminate than named objects.
1 | template <class T> |
- Conclusion: prefer the compound form wherever possible
You should consider using assignment versions of operators instead of stand-alone versions whenever performance is at a premium.
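A sketch of the pattern for one concrete class, assuming a simplified Rational that does not reduce fractions. A non-template operator+ is used here instead of the template form to keep the example self-contained.

```cpp
#include <cassert>

// Simplified Rational; fractions are deliberately not reduced (assumption).
class Rational {
public:
    Rational(int numerator = 0, int denominator = 1)
        : n_(numerator), d_(denominator) {}

    // Compound form: modifies the left-hand operand in place.
    Rational& operator+=(const Rational& rhs) {
        n_ = n_ * rhs.d_ + rhs.n_ * d_;
        d_ *= rhs.d_;
        return *this;
    }

    int numerator() const { return n_; }
    int denominator() const { return d_; }
private:
    int n_;
    int d_;
};

// Stand-alone form, implemented in terms of the compound form.
inline const Rational operator+(const Rational& lhs, const Rational& rhs) {
    return Rational(lhs) += rhs;
}
```

Only operator+= contains real arithmetic, so the two forms cannot drift apart. Note that `Rational(lhs) += rhs` yields a reference, so whether a particular compiler elides the final copy is worth measuring rather than assuming.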
Item 23. Consider alternative libraries
The author’s description of the ideal library is a classic: The ideal library is small, fast, powerful, flexible, extensible, intuitive, universally available, well supported, free of use restrictions, and bug-free. It is also nonexistent.
Libraries in practice: a balance among competing goals
- Libraries optimized for size and speed are typically not portable.
- Libraries with rich functionality are rarely intuitive.
- Bug-free libraries are limited in scope.
As a result, different libraries have different strengths (emphases): Different designers assign different priorities to these criteria. They thus sacrifice different things in their designs. As a result, it is not uncommon for two libraries offering similar functionality to have quite different performance profiles.
For example, iostream offers type safety and extensibility, while stdio offers efficiency.
Do not put blind faith in benchmarks:
It can provide some insight into the comparative performance of different approaches to a problem.
Item 24. Understand the costs of virtual functions, multiple inheritance, virtual base classes, and RTTI
A caveat first: different implementations handle virtual functions in different ways, with the result that the implementation of some features can have a noticeable impact on the size of objects and the speed at which member functions execute.
Storage of virtual tables (vtbls)
The size of a class’s vtbl is proportional to the number of virtual functions declared for that class (including those it inherits from its base classes).
There should be only one virtual table per class. [Figure: structure of the vtbls]
Q: How could compilers then know which vtbls they were supposed to create?
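A: There are two approaches:
- Brute force: every object file that might need a vtbl generates its own copy, and the linker then strips the duplicates.
- Heuristic: the vtbl is generally placed in the object file containing the definition of the class’s first non-inline, non-pure virtual function. This is one reason not to declare virtual functions inline.
If all virtual functions in a class are declared inline, the heuristic fails, and most heuristic-based implementations then generate a copy of the class’s vtbl in every object file that uses it.
Maintaining the virtual table pointer (vptr): an extra pointer inside each object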
some way of indicating which vtbl corresponds to each object ==> vptr(a hidden data member)
Location: stored alongside the data members. [Figure: one possible object layout containing a vptr]
How much can a single pointer matter?
- For small classes: If your objects contain, on average, four bytes of member data, for example, the addition of a vptr can double their size (assuming four bytes are devoted to the vptr). On systems with limited memory, this means the number of objects you can create is reduced.
- For larger classes: larger objects mean fewer fit on each cache or virtual memory page, and that means your paging activity will probably increase.
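The size effect can be observed directly. The two structs below are hypothetical; the exact overhead depends on the implementation, pointer size, and alignment padding, so the sketch reports the difference instead of asserting a specific number.

```cpp
#include <cassert>
#include <cstddef>

// Same data member; the only difference is one virtual function.
struct PlainValue {
    int data;
};

struct VirtualValue {
    int data;
    virtual ~VirtualValue() {}
};

// Extra bytes paid per object for the hidden vptr (plus any padding).
inline std::size_t vptrOverhead() {
    return sizeof(VirtualValue) - sizeof(PlainValue);
}
```

On a typical 64-bit implementation the overhead exceeds the eight bytes of the pointer itself, because alignment padding is added as well.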
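The relationship between vtbl and vptr: [Figure: objects whose vptrs point to their class’s vtbl]
Cost of calling a function in the vtbl through the vptr (minor)
A virtual function call proceeds as follows: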
- Follow the object’s vptr to its vtbl.
- Find the pointer in the vtbl that corresponds to the function being called. The cost of this step is just an offset into the vtbl array.
- Invoke the function pointed to by the pointer located in step 2.
1 | pC1->f1(); |
The cost of calling a virtual function is thus basically the same as that of calling a function through a function pointer.
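The three steps above can be imitated by hand with a table of function pointers. This is only an illustration of the mechanism, with hypothetical names throughout; real compilers generate the equivalent code themselves and their layouts differ.

```cpp
#include <cassert>

struct Shape;                             // hypothetical object type
typedef int (*AreaFn)(const Shape*);      // signature of the one "virtual" function

// Hand-written vtbl: one slot per virtual function.
struct VTable {
    AreaFn area;
};

// Every object carries a pointer to the vtbl of its class.
struct Shape {
    const VTable* vptr;
    int width;
    int height;
};

int rectangleArea(const Shape* s) { return s->width * s->height; }
int triangleArea(const Shape* s) { return s->width * s->height / 2; }

// One table per "class", shared by all objects of that class.
const VTable kRectangleVtbl = { rectangleArea };
const VTable kTriangleVtbl = { triangleArea };

int callArea(const Shape* s) {
    const VTable* table = s->vptr;   // step 1: follow the vptr to the vtbl
    AreaFn fn = table->area;         // step 2: fetch the slot for the function
    return fn(s);                    // step 3: call through the pointer
}
```

The call costs one extra pointer load and one indirect call, which matches the claim that it is essentially a call through a function pointer.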
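Being unable to inline lowers the optimization ceiling (you effectively give up inlining)
virtual inherently means “not decided until runtime”, which fundamentally conflicts with inline, which is resolved at compile time.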
That’s because “inline” means “during compilation, replace the call site with the body of the called function,” but “virtual” means “wait until runtime to see which function is called.”
Multiple inheritance makes vtbls and vptrs more complicated, and the cost rises accordingly:
- offset calculations to find vptrs within objects become more complicated;
- there are multiple vptrs within a single object (one per base class);
- and special vtbls must be generated for base classes in addition to the stand-alone vtbls we have discussed.
Multiple inheritance often leads to virtual base classes, which avoid replicating the common base in the memory layout. The consequence is that implementations use pointers to virtual base class parts as the means for avoiding the replication, and one or more of those pointers may be stored inside your objects. [Figure: diamond inheritance and one possible memory layout]
Note that implementations differ in strategy: Some implementations add fewer pointers, and some find ways to add none at all. (Such implementations make the vptr and vtbl serve double duty).
A more complicated case: [Figure: memory layout with a virtual base class plus virtual functions]
Cost of RTTI (runtime type identification): a type_info object for each class
The language specification states that we’re guaranteed accurate information on an object’s dynamic type only if that type has at least one virtual function.
RTTI was designed to be implementable in terms of a class’s vtbl.
We need only one copy of the information per class, and we need a way to get to the appropriate information from any object containing a virtual function. It is therefore natural to place the type_info object in the vtbl.
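The guarantee about dynamic type information can be checked with typeid. The class names below are hypothetical; the sketch contrasts a hierarchy that has a virtual function with one that does not.

```cpp
#include <cassert>
#include <typeinfo>

// Polymorphic hierarchy: at least one virtual function.
struct Base {
    virtual ~Base() {}
};
struct Derived : Base {};

// Non-polymorphic hierarchy: no virtual functions.
struct PlainBase {};
struct PlainDerived : PlainBase {};

// typeid on a polymorphic object reports its dynamic type.
bool reportsDynamicType(const Base& b) {
    return typeid(b) == typeid(Derived);
}

// typeid on a non-polymorphic object reports only the static type.
bool reportsStaticType(const PlainBase& b) {
    return typeid(b) == typeid(PlainBase);
}
```

Passing a Derived to the first function and a PlainDerived to the second makes both return true: only the polymorphic class carries the per-class information needed to recover the dynamic type.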
Cost summary table: [Table: summary of the costs of each feature]
Some people argue that since C++’s dynamic polymorphism has a cost, they would rather not use it at all. The author’s view is that the functionality still has to be implemented somewhere, and hand-written code is very likely to be worse than compiler-generated code (in correctness and in efficiency), so these features should be used when they are called for.
But remember that each of these features offers functionality you’d otherwise have to code by hand. In most cases, your manual approximation would probably be less efficient and less robust than the compiler-generated code. Using nested switch statements or cascading if-then-elses to emulate virtual function calls, for example, yields more code than virtual function calls do, and the code runs more slowly, too. Furthermore, you must manually track object types yourself, which means your objects carry around type tags of their own; you thus often fail to gain even the benefit of smaller objects.
Techniques (techniques, idioms, patterns)
The author sets out to show readers: It should also convince you that no matter what you want to do, there is almost certainly a way to do it in C++.
Item 25. Virtualizing constructors and non-member functions
virtual constructor
Definition: A virtual constructor is a function that creates different types of objects depending on the input it is given.
Consider the following inheritance hierarchy:
1 | class NLComponent//abstract base class for newsletter components contains at least one |
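Suppose a NewsLetter initializes its sub-objects (pointers to classes derived from NLComponent) dynamically from the input of an istream. Its constructor, shown below, then acts as a virtual constructor.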
1 | class NewsLetter |
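A compact sketch of such a virtual constructor. The input format (one word per component, either "text" or "graphic"), the kind() member, and the use of std::unique_ptr are all assumptions made for illustration; they are not taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

class NLComponent {
public:
    virtual ~NLComponent() {}
    virtual std::string kind() const = 0;   // hypothetical helper
};

class TextBlock : public NLComponent {
public:
    std::string kind() const override { return "text"; }
};

class Graphic : public NLComponent {
public:
    std::string kind() const override { return "graphic"; }
};

// Virtual constructor: the type of the created object depends on the input.
// Returns null at end of input or on an unknown tag.
std::unique_ptr<NLComponent> readComponent(std::istream& in) {
    std::string tag;
    if (!(in >> tag)) return nullptr;
    if (tag == "text") return std::unique_ptr<NLComponent>(new TextBlock);
    if (tag == "graphic") return std::unique_ptr<NLComponent>(new Graphic);
    return nullptr;
}

class NewsLetter {
public:
    explicit NewsLetter(std::istream& in) {
        while (std::unique_ptr<NLComponent> c = readComponent(in)) {
            components_.push_back(std::move(c));
        }
    }
    std::size_t size() const { return components_.size(); }
    std::string kindAt(std::size_t i) const { return components_[i]->kind(); }
private:
    std::vector<std::unique_ptr<NLComponent> > components_;
};
```

The NewsLetter constructor never names a concrete component type; that decision is confined to readComponent.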
- A special kind of virtual constructor: the virtual copy constructor, typically given a name such as clone, copySelf, or cloneSelf.
The virtual copy constructor below is defined in terms of the class’s own copy constructor (TextBlock(*this)), so it stays consistent with the copy constructor in matters such as shallow versus deep copy, reference counting, and copy-on-write.
1 | class NLComponent |
Use of the virtual copy constructor: its existence in NLComponent makes it easy to implement a (normal) copy constructor for NewsLetter.
1 | class NewsLetter |
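A sketch of a virtual copy constructor and the NewsLetter copy constructor built on it. The add, size, and at members and the kind() helper are hypothetical, and raw owning pointers are used only to mirror the style of the book's era.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class NLComponent {
public:
    virtual ~NLComponent() {}
    virtual NLComponent* clone() const = 0;   // virtual copy constructor
    virtual std::string kind() const = 0;     // hypothetical helper
};

class TextBlock : public NLComponent {
public:
    // Covariant return type; defined via the ordinary copy constructor.
    TextBlock* clone() const override { return new TextBlock(*this); }
    std::string kind() const override { return "text"; }
};

class Graphic : public NLComponent {
public:
    Graphic* clone() const override { return new Graphic(*this); }
    std::string kind() const override { return "graphic"; }
};

class NewsLetter {
public:
    NewsLetter() {}
    // Copies each component without knowing its concrete type.
    NewsLetter(const NewsLetter& rhs) {
        for (std::size_t i = 0; i < rhs.components_.size(); ++i) {
            components_.push_back(rhs.components_[i]->clone());
        }
    }
    NewsLetter& operator=(const NewsLetter&) = delete;  // not needed in this sketch
    ~NewsLetter() {
        for (std::size_t i = 0; i < components_.size(); ++i) delete components_[i];
    }
    void add(NLComponent* c) { components_.push_back(c); }  // takes ownership
    std::size_t size() const { return components_.size(); }
    const NLComponent& at(std::size_t i) const { return *components_[i]; }
private:
    std::vector<NLComponent*> components_;
};
```

Each copied component is a distinct object of the same dynamic type as the original, which the plain copy constructor of NLComponent* elements could not achieve.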
non-member functions
Idea: virtual-acting non-member functions ==> write virtual functions to do the work, then write a non-virtual function that does nothing but call the virtual function. To avoid incurring the cost of a function call for this syntactic sleight-of-hand, of course, you inline the non-virtual function.
In other words, call the virtual member function from a non-virtual non-member function, and declare the latter inline.
1 | class NLComponent |
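Making operator<< itself a virtual member is sometimes unsatisfactory, because the object becomes the left-hand operand and clients would have to write the stream on the right: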
1 | class NLComponent |
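The virtual-acting non-member function described in this section can be sketched as follows. The output text written by each print function is an assumption chosen for the example.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

class NLComponent {
public:
    virtual ~NLComponent() {}
    // The virtual function that does the real work.
    virtual std::ostream& print(std::ostream& s) const = 0;
};

class TextBlock : public NLComponent {
public:
    std::ostream& print(std::ostream& s) const override { return s << "TextBlock"; }
};

class Graphic : public NLComponent {
public:
    std::ostream& print(std::ostream& s) const override { return s << "Graphic"; }
};

// Non-virtual, inline non-member: forwards to the virtual function,
// so the usual "stream << object" syntax dispatches on the dynamic type.
inline std::ostream& operator<<(std::ostream& s, const NLComponent& c) {
    return c.print(s);
}
```

A single operator<< serves the whole hierarchy; adding a new component type only requires overriding print.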
- Further reading: multi-methods
you may wonder if it’s possible to make them act virtually on more than one of their arguments. It is, but it’s not easy. How hard is it? Turn to Item 31.
Item 26. Limiting the number of objects of a class
Allowing zero objects ($n = 0$)
Declare the constructors of that class private.
Allowing one object ($n = 1$)
- Let a global friend function access the private constructor and create a single static object.
1 | class Printer { |
- To keep the function out of the global scope, use a static member function instead; a namespace can also be used to restrict the scope:
1 | class Printer { |
The function-static implementation inside a namespace:
1 | namespace PrintingStuff |
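A minimal sketch of the function-static approach inside a namespace. The submitJob and jobCount members are hypothetical additions that make the single instance observable.

```cpp
#include <cassert>

namespace PrintingStuff {

class Printer {
public:
    void submitJob() { ++jobs_; }              // hypothetical operation
    int jobCount() const { return jobs_; }     // hypothetical accessor

    friend Printer& thePrinter();              // the only way to reach the object
private:
    Printer() : jobs_(0) {}                    // private: clients cannot create Printers
    Printer(const Printer&) = delete;          // and cannot copy the one that exists
    int jobs_;
};

// The Printer is created the first time this function is called.
inline Printer& thePrinter() {
    static Printer p;
    return p;
}

}  // namespace PrintingStuff
```

Clients write `PrintingStuff::thePrinter().submitJob()`; every call returns a reference to the same object.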
A static object defined in a function differs from one defined in a class in the following respects:
Lazy or eager:
A static object in a function is created lazily. A static member object of a class is created eagerly.
An object that’s static in a class is, for all intents and purposes, always constructed (and destructed), even if it’s never used. In contrast, an object that’s static in a function is created the first time through the function, so if the function is never called, the object is never created.
Initialization-order dependencies between static objects:
We know exactly when a function static is initialized: the first time through the function at the point where the static is defined.
The situation with a class static is less well defined.
Linkage:
The default linkage of inline functions is external linkage, so a function containing a local static object can be declared inline.
- Simply counting constructed objects inside the class is not a workable way to limit the number of objects. Reasons:
- Derived classes also call its constructor, so the count gets out of control.
- Objects that contain it as a member also call its constructor.
Allowing N objects ($n = N$)
- simply count the number of objects in existence and throw an exception in a constructor if too many objects are requested.
1 | class Printer |
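A sketch of the counting approach with a limit of one object. The objectCount accessor and the limit value are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

class Printer {
public:
    class TooManyObjects {};                   // thrown when the limit is exceeded

    Printer() {
        if (numObjects >= maxObjects) throw TooManyObjects();
        ++numObjects;
    }
    ~Printer() { --numObjects; }

    static std::size_t objectCount() { return numObjects; }   // hypothetical accessor
private:
    Printer(const Printer&) = delete;          // copies would bypass the count

    static std::size_t numObjects;
    static const std::size_t maxObjects = 1;   // limit chosen for this sketch
};

std::size_t Printer::numObjects = 0;
```

Because the constructor is public, a class derived from Printer or a class with a Printer member also passes through this constructor and consumes the same quota.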
Problems:
- When a concrete class is inherited from:
1 | class ColorPrinter : public Printer |
PS. Designs that avoid having concrete classes inherit from other concrete classes do not suffer from this problem (see Item 33).
- When Printer objects are contained inside other objects:
1 | class CPFMachine |
Summary of the problem: Printer objects can exist in three different contexts:
- on their own,
- as base class parts of more derived objects, and
- embedded inside larger objects.
The approach based on a local static object has none of these problems, because the private constructors prevent both derivation and embedding.
Preventing derivation
Goal: allow any number of FSA objects to be created, but also ensure that no class ever inherits from FSA.
Approach: use pseudo-constructors.
1 | class FSA |
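A sketch of pseudo-constructors for FSA. Returning std::unique_ptr instead of a raw pointer is a substitution made here so that ownership is explicit, and the state_ member with its accessors is hypothetical.

```cpp
#include <cassert>
#include <memory>

class FSA {
public:
    // Pseudo-constructors: the only way to create FSA objects.
    static std::unique_ptr<FSA> makeFSA() {
        return std::unique_ptr<FSA>(new FSA());
    }
    static std::unique_ptr<FSA> makeFSA(const FSA& rhs) {
        return std::unique_ptr<FSA>(new FSA(rhs));
    }

    void advance() { ++state_; }               // hypothetical operation
    int state() const { return state_; }       // hypothetical accessor
private:
    FSA() : state_(0) {}
    FSA(const FSA& rhs) : state_(rhs.state_) {}
    int state_;
};

// A derived class could not construct its FSA base, because every FSA
// constructor is private:
// class DerivedFSA : public FSA { public: DerivedFSA() {} };  // does not compile
```

Any number of FSA objects can be created through makeFSA, while derivation is effectively ruled out.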
Allowing Objects to Come and Go
How can the single object be destroyed and later created again (a “phoenix” singleton)? The desired behavior is:
1 | create Printer object p1; |
Implementation sketch:
1 | class Printer |
The single-instance version can be generalized to allow N instances:
1 | class Printer |
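Combining object counting with a pseudo-constructor gives objects that can come and go. In this sketch the limit of 2, the objectCount accessor, and the std::unique_ptr return type are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

class Printer {
public:
    class TooManyObjects {};

    // Pseudo-constructor: throws if the limit is already reached.
    static std::unique_ptr<Printer> makePrinter() {
        return std::unique_ptr<Printer>(new Printer);
    }
    ~Printer() { --numObjects; }

    static std::size_t objectCount() { return numObjects; }   // hypothetical accessor
private:
    Printer() {
        if (numObjects >= maxObjects) throw TooManyObjects();
        ++numObjects;
    }
    Printer(const Printer&) = delete;

    static std::size_t numObjects;
    static const std::size_t maxObjects = 2;   // N chosen for this sketch
};

std::size_t Printer::numObjects = 0;
```

Destroying a Printer frees a slot, so a later makePrinter call succeeds again; the private constructors also keep Printer from being used as a base class or embedded member.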
An Object-Counting Base Class
Abstract the idea further into a class: encapsulate the notion of counting instances and bundle it into a class.
Use a class template and inherit from it.
Points to note:
- move that variable into an instance-counting class.
- make sure that each class for which we’re counting instances has a separate counter.
1 | template <class BeingCounted> |
Modify the Printer class to use the Counted template:
1 | class Printer : private Counted<Printer>//private inheritance |
Reasons for using private inheritance:
- Implementation details are best kept private (how the number of Printer objects is tracked is such a detail).
- Public inheritance would require giving the Counted classes a virtual destructor (otherwise we’d risk incorrect behavior if somebody deleted a Printer object through a Counted<Printer>* pointer), and that would affect the size and layout of objects of classes inheriting from Counted.
Dealing with the consequences of private inheritance:
- To restore the public accessibility of the objectCount function, we employ a using declaration.
Note also that the Printer constructors no longer check or update the count themselves: No checking of the number of objects to see if the limit is about to be exceeded, no incrementing the number of objects in existence once the constructor is done. All that is now handled by the Counted<Printer> constructors.
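A sketch of the object-counting base class and a Printer that uses it. To keep the example short, the limit is passed as a hypothetical second template parameter rather than as a static member defined by the client, and makePrinter returns std::unique_ptr; both are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

template <class BeingCounted, std::size_t MaxObjects>
class Counted {
public:
    class TooManyObjects {};
    static std::size_t objectCount() { return numObjects; }
protected:
    Counted() { init(); }
    Counted(const Counted&) { init(); }
    ~Counted() { --numObjects; }
private:
    static std::size_t numObjects;   // one counter per BeingCounted type
    void init() {
        if (numObjects >= MaxObjects) throw TooManyObjects();
        ++numObjects;
    }
};

template <class BeingCounted, std::size_t MaxObjects>
std::size_t Counted<BeingCounted, MaxObjects>::numObjects = 0;

// Private inheritance: counting is an implementation detail of Printer.
class Printer : private Counted<Printer, 2> {
public:
    static std::unique_ptr<Printer> makePrinter() {
        return std::unique_ptr<Printer>(new Printer);
    }
    // Restore public access to the names clients need.
    using Counted<Printer, 2>::objectCount;
    using Counted<Printer, 2>::TooManyObjects;
private:
    Printer() {}                      // no counting code here
    Printer(const Printer&) = delete;
};
```

Each instantiation of Counted has its own static counter, so counting Printer objects does not interfere with counting objects of any other class.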