图灵机模型(Turing machine): A Turing machine is a simple (abstract) device that can read from, write to, and move along an infinitely long strip of paper. The actual behavior of the machines varies. Each is a so-called finite state machine: it has a finite set of states (some of which indicate that it has finished), and every symbol it reads potentially triggers reading and/or writing and switching to a different state. You can think of this machinery as a set of rules. (“If I am in state 4 and see an X, I move one step to the left, write a Y, and switch to state 9.”)

RAM模型(random-access machine):标准的单核计算机,它大致有下面三个性质

• We don’t have access to any form of concurrent execution; the machine simply executes one instruction after the other.


• Standard, basic operations (such as arithmetic, comparisons, and memory access) all take constant (although possibly different) amounts of time. There are no more complicated basic operations (such as sorting).


• One computer word (the size of a value that we can work with in constant time) is not unlimited but is big enough to address all the memory locations used to represent our problem, plus an extra percentage for our variables.


算法的本质: An algorithm is a procedure, consisting of a finite set of steps (possibly including loops and conditionals) that solves a given problem in finite time.

the notion of running time complexity (as described in the next section) is based on knowing how big a problem instance is, and that size is simply the amount of memory needed to encode it.




算法导论介绍到,对于三个符号可以做如下理解:O = ≤,Ω = ≥, Θ = =


几种常见的运行时间以及算法实例 点击这里可以参考下wiki中的时间复杂度


(1)Tip 1: If possible, don’t worry about it.


(2)Tip 2: For timing things, use timeit.


(3)Tip 3: To find bottlenecks, use a profiler.

使用cProfile模块来获取更多的关于运行情况的内容,从而可以发现问题的瓶颈,如果系统没有cProfile模块,可以使用profile模块代替,关于这两者的更多内容可以查看Python standard library-Python Profilers

(4)Tip 4: Plot your results.


(5)Tip 5: Be careful when drawing conclusions based on timing comparisons.


First, any differences you observe may be because of random variations.


Second, there are issues when comparing averages.


At the very least, you should stick to comparing averages of actual timings. A common practice to get more meaningful numbers when performing timing experiments is to normalize the running time of each program, dividing it by the running time of some standard, simple algorithm. This can indeed be useful but can in some cases make your results less than meaningful. See the paper “How not to lie with statistics: The correct way to summarize benchmark results” by Fleming and Wallace for a few pointers. For some other perspectives, you could read Bast and Weber’s “Don’t compare averages,” or the more recent paper by Citron et al., “The harmonic or geometric mean: does it really matter?”

Third, your conclusions may not generalize.


(6)Tip 6: Be careful when drawing conclusions about asymptotics from experiments.

在对从实验中得到关于渐近时间的信息下结论时需要小心,实验只是对于理论的一个支撑,可以通过实验来推翻一个渐近时间结果的假设,但是反过来一般不行 [以下是作者的解释]

If you want to say something conclusively about the asymptotic behavior of an algorithm, you need to analyze it, as described earlier in this chapter. Experiments can give you hints, but they are by their nature finite, and asymptotics deal with what happens for arbitrarily large data sizes. On the other hand, unless you’re working in theoretical computer science, the purpose of asymptotic analysis is to say something about the behavior of the algorithm when implemented and run on actual problem instances, meaning that experiments should be relevant.


Python中很多地方都使用了hash策略,在前面的Python数据结构篇中的搜索部分已经介绍了hash的内容。Python提供了hash函数,例如hash("Hello, world!")得到-943387004357456228 (结果不一定相同)。Python中的dict和set都使用了hash机制,所以平均情况下它们获取元素都是常数时间的。

(1)图的表示:最常用的两种表示方式是邻接表和邻接矩阵 [假设要表示的图如下]

邻接表 Adjacency Lists:因为历史原因,邻接表往往都是指链表list,但实际上也可以是其他的,例如在python中也可以是set或者dict,不同的表示方式有各自的优缺点,它们判断节点的连接关系和节点的度的方式甚至两个操作的性能都不太一样。

① adjacency lists 表示形式

② adjacency sets 表示形式

基本上和adjacency lists表示形式一样对吧?但是,对于list,判断一个元素是否存在是线性时间O(N(v)),而在set中是常数时间O(1),所以对于稠密图使用adjacency sets要更加高效。

③ adjacency dicts 表示形式



邻接矩阵 Adjacency Matrix




(2)树的表示 [假设要表示下面的树]




[Bunch Pattern]:有意思的是,上面的实现方式使用了Python中一种常用的设计模式,叫做Bunch Pattern,貌似来自经典书籍Python Cookbook,原书介绍如下:


When prototyping (or even finalizing) data structures such as trees, it can be useful to have a flexible class that will allow you to specify arbitrary attributes in the constructor. In these cases, the “Bunch” pattern (named by Alex Martelli in the Python Cookbook) can come in handy. There are many ways of implementing it, but the gist of it is the following:

There are several useful aspects to this pattern. First, it lets you create and set arbitrary attributes by supplying them as command-line arguments:

Second, by subclassing dict, you get lots of functionality for free, such as iterating over the keys/attributes or easily checking whether an attribute is present. Here’s an example:

This pattern isn’t useful only when building trees, of course. You could use it for any situation where you’d want a flexible object whose attributes you could set in the constructor.


• NetworkX:

• python-graph:

• Graphine:

• Pygr: a graph database

• Gato: a graph animation toolbox

• PADS: a collection of graph algorithms


In general, the more important your program, the more you should mistrust such black boxes and seek to find out what’s going on under the cover.


(1)Hidden Squares 隐藏的平方运行时间


(2)The Trouble with Floats 精度带来的烦恼





更多和Python中的浮点数有关的内容可以查看Floating Point Arithmetic: Issues and Limitations

问题2-12. (图的表示)

Consider the following graph representation: you use a dictionary and let each key be a pair (tuple) of two nodes, with the corresponding value set to the edge weight. For example W[u, v] = 42. What would be the advantages and disadvantages of this representation? Could you supplement it to mitigate the downsides?

The advantages and disadvantages depend on what you’re using it for. It works well for looking up edge weights efficiently but less well for iterating over the graph’s nodes or a node’s neighbors, for example. You could improve that part by using some extra structures (for example, a global list of nodes, if that’s what you need or a simple adjacency list structure, if that’s required).

1 3 收藏 评论