Notes 6 --- Complexity
Fundamentals:
Complexity is:
- A means of comparing the characteristics and performance of data structures and algorithms.
- The most obvious way to make this comparison is to count the actual number of instructions executed (or the
number of pointers in a data structure, etc.). This has several problems:
- For one thing, it's often hard or tricky to do so.
- It relies on details of the coding, and on the level at which the analysis occurs (see below), and not the basic
ideas.
- Just counting the number of instructions gives the same weight to instructions of different cost, but counting
cycles introduces a dependence on the compiler and architecture which is unintended and probably inappropriate.
- We want to compare efficiencies for inputs of different sizes, and the patterns may not emerge immediately
from numbers.
The solution is:
- To count only the most frequent or most costly operations.
- To look only for the "leading term" of the cost function.
- To report the result as parameterized by the size of the input.
- For algorithms which manipulate files, the operations of interest are usually those which result in a secondary
memory access.
- For file structures, the number of accesses required to reach a given element or set of elements.
- Since choices between algorithms whose running time differs by (even approximately) a small constant factor
will often be made on a different basis, and because it makes the analysis much easier, we will also disregard the
coefficient of the leading term.
There are at least five measures of complexity of interest:
- best-case
--- how fast can the algorithm be?
- average-case
--- over the space of all inputs, how fast is it on average?
- worst-case
--- how bad can the algorithm be?
- amortized
--- over a sequence of operations, how bad can the algorithm be?
- real-world
--- for a "typical" set of inputs, how fast is it on average?
Amortized complexity is beyond the scope of this course, and determining the typical input set is subjective. We
will concentrate on worst-case, and to a lesser extent, average-case, analysis.
Complexity notation:
We will later discuss in some detail determining the complexity of an algorithm, and will in the meantime, as we
have already done, use complexity notation. What we need for now, however, is a means of comparing complexity
functions, or in particular, comparing the complexity of a given algorithm with a standard complexity class.
"Standard" complexity classes include O (f(n)) (for the following functions f of n):
- the class of constants, 1
- powers n^r for rational r
- exponentials a^n, and double exponentials a^b^n.
- lg n,
and n^r lg n, and likewise for (lg n)^s. (lg is log base 2.)
- n^lg n, n^n, n!
We can simplify the definition of complexity as follows: One function f has complexity less than or equal
to another function g if, for large enough values of n, f(n) is always less than or equal to some
constant multiple of g(n).
A few results:
(For convenience, these results are stated in terms of absolute inequalities, but should be translated into statements
about O-classes.)
- n^r < = n^r lg^s n < = n^(r+t) for r, s, t positive
- a^n b^n for 1 < a < = b
- n^r < = n^lg n < = a^n for any r and a > 1
- a^n < = n! < = n^n < = a^b^n for any 1 < a, b
A typical complexity analysis:
Consider the following program fragment:
for i := 1 to n do
for j := 1 to i do
sum := sum + A [i,j];
- The body of the loop contains a constant number of statements or operations (another distinction usefully
blurred by ignoring constants), and the loop headers contribute at most a constant amount of work in each
iteration. Thus number of operations = O ( number of iterations). But the inner loop is executed i
times for each i, and so in total
3_{ i = 1 }^n i times, or n (n+1)/2, or O ( n^2 ).